# Decision-Theoretic Control of Reasoning: General Theory and an Application to Game-Playing

### Stuart Russell and Eric Wefald

###
EECS Department

University of California, Berkeley

Technical Report No. UCB/CSD-88-435

October 1988

### http://www.eecs.berkeley.edu/Pubs/TechRpts/1988/CSD-88-435.pdf

In this paper we outline a general approach to the study of problem-solving, in which search steps are considered decisions in the same sense as actions in the world. Unlike other metrics in the literature, the value of a search step is defined as a real utility rather than as a quasi-utility, and can therefore be computed directly from a model of the base-level problem-solver. We develop a formula for the value of a search step in a game-playing context using the single-step assumption, namely that a computation step can be evaluated as it was the last to be taken. We prove some meta-level theorems that enable the development of a low-overhead algorithm, MGSS*, that chooses search steps in order of highest estimated utility. Although we show that the single-step assumption is untenable in general, a program implemented for the game of Othello appears to rival an alpha-beta search with equal node allocations or time allocations. Pruning and search termination subsumes or improve on many other algorithms. Single-agent search, as in the A* algorithm, yields a simpler analysis, and we are currently investigating applications of the algorithm developed for this case.

BibTeX citation:

@techreport{Russell:CSD-88-435, Author = {Russell, Stuart and Wefald, Eric}, Title = {Decision-Theoretic Control of Reasoning: General Theory and an Application to Game-Playing}, Institution = {EECS Department, University of California, Berkeley}, Year = {1988}, Month = {Oct}, URL = {http://www.eecs.berkeley.edu/Pubs/TechRpts/1988/5755.html}, Number = {UCB/CSD-88-435}, Abstract = {In this paper we outline a general approach to the study of problem-solving, in which search steps are considered decisions in the same sense as actions in the world. Unlike other metrics in the literature, the value of a search step is defined as a real utility rather than as a quasi-utility, and can therefore be computed directly from a model of the base-level problem-solver. We develop a formula for the value of a search step in a game-playing context using the single-step assumption, namely that a computation step can be evaluated as it was the last to be taken. We prove some meta-level theorems that enable the development of a low-overhead algorithm, MGSS*, that chooses search steps in order of highest estimated utility. Although we show that the single-step assumption is untenable in general, a program implemented for the game of Othello appears to rival an alpha-beta search with equal node allocations or time allocations. Pruning and search termination subsumes or improve on many other algorithms. Single-agent search, as in the A* algorithm, yields a simpler analysis, and we are currently investigating applications of the algorithm developed for this case.} }

EndNote citation:

%0 Report %A Russell, Stuart %A Wefald, Eric %T Decision-Theoretic Control of Reasoning: General Theory and an Application to Game-Playing %I EECS Department, University of California, Berkeley %D 1988 %@ UCB/CSD-88-435 %U http://www.eecs.berkeley.edu/Pubs/TechRpts/1988/5755.html %F Russell:CSD-88-435