# Abstracts for Lotfi A. Zadeh

## A New Direction in Decision Analysis

(ARO) DAAH 04-961-0341, BISC, (NASA) NCC2-1006, (NASA) NAC2-117, (ONR) N00014-00-1-0621, (ONR) N00014-99-C-0298, (ONR) N00014-96-1-0556, and (ONR) FDN0014991035

Decisions are based on information. More often than not, decision-relevant information is a mixture of measurements and perceptions. For example, in deciding on whether or not to buy a house, what might be called measurement-based information pertains to the price of the house, the taxes, the mortgage rate, age, area, size of the lot, etc., while the perception-based information relates to the appearance of the house, the quality of construction, security, access to public transportation, etc.

Humans have a remarkable capability to make rational decisions based only on perceptions of values of decision variables, without any measurements and any computations. It is this capability that makes it possible for humans to drive in heavy traffic, play golf, park a car, and summarize a story. And it is this capability that the existing methods of decision analysis do not possess.

The incapability of existing methods of decision analysis to operate on perception-based information has a serious implication. More specifically, in most realistic settings, decision-relevant information includes probabilities of possible outcomes and future events. For example, in buying a house, an important consideration might be the probability that prices of houses will be rising in the months ahead. In general, decision-relevant probabilities are subjective and, in essence, are perceptions of likelihood. As perceptions, such probabilities are intrinsically imprecise. Computation with perception-based probabilities is an essential capability which the existing methods of decision analysis do not have.

As an illustration, consider a simple trip-planning problem, I have to fly from city A to city B and need to get there as soon as possible. There are two alternatives: (a) fly to D via B; and (b) fly to D via C. (a) will bring me to D earlier than (b). However, the connection time at B is short, and there is a likelihood that I may miss it, in which case I will have to take the next flight and will arrive in D much later. Neither I nor anyone else knows the numerical value of the probability that I may miss the connection. With this in mind, should I choose (a) or (b)? Decision analysis does not provide an answer to this question.

In perception-based decision analysis, or PDA for short, perception-based information is dealt with through the use of the computational theory of perceptions (CTP). In CTP, the point of departure is not a perception per se, but its description in a natural language. Thus, a perception is equated to its description. In this way, computation with perceptions is reduced to computation with propositions drawn from a natural language.

Reflecting the bounded ability of human sensory organs and, ultimately, the brain, to resolve detail and store information, perceptions are intrinsically imprecise. More specifically, perceptions are f-granular in the sense that (a) the boundaries of perceived classes are fuzzy; and (b) the values of perceived attributes are granulated, with a granule being a clump of values drawn together by indistinguishability, similarity, proximity or functionality. For example, the values of age as a granulated variable might be: young, middle-aged, old, very old, etc. Similarly, the values of probability might be: likely, not very likely, unlikely, etc.

F-granularity of perceptions puts them well beyond the expressive power of existing meaning-representation languages. In CTP, what is employed for meaning representation is precisiated natural language (PNL). A key idea in PNL is that the meaning of a proposition, p, drawn from a natural language, NL, may be represented as a generalized constraint of the form X isr R, where X is the constrained variable that is implicit in p; R is the implicit constraining relation; and r is an indexing variable which identifies the way in which R constrains X. The principal types of constraints are: possibilistic (r=blank); veristic (r=v); probabilistic (r=p); random set (r=rs); fuzzy graph (r=fg); and usuality (r=u). The constraint X isr R is referred to as a generalized constraint form, GC-form, and plays the role of a precisiation of p, p*. A proposition is precisiable if it has a GC-form. The language of GC-forms and their combinations is referred to as the Generalized Constraint Language, GCL. In relation to NL, GCL plays the role of a precisiation language.

Essentially, PNL is a sublanguage of NL which consists of propositions which are precisiable through translation from NL to GCL. More specifically, PNL is equipped with (a) a dictionary from NL to GCL in which the entries are (p, p*); (b) a dictionary from GCL to protoformal language (PFL), in which p* is paired with its abstracted form, e.g., QA's are B's, where p is: most Swedes are tall; and (c) a collection of protoformal rules of deduction which serves as a system for reasoning with perceptions through the use of goal-directed propagation of generalized constraints.

In PNL, perception-based probability distributions are represented as f-granular probability distributions expressed symbolically as P1 \ A1 + --- + Pn \ An, where the Ai are f-granular values of a random variable, X, and the Pi are f-granular values of their respective probabilities, that is, Pi=Prob{X is Ai}. For example, assume that I have to decide on whether to take an umbrella, given my perception that the f-granular probability distribution of rain is: P= low\no rain+low\light rain+high\moderate rain+low\heavy rain. A decision rule, then, may be: Take an umbrella if the perception of the probability distribution of rain is P. An important part of PDA is a collection of rules for computing with and ranking of f-granular probability distributions.

In spirit as well as in substance, perception-based decision analysis represents a radical departure from existing measurement-based methods. Basically, what PDA adds to decision analysis is a collection of concepts and techniques for basing decisions on perception-based information, with the understanding that measurement-based information is a special case.

A concomitant of progression from measurement-based to perception-based decision analysis is a shift from universal, normative decision principles to customized decision rules which are application-and context-dependent. This is an unavoidable consequence of striving for a rapprochment with the reality of making decisions in an environment of pervasive uncertainty, imprecision and partial truth.

* Professor in the Graduate School and Director, Berkeley Initiative in Soft Computing (BISC), Computer Science Division, Department of EECS, University of California, Berkeley, CA 94720-l776; Telephone: 5l0-642-4959; Fax: 5l0-642-l7l2; E-mail: zadeh@cs.berkeley.edu. Research supported in part by ONR N00014-00-1-0621, ONR Contract N00014-99-C-0298, NASA Contract NCC2-1006, NASA Grant NAC2-117, ONR Grant N00014-96-1-0556, ONR Grant FDN0014991035, ARO Grant DAAH 04-961-0341 and the BISC Program of UC Berkeley.

Send mail to the author : (zadeh@cs.berkeley.edu)

## A Prototype-Centered Approach to Adding Deduction Capability to Search Engines--The Concept of Protoform

Existing search engines have many remarkable capabilities. But what is not among them is deduction capability--the capability to answer a query by drawing on information which resides in various parts of the knowledge base or is augmented by the user. An example of a simple query which cannot be answered by any existing search engine is: "How many UC Berkeley alumni were born in California?" In this case, the query and the query-relevant information are crisp. In the query "How many UC Berkeley alumni are wealthy?" the query-relevant information is crisp but the query is fuzzy. The problem--which is not widely recognized--is that much of the information in the knowledge base of a search engine is perception-based. Methods based on bivalent logic and standard probability theory lack capability to operate on perception-based information. A search engine with deduction capability is, in effect, a question-answering system.

Limited progress toward a realization of deduction capability is achievable through application of methods based on bivalent logic and standard probability theory. But to move beyond the reach of standard methods it is necessary to change direction. In the approach which is outlined, a concept which plays a pivotal role is that of a prototype--a concept which has a position of centrality in human reasoning, recognition, search and decision processes.

Informally, a prototype may be defined as a sigma-summary, that is, a summary of summaries. With this definition as the point of departure, a prototypical form, or protoform, for short, is defined as an abstracted prototype. As a simple example, the protoform of the proposition "Most Swedes are tall" is "QA's are B's," where Q is a fuzzy quantifier, and A and B are labels of fuzzy sets. Similarly, the protoform of "Usually Robert returns from work at about 6 pm," is "Prob (A) is B," where A is a fuzzy event and B is a fuzzy probability.

Abstraction has levels, just as summarization does. For example, in the case of "Most Swedes are tall," successive abstracted forms are "Most A's are tall," "Most A's are B's," and "QA's are B's."

At a specified level of abstraction, propositions are PF-equivalent if they have identical protoforms. For example, propositions "Usually Robert returns from work at about 6 pm" and "In winter, the average daily temperature in Berkeley is usually about fifteen degrees centigrade," are PF-equivalent. The importance of the concepts of protoform and PF-equivalence derives in large measure from the fact that they serve as a basis for knowledge compression.

A knowledge base is assumed to consist of a factual database, FDB, and a deduction database, DDB. In both FDB and DDB, knowledge is assumed to fall into two categories: (a) crisp and (b) fuzzy. Examples of crisp items of knowledge in FDB might be: “Height of the Eiffel tower is 324m” and “Paris is the capital of France.” Examples of fuzzy items might be “Most Swedes are tall,” and “California has a temperate climate.” Similarly, in DDB, an example of a crisp rule might be “If A and B are crisp convex sets, then their intersection is a crisp convex set.” An example of a fuzzy rule might be “If A and B are fuzzy convex sets, then their intersection is a fuzzy convex set.” A fuzzy rule may be a crisp assertion about fuzzy sets or a fuzzy assertion about crisp sets or a fuzzy assertion about fuzzy sets.

The deduction database is assumed to consist of a logical database and a computational database, with the rules of deduction having the structure of protoforms. An example of a computational rule is "If Q A's are B's and Q (A and B)'s are C's," then "Q Q A's are (B and C)'s,” where Q and Q are fuzzy quantifiers, and A, B and C are labels of fuzzy sets. The number of rules in the computational database is assumed to be very large in order to allow a chaining of rules that may be query-relevant.

A very simple example of deduction in the prototype-centered approach—an example which involves string matching but no chaining--is the following. Suppose that a query is “How many Swedes are very tall?” A protoform of this query is: ?Q A’s are 2B, where Q is a fuzzy quantifier and 2B is assumed to represent the meaning of “very B,” with the membership function of 2B being the square of the membership function of B. Searching DDB, we find the rule “If Q A’s are B then Q1/2 A’s are 2B,” whose consequent matches the query, with ?Q instantiated to Q1/2, A to “Swedes,” and B to “tall.” Furthermore, in FDB, we find the fact “Most Swedes are tall,” which matches the antecedent of the rule, with Q instantiated to “Most.” A to “Swedes” and B to “tall.” Consequently, the answer to the query is “Most1/2 Swedes are very tall,” where the membership function of “Most1/2” is the square root of Most in fuzzy arithmetic.

Another example of deduction is a simple version of what is known as the Robert example. Specifically, from the perception, p. “Usually Robert returns from work at about 6:00pm,” what is to be deduced is the answer to the question, q, “What is the probability that Robert is home at about t pm,” where t is a specified time.

In this case, the protoform of p is: Prob(A) is B, where A is an abstraction of the fuzzy event “Robert returns from work at about 6:00pm,” and B is an abstraction of the fuzzy probability “usually.” Similarly, the protoform of q is: Prob(C) is ?D, where C is an abstraction of the fuzzy event “Robert returns from work before about t pm,” and D is an abstraction of the probability which is the object of deduction.

An applicable deduction rule in DDB is

Prob(A) is B
____________
Prob(C) is D

in which D is the solution of a variational problem involving the membership function of A, B, C and D. The details of solution may be found in a forthcoming paper “Toward a Perception-Based Theory of Probabilistic Reasoning with Imprecise Probabilities,” to appear in the Journal of Statistical Planning and Inference.

The rules of deduction in the prototype-centered approach involve generalized constraint propagation, with generalized constraints represented as protoforms. Deduction is carried out through backtracking from query to query-relevant information. The concept of protoform has some links to the concept of ontology in ontological engineering, but unlike the latter is rooted in fuzzy logic and perception-based probability theory.

What should be underscored is that the problem of adding deduction capability to search engines is many-faceted and complex. It would be unrealistic to expect rapid progress toward its solution.

* Professor in the Graduate School and director, Berkeley initiative in Soft Computing (BISC), Computer Science Division and the Electronics Research Laboratory, Department of EECS, University of California, Berkeley, CA 94720-1776; Telephone: 510-642-4959; Fax: 510-642-1712; E-Mail: zadeh@cs.berkeley.edu . Research supported in part by ONR N00014-00-1-0621, ONR Contract N00014-99-C-0298, NASA Contract NCC2-1006, NASA Grant NAC2-117, ONR Grant N00014-96-1-0556, ONR Grant FDN0014991035, ARO Grant DAAH 04-961-0341 and the BISC Program of UC Berkeley.

Send mail to the author : (zadeh@cs.berkeley.edu)

## Hierarchical Recurrent Neuro-Fuzzy Systems

Andreas Nürnberger1
BTexact Technologies, UK and Berkeley Initiative in Soft Computing (BISC)

Fuzzy systems, neural networks, and their combination in neuro-fuzzy systems are already well established in data analysis and system control. Neuro-fuzzy systems are especially well suited for the development of interactive data analysis tools, which enable the creation of rule-based knowledge from data and the introduction of a priori knowledge into the process of data analysis. However, its recurrent variants, especially recurrent neuro-fuzzy models, are still rarely used.

To simplify the definition of fuzzy systems, or to reduce its complexity, hierarchical and recurrent structures can be used. Thus, more transparent rule bases that are also easier to maintain can be designed. These structures also allow the use of time delayed input or reuse of time delayed output from the fuzzy system itself. Thus, we obtain a rule base that is able to describe dynamic behavior.

In this project we study the capabilities of hierarchical recurrent neuro-fuzzy models. Furthermore, we have developed a neuro-fuzzy model that can be used to learn and optimize hierarchical recurrent fuzzy rule bases from data.

[1]
A. Nürnberger and R. Kruse, "A Neuro-Fuzzy Approach to Optimize Hierarchical Recurrent Fuzzy Systems," Fuzzy Optimization and Decision Making, Vol. 1, No. 2, 2002.
[2]
A. Nürnberger, "Using Hierarchical Recurrent Neuro-Fuzzy Systems for Decision Making," Information Refinement and Revision for Decision Making, AAAI Spring Symp., Technical Report SS-02-03, AAAI Press, Menlo Park, CA, 2002.
1Postdoctoral Researcher

Send mail to the author : (anuernb@eecs.berkeley.edu)

## Soft Computing Techniques for Information Retrieval

Andreas Nürnberger1
BTexact Technologies, UK and Berkeley Initiative in Soft Computing (BISC)

In this project, we study approaches that combine visualization techniques for document collections with conventional keyword and content-based search methods. During the past period we focused on the development of interactive visualization techniques.

For the visualization of document collections and search results we developed models based on growing self-organizing maps. These maps can be used to arrange documents based on the similarity of the document's content [1]. The trained and labeled map can then be used to visualize the structure of the underlying document collection as well as changes in the collection, e.g., insertion or removal of document subsets [2]. Furthermore, visual information about document density, match of search hits to specific document groups, and similarity to a given sample document in content based searching can be given by different coloring methods. Besides for the analysis of text document collections, the developed models can also be applied for the analysis of multimedia data [3] or to post-process search engine result sets. The visualization and clustering of the obtained results provides additional visual information about the outcome of a search, which is more intuitive than a pure ordered list of search hits.

A further advantage of the developed approaches is that manually defined lists of index terms or a classification hierarchy, which are usually subjectively labeled and require expensive maintenance, are not needed. Especially, in rapidly changing document collections, such as collections of scientific research publications, classification systems that are not frequently updated and do not reflect the user’s classification criteria, are usually not accepted.

[1]
A. Nürnberger, A. Klose, and R. Kruse, "Self-Organising Maps for Interactive Search in Document Databases," Intelligent Exploration of the Web, ed. P. S. Szczepaniak, J. Segovia, J. Kacprzyk, and L. A. Zadeh, Physica-Verlag, 2002.
[2]
A. Nürnberger and M. Detyniecki, "Visualizing Changes in Data Collections Using Growing Self-Organizing Maps," Proc. Int. Joint Conf. Neural Networks, Honolulu, HI, May 2002.
[3]
A. Nürnberger and A. Klose, "Improving Clustering and Visualization of Multimedia Data Using Interactive User Feedback," Proc. Int. Conf. Information Processing and Management of Uncertainty in Knowledge-Based Systems, Annecy, France, July 2002.
1Postdoctoral Researcher

Send mail to the author : (anuernb@eecs.berkeley.edu)

## Computing with Words and Perceptions--A Paradigm Shift in Computing and Decision Analysis

Basically, in computing with words and perceptions, or CWP for short, the objects of computation are words, propositions and perceptions described in a natural language. A very simple example is: Usually Robert returns from work at about 6 pm. What is the probability that Robert is home at about 6:l5 pm? Another elementary example is: A box contains about 20 balls of various sizes. Most are large. There are many more large balls than small balls. How many are neither large nor small?

In science, there is a deep-seated tradition of striving for progression from perceptions to measurements, and from the use of words to the use of numbers. Why and when, then, should the use of CWP be considered?

There are two principal rationales. The first, briefly stated, is: When precision is desired but the needed information is not available, in which case the use of CWP is a necessity rather than an option. And second, when precision is not needed, in which case the tolerance for imprecision may be exploited to achieve tractability, robustness, simplicity and low solution cost.

Another important point is that humans have a remarkable capability to perform a wide variety of physical and mental tasks without any measurements and any computations, e.g., parking a car, driving in city traffic, playing golf and summarizing a book. In performing such tasks, humans employ perception--rather than measurements--of distance, direction, speed, count, likelihood, intent and other attributes.

Reflecting the bounded ability of sensory organs and, ultimately, the brain, to resolve detail, perceptions are intrinsically imprecise. More concretely, perceptions are f-granular in the sense that (a) the perceived values of attributes are fuzzy; and (b) the perceived values of attributes are granular, with a granule being a clump of values drawn together by indistinguishability, similarity, proximity or functionality.

F-granularity of perceptions is the reason why in the enormous literature on perceptions—in fields ranging from linguistics and logic to psychology and neuroscience—one cannot find a theory in which perceptions are objects of computation, as they are in CWP. What should be stressed is that in dealing with perceptions, the point of departure in CWP is not a perception per se, but its description in a natural language. This is a key idea which makes it possible to reduce computation with perceptions to computation with propositions drawn from a natural language.

A related key idea in CWP is that the meaning of a proposition, p, in a natural language may be represented as a generalized constraint of the form X isr R, where X is a constrained variable which, in general, is implicit in p; R is the constraining relation which, like X,is in general implicit in p; and r is an indexing variable whose value identifies the way in which R constrains X. The principal types of constraints are: equality (r is =); possibilistic (r is blank); veristic (r is v); probabilistic (r is p); random set (r is rs); fuzzy graph (r is fg); usuality (r is u); and Pawlak set (r is ps). In this system of classification of constraints, the standard constraint, X belongs to C, where C is a crisp set, is possibilistic. Representation of p as a generalized constraint is the point of departure in what is called precisiated natural language (PNL).

PNL associates with a natural language, NL, a precisiation language, GCL (Generalized Constraint Language), which consists of generalized constraints and their combinations and qualifications. A simple example of an element of GCL is: (X is A) and (Y isu B). A proposition, p, in NL is precisiable if it is translatable into GCL. In effect, PNL is a sublanguage of NL which consists of propositions which are precisiable through translation into GCL. More concretely, PNL is associated with two dictionaries (a) from NL to GCL, and (b) from GCL to what is referred to as the protoform language (PFL). An element of PFL is an abstracted version of an element of GCL. The translates of p into GCL and PFL are denoted as GC(p) and PF(p ), respectively.

In addition, PNL is associated with a deduction database, DDB, which consists of rules of deduction expressed in PFL. An example of such a rule is the intersection/product syllogism: if Q A's are B's and R (A and B)'s are C's, then QR A's are (B and C)'s, where Q and R are fuzzy quantifiers, e.g., most, many, few; A, B and C are fuzzy sets, and QR is the product of Q and R in fuzzy arithmetic.

The principal function of PNL is to serve as a system for computation and reasoning with perceptions. A related function is that of serving as a definition language. In this capacity, PNL may be used to (a) define new concepts, e.g., the usual value of a random variable; and (b) redefine existing concepts, e.g., the concept of statistical independence. The need for redefinition arises because standard bivalent -classic-based definitions may lead to counterintuitive conclusions.

Computing with words and perceptions provides a basis for an important generalization of probability theory, namely, perception-based probability theory (PTp). The point of departure in PTp is the assumption that subjective probabilities are, basically, perceptions of likelihood. A key consequence of this assumption is that subjective probabilities are f-granular rather than numerical, as they are assumed to be in standard bivalent-logic-based probability theory, PT.

A basic concept in PTp is that of f-granular probability distribution, P*(X), of a random variable, X, with X taking values in a space U. Symbolically, P*(X) is expressed as P*(X)= P1\A1+…+Pn\An, in which the Ai are granules of X, and Pi is a perception of probability of the event X is Ai. For example, if X is the length of an object and its granules are labeled short, medium and long, then P*(X) may be expressed as P*(X) = low \ short+high \ medium+low \ long, where low and high are granular probabilities. Basically, PTp adds to PT the capability to operate on perception-based information—a capability which plays an especially important role in decision analysis. More specifically, in most realistic settings in decision analysis, a decision involves a ranking of f-granular probability distributions. Furthermore, employment of the principle of maximization of expected utility leads to a ranking of fuzzy numbers.

In the final analysis, the importance of CWP derives from the fact that it opens the door to adding to any measurement-based theory, T, the capability to operate on perception-based information. Conceptually, computationally and mathematically, the perception-based theory, Tp, is significantly more complex than T. In this instance, as in many others, complexity is the price that has to be paid to reduce the gap between theory and reality.

* Professor in the Graduate School and Director, Berkeley initiative in Soft Computing (BISC), Computer Science Division and the Electronics Research Laboratory, Department of EECS, University of California, Berkeley, CA 94720-1776; Telephone: 510-642-4959; Fax: 510-642-1712; E-Mail: zadeh@cs.berkeley.edu . Research supported in part by ONR N00014-00-1-0621, ONR Contract N00014-99-C-0298, NASA Contract NCC2-1006, NASA Grant NAC2-117, ONR Grant N00014-96-1-0556, ONR Grant FDN0014991035, ARO Grant DAAH 04-961-0341 and the BISC Program of UC Berkeley.

Send mail to the author : (zadeh@cs.berkeley.edu)

## Crossing into Uncharted Territory—The Concept of Approximate X

In science—and especially in mathematics—it is a universal practice to express definitions in a language based on bivalent logic. Thus, if C is a concept, then under its definition every object, u, is either an instance of C or it is not, with no shades of gray allowed. This deep-seated tradition—which is rooted in the principle of the excluded middle—is in conflict with reality. Furthermore, it rules out the possibility of graceful degradation, leading to counterintuitive conclusions in the spirit of the ancient Greek sorites paradox.

In fuzzy logic—in contrast to bivalent logic—everything is, or is allowed to be, a matter of degree. This is well known, but what is new is the possibility of employing the recently developed fuzzy-logic-based language precisiated natural language (PNL) as a concept-definition language to formulate definitions of concepts of the form “approximate X,” where X is a crisply defined bivalent-logic-based concept. For example, if X is the concept of a linear system, then “approximate X” would be a system that is approximately linear.

The machinery of PNL provides a basis for a far-reaching project aimed at associating with every—or almost every—crisply defined concept X a PNL-based definition of “approximate X,” with the understanding that “approximate X” is a fuzzy concept in the sense that every object x is associated with the degree to which x fits X, with the degree taking values in the unit interval or a partially ordered set. A crisp definition of “approximate X” is not acceptable because it would have the same problems as the crisp definition of X.

As a simple example, consider the concept of a linear system. Under the usual definition of linearity, no physical system is linear. On the other hand, every physical system may be viewed as being approximately linear to a degree. The question is: How can the degree be defined?

More concretely, assume that I want to get a linear amplifier, A, and that the deviation from linearity of A is described by the total harmonic distortion, h, as a function of power output, P. For a given h(P), then, the degree of linearity may be defined in the language of fuzzy if-then rules – a language which is a sublanguage of PNL. In effect, such a definition would associate with h(P) its grade of membership in the fuzzy set of distortion/power functions which are acceptable for my purposes. What is important to note is that the definition would be local, or, equivalently, context-dependent, in the sense of being tied to a particular application. What we see is that the standard, crisp, definition of linearity is global (universal, context-independent, objective), whereas the definition of approximate linearity is local (context-dependent, subjective). This is a basic difference between a crisp definition of X and PNL-based definition of “approximate X.” In effect, the loss of universality is the price which has to be paid to define a concept, C, in a way that enhances its rapport with reality.

In principle, with every crisply defined X we can associate a PNL-based definition of “approximate X.” Among the basic concepts for which this can be done are the concepts of stability, optimality, stationarity and statistical independence. But a really intriguing possibility is to formulate a PNL-based definition of “approximate theorem.” It is conceivable that in many realistic settings informative assertions about “approximate X” would of necessity have the form of “approximate theorems,” rather than theorems in the usual sense. This is one of the many basic issues which arise when we cross into the uncharted territory of approximate concepts defined via PNL.

A simple example of “approximate theorem” is an approximate version of Fermat’s theorem. More specifically, assume that the equality xn + yn = zn is replaced with approximate equality. Furthermore, assume that x, y, z are restricted to lie in the interval [I,N]. For a given n, the error, e(n), is defined as the minimum of a normalized value of | xn + yn - zn | over all allowable values of x, y, z. Observing the sequence {e(n)}, n = 3,4,…, we may form perceptions described as, say, “for almost all n the error is small;” or “the average error is small;” or whatever appears to have a high degree of truth. Such perceptions, which in effect are summaries of the behavior of e(n) as a function of n, may qualify to be called “approximate Fermat’s theorems.” It should be noted that in number theory there is a sizeable literature on approximate Diophantine equations. There are many deep theorems in this literature, all of which are theorems in the usual sense.

In a sense, an approximate theorem may be viewed as a description of a perception. The concept of a fuzzy theorem was mentioned in my 1975 paper “The Concept of a Linguistic Variable and its Application to Approximate Reasoning.” What was missing at the time was the concept of PNL.

* Professor in the Graduate School and director, Berkeley initiative in Soft Computing (BISC), Computer Science Division and the Electronics Research Laboratory, Department of EECS, Univeristy of California, Berkeley, CA 94720-1776; Telephone: 510-642-4959; Fax: 510-642-1712; E-Mail: zadeh@cs.berkeley.edu. Research supported in part by ONR Grant N00014-00-1-0621, ONR Contract N00014-99-C-0298, NASA Contract NCC2-1006, NASA Grant NAC2-117, ONR Grant N00014-96-1-0556, ONR Grant FDN0014991035, ARO Grant DAAH 04-961-0341 and the BISC Program of UC Berkeley.

Send mail to the author : (zadeh@cs.berkeley.edu)

## In Quest of Performance Metrics for Intelligent Systems—A Challenge that Cannot be Met with Existing Methods

As we move further into the realm of intelligent systems, the problem of devising performance metrics for assessing machine intelligence looms larger and larger in importance. The problem is there, but does it have a solution? A somewhat unorthodox view which is articulated in the following is that (a) complete solution is beyond the reach of existing methods; and (b) that a prerequisite to solving the problem is a better understanding of a broader problem, namely, the basic problem of concept definability. To this end, what is presented in the following is a sketch of what may be called a theory of hierarchical definability, or THD for short.

In science, and especially in natural sciences and mathematics, there is a long-standing tradition of expecting that concepts be defined clearly and precisely. But as we move from the natural sciences to the sciences of the artificial, two basic problems came into view.

The first problem relates to the need to formulate our definitions in ways that can be understood by a machine. For example, if I command a household robot to take the dishes off the table, I must define what I mean by “take the dishes off the table.” Or, if I instruct a machine to summarize a document, I must define what I mean by a summary. And, how can I assess the machine IQ (MIQ) of a machine that executes my commands?

The second problem is that we encounter, much more frequently than in the past, concepts which do not lend themselves to precise definition. Among familiar examples of such concepts are intelligence, creativity, autonomy, adaptivity, relevance, robustness, and causality.

We have been largely unsuccessful in formulating operational definitions of concepts of this nature. Why?

A view that is advanced in the following is that the primary reason for the lack of success is that the concepts in question, and many like them, are intrinsically fuzzy, that is, are a matter of degree. Thus, when we try to define such concepts within the conceptual framework of classical, bivalent logic, we encounter a fundamental incompatibility—an incompatibility between crispness of definitions and fuzziness of the concepts we try to define.

Viewed in a slightly different perspective, the problem relates to the inadequate expressive power of the definition languages which are at our disposal, namely, the natural language and the language of bivalent logic. What this implies is that, to solve the problem, we have to add languages with higher expressive power to our repertoire of definition languages. This is the basic idea that underlies the theory of hierarchical definability, THD.

In THD, the languages that we add are based on fuzzy logic since they must be capable of serving as definition languages for fuzzy concepts. More specifically, the definition languages in THD form a hierarchy represented as (NL, C, F, F.G, PNL), where NL is the lowest member in terms of expressive power, and precisiated natural language (PNL) is the highest. It is understood that every member of the hierarchy subsumes those below it.

The C definition language is the language of mathematical analysis, probability theory, and bivalent logic. This is the language that we learn when we take courses in mathematics, probability theory and logic. The F language is the language of fuzzy logic without granulation, and the F.G language is the language of fuzzy logic with granulation. PNL is a fuzzy-logic-based language with maximal expressive power.

A simple analogy may be of help. In my progression of learning, I start with my knowledge of a natural language. After entering a university and taking courses in mathematics, I add to NL my knowledge of C. At this stage, I can use the union of NL and C as a definition language. Then, I take a course in fuzzy logic. In this course, first I learn F, then F.G and finally PNL. At the end, I can use PNL as a definition language, with the understanding that PNL subsumes all languages below it in the hierarchy.

What is PNL? The basic idea in PNL is that a proposition, p, in a natural language, NL, may be precisiated through translation into a precisiation language. In the case of PNL, the precisiation language is the generalized constraint language (GCL). A generic generalized constraint is represented as Z isr R, where Z is the constrained variable, R is the constraining relation and r is a discrete-valued indexing variable whose values define the ways in which R constrains Z. The principal types of constraints are: possibilistic (r=blank); veristic (r=v); probabilistic (r=p); random set (r=rs); usuality (r=u); fuzzy graph (r=fg); and Pawlak set (r=ps). The rationale for constructing a large variety of constraints is that conventional crisp constraints are incapable of representing the meaning of propositions expressed in a natural language—most of which are intrinsically imprecise—in a form that lends itself to computation.

The elements of GCL are composite generalized constraints that are formed from generic generalized constraints by combination, modification, and qualification. An example of a generalized constraint in GCL is ((Z isp R) and (Z, Y) is S) is unlikely.

By construction, the generalized constraint language is maximally expressive. What this implies is that PNL is the largest subset of a natural language that admits precisiation. Informally, this implication serves as a basis for the conclusion that if a concept, X, cannot be defined in terms of PNL, then, in effect, it is undefinable or, synonymously, amorphic.

In this perspective, the highest level of definability hierarchy, which is the level above PNL-definability, is that of undefinability or amorphicity. A canonical example of amorphic concepts is that of causality. More specifically, it is not possible to construct a general definition of causality such that given any two events A and B and the question, “Did A cause B?” the question could be answered based on the definition. Equivalently, given any definition of causality, it will always be possible to construct examples to which the definition would not apply or yield counterintuitive results.

In dealing with an amorphic concept, X, what is possible—and what we generally do—is to restrict the domain of applicability of X to instances for which X is definable. For example, in the case of the concept of a summary, which is an amorphic concept, we could restrict the length, type, and other attributes of what we want to summarize. In this sense, an amorphic concept may be partially definable or, p-definable, for short. The concept of p-definability applies to all levels of the definability hierarchy.

In essence, PNL may be viewed as a collection of ordered pairs of the form (p, px), where p is a precisiable proposition in NL and p x is a precisiation of p, that is, its translation in GCL. In this sense, PNL may be viewed as a dictionary in which p is an entry and px is its meaning.

In scientific theories, a concept, X, is almost always defined as a crisp (bivalent) concept, meaning that the denotation of X is a crisp set in its universe of discourse. In THD, a concept, X, is associated with a quintuple (X, U, QCS, DF(L), D(DF)) in which X is the concept; U is the space of objects to which X is applicable; QCS is the qualitative complexity scale associated with X; DF(L) is a definition of X in a language L; and D(DF) is the domain of DF, that is, the set of objects to which DF is applicable.

The concept of a qualitative complexity scale plays a key role in THD. Basically, the qualitative complexity scale, QCS, is a linear clustering, QCC1, QCC2, …, QCCm, of qualitative complexity classes of objects in U such that: (a) objects in QCCi are roughly equally complex in relation to the definition, DF, of X; and (b) objects in QCCi+1 have higher complexity than those in Qi. For example, if X is the concept of volume, then QCC2 may be class of objects like trees; and QCC5 may be the class of objects like clothing. Each language in the definability hierarchy is associated with a critical threshold on the qualitative complexity scale such that the language cannot be applied to classes above the critical threshold.

As the lowest member of the definability hierarchy, the C language has a low expressive power, with the consequence that the associated critical threshold is near the low end of the of the qualitative complexity scale. In particular, the C language cannot be used to define fuzzy concepts. Thus, its use to define concepts which, in reality, are fuzzy concepts, leads to counterintuitive conclusions. An example is the conventional C-language-based definition of stability. Since stability, in general, is a matter of degree, its definition as a crisp concept leads to paradoxes similar to the ancient Greek sorites paradox. To define stability as a fuzzy concept—which in reality it is—what is needed is PNL. The same applies to the concept of causality. Thus, causality can be defined as a crisp concept only for complexity classes which lie close to the low end of the qualitative complexity scale.

Another important point is that almost every concept has some degree of amorphicity, with a concept such as causality being amorphic to a high degree. But even such basic concepts as volume, density, edge, derivative and optimality have domains of amorphicity which are apparent in many real-world settings. What this implies is that many basic concepts may require redefinition in terms of PNL.

Does PNL have a significant role to play in devising metrics of performance for intelligent systems? This is an issue that is not addressed in the brief sketch of the theory of hierarchical definability. But I have no doubt that it will, since the concept of intelligence is much too complex to lend itself to analysis through the use of existing bivalent-logic-based methods.

* Professor in the Graduate School and Director, Berkeley Initiative in Soft Computing (BISC), Computer Science Division and the Electronics Research Laboratory, Department of EECS, University of California, Berkeley, CA 94720-1776; Telephone: 510-642-4959; Fax: 510-642-1712; E-mail: zadeh@cs.berkeley.edu. Research supported in part by ONR Contract N00014-00-1-0621, ONR Contract N00014-99-C-0298, NASA Contract NCC2-1006, NASA Grant NAC2-117, ONR Grant N00014-96-1-0556, ONR Grant FDN0014991035, ARO Grant DAAH 04-961-0341 and the BISC Program of UC Berkeley.

Send mail to the author : (zadeh@cs.berkeley.edu)

## Intractability Principle and the Concept of Z-Hardness

(ARO) DAAH 04-96l-034l, BISC Program of UC Berkeley, (NASA) NCC2-l006, (NASA) NAC2-ll7, (ONR) N000l4-99-C-0298, (ONR) NOOOl4-96-l-0556, and (ONR) FDNOOl499l035

Almost all scientific theories are based on Aristotelian logic and probability theory. Use of these theories has led to brilliant successes that are visible to all. But what is less visible is that alongside the brilliant successes lie many problem areas where progress has been slow or nonexistent. We cannot automate driving in city traffic, we cannot construct programs that can summarize non-stereotypical stories, and our capacity to do economic forecasting without human intervention leaves much to be desired.

To understand the reasons for the mixed picture, it is germane to start with the thesis that, in general, a system for performing a specified task may be viewed as having three principal components: (1) hardware; (2) software; and (3) what may be called brainware--a complex of concepts, theories, methods, and algorithms that govern the functioning of hardware and software.

The capability of a system to perform a specified task is a function of the capabilities of its hardware, software, and brainware components. In general, there is a tradeoff between these capabilities. However, there are important classes of problems in which the overall capability is limited primarily by the structure of brainware and, more particularly, by the underlying systems for logical reasoning and dealing with uncertainty.

What may be called the Intractability Principle is a thesis that is focused on problems that are solvable by humans, but are difficult for machines. More specifically, the principle relates machine-solvability to the structure of brainware. A basic concept that underlies the principle is that of a qualitative complexity scale, or QCS for short. The low end of QCS consists of problem/tasks that are easy to solve/perform. The opposite holds for the high end. An M-scale is a QCS that applies to machines; an H-scale applies to humans. For simplicity, QCS is assumed to be linearly ordered; it can be partially ordered, if needed. An example of M-scale is the automation of driving a car. At the low end is the problem of automation of driving on a freeway with no traffic. At the high end is the automation of driving in Istanbul.

Stated informally, the thesis has two parts, (1) negative, and (2) positive, which are summarized in the following. (1) As the complexity of a problem increases, a critical threshold is reached beyond which a solution cannot be achieved through the use of techniques based on two-valued logical systems and probability theory. On an M-scale, problems that lie to the right of the critical threshold are said to be Z-hard. (2) Beyond the critical threshold, achievement of a solution necessitates the use of fuzzy-logic-based methodology of computing with words (CW) and the perception-based theory of probabilistic reasoning (PTp).

The basic idea underlying the Intractability Principle is that problems that lie beyond the critical threshold are intractable by conventional measurement-based methods. What is not widely recognized is that many problems that are simple for humans fall into this category. Here are a few examples of Z-hard problems.

(1) Automation of driving a car. In this case, as stated already, on one end of the complexity scale we have automation of driving on a freeway with no traffic. On the other end is automation of driving in Istanbul. Humans can drive in Istanbul without any measurements or any computations. At this juncture, no conceivable system can solve the problem of automation of driving in Istanbul.

(2) Machine-execution of the instruction: Make a right turn at the intersection. In this case, at one end of the scale we have an intersection on the outskirts of Princeton. On the other end, an intersection in the center of New York.

(3) Summarization. On one end of the scale is a short stereotypical story. On the other end is a book. As a task, summarization is an order of magnitude more complex than machine translation. In (1), (2), and (3), problems at the end of the scale are Z-hard.

There are two key points that require comment. First, in most real-world problems that are simple for humans, the critical threshold is not a remote limit that is of no concern. On the contrary, it underscores the intrinsic limitation of techniques based on Aristotelian logic and probability theory to deal with problems in which decision-relevant information is, for the most part, perception-based.

The second point relates to the pivotal role of the methodology of computing with words in making it possible to solve problems that lie beyond the critical threshold. The crux of the matter is that in most problems that lie beyond the critical threshold, perception-based information is described by propositions expressed in a natural language, e.g., "traffic is usually very heavy in late afternoon." In general, such propositions are f-granular in the sense that (1) the boundaries of perceived classes are unsharp; and (2) the values of attributes are granulated, with a granule being a clump of values drawn together by indistinguishability, similarity, proximity, and functionality. The problem with theories based on Aristotelian logic and probability theory is that they do not have the capability to deal with f-granular propositions drawn from a natural language. In large measure, the importance of computing with words derives from the fact that it does provide this capability.

A key concept in computing with words is that of Precisiated Natural Language (PNL). Basically, PNL is a subset of a natural language, NL, which consists of propositions that are precisiable through translation into what is called the Generalized Constraint Language, GCL. PNL serves an important function as a definition language, with the language of fuzzy if-then rules being a sublanguage of PNL.

Computing with words provides a basis for an important generalization of probability theory--from standard probability theory, PT, to perception-based probability theory, PTp. Separately and in combination, computing with words and perception-based theory of probabilistic reasoning open the door to a solution of problems that lie beyond the critical threshold. This is the principal rationale for the positive thesis of the Intractability Principle.

* Professor in the Graduate School and Director, Berkeley Initiative in Soft Computing (BISC), Computer Science Division, Department of EECS, University of California, Berkeley, CA 94720-l776; Telephone: 5l0-642-4959; Fax: 5l0-642-l7l2; E-mail: zadeh@cs.berkeley.edu. Research supported in part by ONR Contract N00014-99-C-0298, NASA Contract NCC2-1006, NASA Grant NAC2-1l7, ONR Grant NOOO14-96-1-0556, ONR Grant FDNOO14991035, ARO Grant DAAH 04-961-0341, and the BISC Program of UC Berkeley.

Send mail to the author : (zadeh@cs.berkeley.edu)

## It is a Fundamental Limitation to Base Probability Theory on Bivalent Logic

Probability theory has long played—and is continuing to play—the role of the principal tool for dealing with problems in which uncertainty is a significant factor. The history of probability theory is a history of important advances in our understanding of decision-making in uncertain environments. But what we see alongside the brilliant successes are problem areas in which progress has been elusive. An example is the class of problems in which probabilities, utilities and relations are ill-defined in ways that put such problems well beyond the reach of existing methods.

A thesis that is put forth in this paper is that standard probability theory, call it PT, has fundamental limitations—limitations which are rooted in the fact that PT is based on bivalent logic. It is this thesis that underlies the radically-sounding title of the paper: It is a Fundamental Limitation to Base Probability Theory on Bivalent Logic.

The principal rationale for this thesis is that the conceptual structure of PT is directed at addressing the partiality of certainty but not the partiality of possibility and, most importantly, the partiality of truth. A direct consequence of these omissions is that PT lacks the capability to deal with information which is not just partially certain but also partially possible and/or partially true. An example is: Most analysts believe that it is very unlikely that there will be a significant increase in the price of oil in the near future.

The principal negative consequences of basing PT on bivalent logic are the following.

(1) Brittleness (discontinuity.) Almost all concepts in PT are bivalent in the sense that a concept, C, is either true or false, with no partiality of truth allowed. For example, events A and B are either independent or not independent. A process, P, is either stationary or nonstationary, and so on. An example of brittleness is: If all A’s are B’s and all B’s are C’s, then all A’s are C’s; but if almost all A’s are B’s and almost all B’s are C’s, then all that can be said is that the proportion of A’s in C’s is between 0 and 1. In particular, brittleness of independence creates serious problems in construction of Bayesian nets since only an epsilon separates independence from dependence.

(2) The dilemma of “it is possible but not probable.” A simple version of this dilemma is the following. Assume that A is a proper subset of B and that the Lebesgue measure of A is arbitrarily close to the Lebesgue measure of B. Now, what can be said about the probability measure, P(A), given the probability measure P(B)? The only assertion that can be made is that P(A) lies between 0 and P(B). The uniformativeness of this assessment of P(A) leads to counterintuitive conclusions. For example, suppose that with probability 0.99 Robert returns from work within one minute of 6pm. What is the probability that he is home at 6pm? Using PT, with no additional information or the use of the maximum entropy principle, the answer is: between 0 and 1. This simple example is an instance of a basic problem of what to do when we know what is possible but cannot assess the associated probabilities or probability distributions. A case in point relates to assessment of the probability of a worst case scenario.

(3) Incapability to operate on perception-based information. This is the most serious limitation of PT. It is rooted in the fact that much of human knowledge is perception-based, e.g., “Most Swedes are tall;” “It is foggy in San Francisco during the summer,” “It is unlikely to be warm tomorrow.” and “Usually Robert returns from work at about 6pm.”

Perceptions are intrinsically imprecise, reflecting the bounded ability of sensory organs and, ultimately, the brain, to resolve detail and store information. More concretely, perceptions are f-granular in the sense that (a) the boundaries of perceived classes are unsharp; and (b) the values of perceived attributes are granulated, with a granule being a clump of values drawn together by indistinguishability, similarity, proximity or functionality.

F-granularity of perceptions puts them well beyond the reach of conventional meaning-representation methods based on bivalent logic. As a consequence, PT, by itself, lacks the capability to operate on perception-based information: As an illustration, PT cannot provide an answer to the query: What is the probability that Robert is home at about t pm, given the perception-based information that (a) usually Robert leaves his office at about 5:30 pm; and (b) usually it takes about 30 minutes to reach home.

What can be done to endow probability theory with the capability to operate on perception-based information? In a recent paper entitled “Toward a Perception-Based Theory of Probabilistic Reasoning with Imprecise Probabilities,” (Journal of Statistical Planning and Inference), I outlined an approach to a restructuring of probability theory which adds this capability to PT. Briefly, the proposed restructuring involves three stages of generalization: (a) f-generalization; (b) f.g-generalization; and (c) nl-generalization. More specifically:

(a) F-generalization involves a progression from crisp sets to fuzzy sets in PT, leading to a generalization of PT which is denoted as PT+. In PT+, probabilities, functions, relations, measures and everything else are allowed to have fuzzy denotations, that is, be a matter of degree. In particular, probabilities described as low, high, not very high, etc. are interpreted as labels of fuzzy subsets of the unit interval or, equivalently, as possibility distributions of their numerical values.

(b) F.g-generalization involves fuzzy granulation of variables, functions, relations, etc., leading to a generalization of PT which is denoted as PT++. By fuzzy granulation of a variable, X is meant a partition of the range of X into fuzzy granules, with a granule being a clump of values of X which are drawn together by indistinguishability, similarity, proximity or functionality. Membership functions of such granules are usually assumed to be triangular or trapezoidal. Basically, granularity reflects the bounded ability of the human mind to resolve detail and store information.

(c) Nl-generalization involves an addition to PT++ of a capability to operate on propositions expressed in a natural language, with the understanding that such propositions serve as descriptors of perceptions. Nl-generalization of PT leads to perception-based probability theory denoted as PTp. By construction, PTp is needed to answer the query in the Robert example: What is the probability that Robert is home at about t pm?

The principal thrust of what was said above may be viewed as a call for a recognition that standard probability theory has serious limitations—limitations which are rooted in its bivalent base and the fundamental conflict between bivalence and reality. What is needed to circumvent these limitations is a restructuring of the foundations of probability theory—a restructuring aimed principally at endowing probability theory with the capability to operate on perception-based information.

* Professor in the Graduate School and Director, Berkeley Initiative in Soft Computing (BISC), Computer Science Division and the Electronics Research Laboratory, Department of EECS, University of California, Berkeley, CA 94720-1776; Telephone: 510-642-4959; Fax: 510-642-1712; E-mail: zadeh@cs.berkeley.edu. Research supported in part by ONR Contract N00014-00-1-0621, ONR Contract N00014-99-C-0298, NASA Contract NCC2-1006, NASA Grant NAC2-117, ONR Grant N00014-96-1-0556, ONR Grant FDN0014991035, ARO Grant DAAH 04-961-0341 and the BISC Program of UC Berkeley.

Send mail to the author : (zadeh@cs.berkeley.edu)

## The BISC Initiative: Fuzzy Logic and the Internet (FLINT); Perception based Information Processing and Analysis

Masoud Nikravesh1 and Dr. Bahram Parvin2
Berkeley Initiative in Soft Computing and BTExact Technologies (British Telecom)

This project describes the need for an initiative to design an intelligent search engine based on two main motivations:

The web environment is, for the most part, unstructured and imprecise. To deal with information in the web environment, we need a logic that supports modes of reasoning that are approximate rather than exact. While searches may retrieve thousands of hits, finding decision-relevant and query-relevant information in an imprecise environment is a challenging problem, which has to be addressed. Another less obvious issue is deduction in an unstructured and imprecise environment given the huge stream of complex information.

As a result, intelligent search engines with growing complexity and technological challenges are currently being developed. This requires new technology in terms of understanding, development, engineering design, and visualization. While the technological expertise of each component becomes increasingly complex, there is a need for better integration of each component into a global model adequately capturing the imprecision and deduction capabilities.

The initiative will bring an integrated approach to perception-based information processing and retrieval, including intelligent search engines by integrating the various components and achievements of its team members. The objective of this initiative is to develop an intelligent computer system with deductive capabilities to conceptually match and rank pages based on predefined linguistic formulations and rules defined by experts or based on a set of known homepages. The Conceptual Fuzzy Set (CFS) model will be used for intelligent information and knowledge retrieval through conceptual matching of both text and images (here defined as "Concept"). The selected query doesn't need to match the decision criteria exactly, which gives the system a more human-like behavior. The CFS can also be used for constructing fuzzy ontology or terms related to the context of search or query to resolve the ambiguity. We intend to combine the expert knowledge with soft computing tools of Berkeley groups. Expert knowledge needs to be partially converted into artificial intelligence that can better handle the huge information stream. In addition, sophisticated management work-flow needs to be designed to make optimal use of this information. We believe our current team is unique in the world of perception-based information processing and analysis in tackling this problem. We intend no less than changing the face and practice of the intelligent search engines for complex unstructured dynamic systems and imprecise environments such as the Internet. The new model can execute conceptual matching dealing with context-dependent word ambiguity and produce results in a format that permits the user to interact dynamically to customize and personalized its search strategy.

[1]
M. Nikravesh and B. Azvine, FLINT 2001, New Directions in Enhancing the Power of the Internet, UC Berkeley Electronics Research Laboratory, Memorandum No. UCB/ERL M01/28, August 2001.
[2]
L. Vincenzo, M. Nikravesh, and L. A. Zadeh, Journal of Soft Computing, Special Issue: Fuzzy Logic and the Internet, Springer Verlag (to appear).
[3]
M. Nikravesh, Fuzzy Logic and Internet: Perception Based Information Processing and Retrieval, Berkeley Initiative in Soft Computing, Report No. 2001-2-SI-BT, September 2001.
[4]
M. Nikravesh, BISC and The New Millennium, Perception-based Information Processing, Berkeley Initiative in Soft Computing, Report No. 2001-1-SI, September 2001.
[5]
M. Nikravesh and L. A. Zadeh, Perception Based Information Processing and Retrieval Application to User Profiling, Berkeley-BT project, 2000-2004.
1Staff
2LBNL

Send mail to the author : (nikravesh@cs.berkeley.edu)

## The BISC Initiative: The BISC Decision Support System

Masoud Nikravesh1 and Dr. Bahram Parvin2
Berkeley Initiative in Soft Computing

This project describes the need for an initiative for intelligent real-time automated decision-making and management based on two main motivations.

First, in recent years, a decline in revenue, need for more cost effective strategy, and multicriteria and multiattribute optimization in an imprecise and uncertain environment, have emphasized the need for risk and uncertainty management in the distributed and complex dynamic systems. Second, there exists an ever-increasing need to improve technology that provides a global solution to modeling, understanding, analyzing, and managing imprecision and risk in real-time automated decision-making for complex distributed dynamic systems.

As a result, intelligent distributed dynamic systems with growing complexity and technological challenges are currently being developed. This requires new technology in terms of development, engineering design, and virtual simulation models. Each of these components adds to the global sum of uncertainty about risk during the decision making process. While the technological expertise of each component becomes increasingly complex, there is a need for better integration of each component into a global model, adequately capturing the uncertainty on key strategic parameters. The uncertainty quantification on such key parameters is required in any type of decision analysis.

The initiative will bring an integrated approach to real-time automated and intelligent management decision making by integrating the various components and achievements of its team members. As the turnaround time for a decision and management teams becomes increasingly shorter, management decisions on new lines of product or service or to find a new alternative solution become increasingly complex, given the huge stream of often imprecise information. We intend to combine the expert knowledge with soft computing tools of UC Berkeley groups. Expert knowledge is important in decision making and management, but is becoming increasingly complex, time consuming, and expensive. Moreover, expertise from various disciplines is required and needs to be integrated to provide a global solution.

Therefore, expert knowledge needs to be partially converted into artificial intelligence that can better handle the huge information stream and management in making more sound decisions. In addition, sophisticated decision making and management work-flow need to be designed to make optimal use of this information. We believe our current team is unique in the world of intelligent decision making in tackling this problem. We intend no less than changing the face and practice of the real-time automated decision making process for complex dynamic systems.

The BISC decision support system key features are:

• Intelligent tools to assist decision makers in assessing the consequences of decisions made in an environment of imprecision, uncertainty, and partial truth, and providing a systematic risk analysis
• Intelligent tools to be used to assist decision makers in answering "what if" questions, examining numerous alternatives very quickly, and finding the value of the inputs to achieve a desired level of output
• Intelligent tools to be used with human interaction and feedback to achieve a capability of learning and adapting through time

BISC-DSS and Autonomous Multi-Agent System:

A key component of any autonomous multi-agent system--especially in an adversarial setting--is the decision module, which should be capable of functioning in an environment of imprecision, uncertainty, and imperfect reliability. BISC-DSS will be focused on the development of such a system and can be used as a decision-support system for ranking of decision alternatives. BISC-DSS can be used:

• As a global optimizer for planning decisions in a distributed environment
• To facilitate the solution of complex problems by a group of autonomous agents by speeding up the process of decision making, collaboration, and sharing the information, goals, and objectives
• To intelligently allocate resources given the degree of match between objectives and resources available
• To assist autonomous multi-agent systems in assessing the consequences of a decision made in an environment of imprecision, uncertainty, and partial truth, and providing a systematic risk analysis
• To assist multi-agent systems in answering "what if" questions, examining numerous alternatives very quickly, ranking of decision alternatives, and finding the value of the inputs to achieve a desired level of output

Other Potential Applications:

The areas in which the methodology could be implemented are diverse. Following are possible potentials outside the autonomous multi-agent system.

BISC-DSS can be integrated into TraS toolbox to develop the Intelligent Tracking System (ITraS). Given the information about suspicious activities such as phone calls, emails, meetings, credit card information, hotel, and airline reservations that are stored in a database containing the originator, recipient, locations, times, etc., we can use BISC-DSS and visual data mining to find suspicious patterns in data using geographical maps. The technology developed can detect unusual patterns, raise alarms based on classification of activities, and offer explanations based on automatic learning techniques for why a certain activity is placed in a particular class such as "safe," "suspicious," "dangerous," etc. The underlying techniques can combine expert knowledge and data-driven rules to continually improve its classification and adapt to dynamic changes in data and expert knowledge.

BISC-DSS can be integrated into a fuzzy conceptual set toolbox to develop a new tool for intelligent knowledge management and discovery (TIKManD). The model can be used to recognize terrorism activities through data fusion and mining, and pattern recognition technology, given online textual information through email or homepages and voice information given the wire tapping and/or chat lines or huge number of "tips" received immediately after the attack.

[1]
M. Nikravesh, BISC and the New Millennium, Perception-based Information Processing, Berkeley Initiative in Soft Computing, Report No. 2001-1-SI, September 2001.
1Staff
2LBNL

Send mail to the author : (nikravesh@cs.berkeley.edu)

## Fuzzy Conceptual Matching: Tool for Intelligent Knowledge Management and Discovery in the Internet

Masoud Nikravesh1
Berkeley Initiative in Soft Computing and BTExact Technologies (British Telecom)

Given the ambiguity and imprecision of the "concept" in the Internet, which may be described with both textual and image information, the use of Fuzzy Conceptual Matching (FCM) is a necessity for search engines. In the FCM approach, the "concept" is defined by a series of keywords with different weights depending on the importance of each keyword. Ambiguity in concepts can be defined by a set of imprecise concepts. Each imprecise concept, in fact, can be defined by a set of fuzzy concepts. The fuzzy concepts can then be related to a set of imprecise words given the context. Imprecise words can then be translated into precise words given the ontology and ambiguity resolution through a clarification dialog. By constructing the ontology and fine-tuning the strength of links (weights), we could construct a fuzzy set to integrate piecewise the imprecise concepts and precise words to define the ambiguous concept.

In this project, first we will present the role of fuzzy logic in the Internet. Then we will present an intelligent model that can mine the Internet to conceptually match and rank homepages based on predefined linguistic formulations and rules defined by experts or based on a set of known homepages. The FCM model will be used for intelligent information and knowledge retrieval through conceptual matching of both text and images (here defined as "concept"). The FCM can also be used for constructing fuzzy ontology or terms related to the context of the query and search to resolve the ambiguity.

[1]
M. Nikravesh and L. A. Zadeh, Perception Based Information Processing and Retrieval Application to User Profiling, Berkeley-BT project, 2000-2004.
[2]
M. Nikravesh and B. Azvine, FLINT 2001, New Directions in Enhancing the Power of the Internet, UC Berkeley Electronics Research Laboratory, Memorandum No. UCB/ERL M01/28, August 2001.
[3]
L. Vincenzo, M. Nikravesh, and L. A. Zadeh, Journal of Soft Computing, Special Issue: Fuzzy Logic and the Internet, Springer Verlag (to appear).
[4]
M. Nikravesh, Fuzzy Logic and Internet: Perception Based Information Processing and Retrieval, Berkeley Initiative in Soft Computing, Report No. 2001-2-SI-BT, September 2001.
[5]
M. Nikravesh, BISC and The New Millennium, Perception-based Information Processing, Berkeley Initiative in Soft Computing, Report No. 2001-1-SI, September 2001.
1Staff

Send mail to the author : (nikravesh@cs.berkeley.edu)

## Fuzzy Query and Ranking with Application to Search Engines

Masoud Nikravesh1
Berkeley Initiative in Soft Computing and and BTExact Technologies (British Telecom)

Ranking/scoring is used to make billions of financing decisions each year serving an industry worth hundreds of billions of dollars. To a lesser extent, hundreds of millions of applications were processed by US universities, resulting in over 15 million college enrollments in 2000, and a total revenue/expenditure of over \$250 billion. College enrollments are expected to reach over 17 million by year 2010, a total revenue/expenditure of over \$280 billion.

Credit scoring was first developed in the 1950s and has been used extensively in the last two decades. In the early 1980s, the three major credit bureaus, Equitax, Experian, and TransUnion, worked with the Fair Isaac Company to develop generic scoring models that allow each bureau to offer an individual score based on the contents of the credit bureau's data. FICO is used to make billions of financing decisions each year, serving a hundred-billion-dollar industry. Credit scoring is a statistical method to assess an individual's credit worthiness and the likelihood that the individual will repay his/her loans based on his/her credit history and current credit accounts. The credit report is a snapshot of the credit history and the credit score is a snapshot of the risk at a particular point in time. Since 1995, scoring has made its biggest contribution in the world of mortgage lending. Mortgage investors such as Freddie Mac and Fannie Mae, the two main government-chartered companies that purchase billions of dollars of newly originated home loans annually, endorsed Fair Isaac credit bureau risk, ignoring subjective consideration, and agreed that lenders should also focus on other outside factors when making a decision.

When you apply for credit, whether it's a new credit card, car, student loan, mortgage, or financing, about 40 pieces of information from your credit card report are fed into the model. That model provides a numerical score designed to predict your risk as a borrower. When you apply for university/college admission, more than 20 pieces of information from your application are fed into the model. That model provides a numerical score designed to predict your success rate and risk as a student to be admitted.

In this project, we will introduce fuzzy query and ranking to predict the risk in an ever-changing world and imprecise environment, including subjective consideration for several applications including credit scoring, university admission, and locating your favorite restaurant. Fuzzy query and ranking is robust, provides better insight and a bigger picture, contains more intelligence about an underlying pattern in data, and provides the ability of flexible querying and intelligent searching. This greater insight makes it easy for users to evaluate the results related to the stated criterion and makes a decision faster with improved confidence. It is very useful for multiple criteria, and when users want to vary each criterion independently with different confidence degrees or weighting factors. In the case of crisp queries, we can make multi-criterion decisions and ranking where we use the function AND and OR to aggregate the predicates. In the extended Boolean model or fuzzy logic, one can interpret the AND as fuzzy-MIN and OR as fuzzy-MAX functions. Fuzzy querying and ranking is a very flexible tool in which linguistic concepts can be used in the queries and ranking in a very natural form. In addition, the selected objects do not need to match the decision criteria exactly, which gives the system a more human-like behavior. Incorporating an electronic intelligent knowledge-based search engine, the results will be in a format that permits the user to interact dynamically with the contained database and to customize and add information to the database. For instance, it will be possible to test an intuitive concept by dynamic interaction between software and the human mind. This will provide the ability to answer "what if?" questions in order to decrease uncertainty and provide a better risk analysis and, for example, to increase the chance for either admission or increasing score.

[1]
M. Nikravesh, "Billions of Transactions for an Industry Worth Hundreds of Billions of Dollars," Intelligent Processing and Manufacturing of Materials,, Vancouver, Canada, July-August 2001.
[2]
M. Nikravesh, "Credit Scoring for Billions of Financing Decisions," IFSA/NAFIPS Int. Conf. Fuzziness and Soft Computing in the New Millenium, Vancouver, Canada, July 2001.
[3]
M. Nikravesh and B. Azvine, FLINT 2001, New Directions in Enhancing the Power of the Internet, UC Berkeley Electronics Research Laboratory, Memorandum No. UCB/ERL M01/28, August 2001.
[4]
L. Vincenzo, M. Nikravesh, and L. A. Zadeh, Journal of Soft Computing, Special Issue: Fuzzy Logic and the Internet, Springer Verlag (to appear).
[5]
M. Nikravesh, BISC and The New Millennium, Perception-based Information Processing, Berkeley Initiative in Soft Computing, Report No. 2001-1-SI, September 2001.
1Staff

Send mail to the author : (nikravesh@cs.berkeley.edu)

## The BISC Initiative: The Stanford-Berkeley Initiative (BISC-SCRF); Intelligent Reservoir Modeling for Optimized Asset Management Decision Making

Masoud Nikravesh1, Jef Caers 2, Andre G. Journel 3, Dr. Bahram Parvin4, and Dr. Fred Aminzadeh5
Berkeley Initiative in Soft Computing and Stanford Center for Reservoir Forecasting

This project describes the need for an initiative in reservoir modeling and management based on two main motivations:

• A recent decline in recovery factors and reserve replacement ratios, combined with a volatile oil market and a steady rise in world oil production that emphasize the need for risk and uncertainty management in oil exploration and production.
• An ever-increasing need to improve necessary technology that provides an efficient solution to modeling, understanding, analyzing, and managing uncertainty and risk in oil exploration and production. Such improvements are required in both the business related issues and earth sciences areas.

As a result, reservoirs with growing geological and economical challenges are currently being explored and produced. This requires the application of new technologies and intelligent data and knowledge integration methods to achieve necessary efficiency and E&P cost reduction.

While the technological expertise of each component (geology, geophysics, reservoir engineering) becomes increasingly interrelated, there is a need for better integration of different components into a comprehensive reservoir model, adequately capturing the uncertainty on key strategic reservoir parameters. The uncertainty quantification on such key parameters is required in any type of decision analysis.

The initiative will bring an integrated approach to reservoir modeling and reservoir management decision making by integrating the various components and achievements of its team members. As the turnaround time for a reservoir modeling team becomes increasingly shorter, management decisions on new wells or new fields become increasingly complex, given the huge stream of often imprecise and uncertain information.

We intend to combine the reservoir expert knowledge of the Stanford groups with soft computing tools of UC Berkeley groups. Expert knowledge is important in reservoir management, but is becoming increasingly complex, time consuming, and expensive. Moreover, expertise from various disciplines is required and needs to be integrated to provide a global solution.

Therefore, expert knowledge needs to be partially converted into artificial intelligence that can better handle the huge information stream and partially automate the process of modeling and decision-making. Integration of knowledge developed by each group will lead to new systems and work-flows for better, faster, and more efficient reservoir management in terms of decision making.

Decision making under uncertainty can only work if each uncertainty component has been properly accounted for in a global solution to reservoir modeling and management. We believe our current team is unique in the world of reservoir modeling in tackling this problem.

[1]
P. Wong, F. Aminzadeh, and M. Nikravesh, "Soft Computing for Reservoir Characterization and Modeling," Studies in Fuzziness and Soft Computing, Physica-Verlag/Springer-Verlag, Vol. 80, 2001.
[2]
M. Nikravesh, F. Aminzadeh, and L. A. Zadeh, Soft Computing and Intelligent Data Analysis in Oil Exploration, Elsevier (to appear).
[3]
M. Nikravesh and F. Aminzadeh, Reservoir Attributes and Properties for Intelligent Detection (RAPID), Elsevier (to appear).
[4]
M. Nikravesh, F. Aminzadeh, and L. A. Zadeh, "Soft Computing and Earth Sciences," Journal of Petroleum Science and Engineering, Special Issue I, Vol. 29, No. 3-4, May 2001 and Special Issue II, December 2001.
[5]
P. Wong and M. Nikravesh, "Soft Computing for Reservoir Characterization and Modeling," Special Issue, Journal of Petroleum Geology, December 2001.
1Staff
2Professor, Stanford University
3Professor, Stanford University
4Lawrence Berkeley National Laboratory
5Fact Inc.

Send mail to the author : (nikravesh@cs.berkeley.edu)

## The BISC Initiative: CW-CTP-PNL-PDA

Masoud Nikravesh1
Berkeley Initiative in Soft Computing

A New Direction in AI--Toward a Computational Theory of Perceptions:

Reflecting the bounded ability of the human brain to resolve detail, perceptions are intrinsically imprecise. In more concrete terms, perceptions are f-granular, meaning that (1) the boundaries of perceived classes are unsharp; and (2) the values of attributes are granulated, with a granule being a clump of values (points, objects) drawn together by indistinguishability, similarity, proximity, and functionality. F-granularity of perceptions puts them well beyond the reach of traditional methods of analysis based on predicate logic and/or probability theory. The computational theory of perceptions (CTP) that is outlined here adds to the armamentarium of AI a capability to compute and reason with perception-based information. The point of departure in CTP is the assumption that perceptions are described by propositions drawn from a natural language. In CTP, a proposition, p, is viewed as an answer to a question and the meaning of p is represented as a generalized constraint. To compute with perceptions, their descriptors are translated into what is called the Generalized Constraint Language (GCL). Then, a goal-directed constraint propagation is employed to answer a give query. A concept that plays a key role in CTP is that of precisiated natural language (PNL).

The computational theory of perceptions suggests a new direction in AI--a direction that may enhance the ability of AI to deal with real-world problems in which decision-relevant information is a mixture of measurements and perceptions. What is not widely recognized is that many important problems in AI fall into this category.

Precisiated Natural Language (PNL):

It is a deep-seated tradition in science to view the use of natural languages in scientific theories as a manifestation of mathematical immaturity. The rationale for this tradition is that natural languages are lacking in precision. However, what is not widely recognized is that adherence to this tradition carries a steep price--the inability to exploit the richness of natural languages in a way that lends itself to computation and automated reasoning.

In a significant departure from existing methods, the high expressive power of natural languages is harnessed by a process termed precisiation. In essence, if p is proposition in a natural language (NL), then precisiation of p results in a representation of the meaning of p in the form of what is referred to as a generalized constraint. With these constraints serving as basic building blocks, composite generalized constraints can be generated by combination, constraint propagation, modification, and qualification. The set of all composite generalized constraints and associated rules of generation and interpretation constitute the Generalized Constraint Languages (GCL). Translation from NL to GLC is governed by the constraint-centered semantics of natural languages (CSNL). Thus, through CSNL, GCL serves as precisiation language for NL. Precisiation Natural Language (PNL) is a subset of NL, which is equipped with constraint-centered semantics and is translatable into GLC. By construction, GCL is maximally expressive. In consequence, PNL is the largest subset of NL, which admits precisiation. The expressive power of PNL is far greater than that of conventional predicate-logic-based meaning-representation languages.

The concept of PNL opens the door to a significant enlargement of the role of natural languages in scientific theories and, especially, in information processing, decision, and control. In these and other realms, a particularly important function that PNL can serve is that of a concept definition language--a language that makes it possible to formulate precise definitions of new concepts and redefine those existing concepts that do not provide a good fit to reality.

Perception-Based Decision Analysis (PDA):

Decisions are based on information. More often than not, the information available is a mixture of measurements and perception. The problem with perceptions is that they are intrinsically imprecise. A concept that plays a key role in perception-based decision analysis is that of precisiated natural language, PNL. In PDA, precisiated natural language is employed to define the goals, constraints, relations, and decision-relevant information. An important sublanguage of PNL is the language of fuzzy if-then rules, FRL. In this language, a perception of a function, f, is described by a collection of fuzzy if-then rules. Such a collection is referred to as the fuzzy graph of function f. Employment of PNL in decision analysis adds an important capability--a capability to operate on perception-based information. The capability has the effect of substantially enhancing the ability of DA to deal with real-world problems. More fundamentally, the high expressive power of PNL opens the door to a redefinition of such basic concepts as optimality and causality. Perception-based decision analysis represents a significant change in direction in the evolution of decision analysis. As we move farther into the age of machine intelligence and automation of reasoning, the need for a shift from computing with numbers to computing with words, and from manipulation of measurements to manipulation of perceptions, will cease to be a matter of debate.

From Computing with Numbers to Computing with Words to Computation with Perceptions--A Paradigm Shift:

The theory put forth in this research is focused on the development of what is referred to as the computational theory of perceptions (CTP)--a theory that comprises a conceptual framework and a methodology for computing and reasoning with perceptions. The base for CTP is the methodology of computing with words (CW). In CW, the objects of computation are words and propositions drawn from a natural language. The point of departure in the computational theory of perceptions is the assumption that perceptions are described as propositions in a natural language. Furthermore, computing and reasoning with perceptions is reduced to computing and reasoning with words.

To be able to compute with perceptions it is necessary to have a means of representing their meaning in a way that lends itself to computation. Conventional approaches to meaning representation cannot serve this purpose because the intrinsic imprecision of perceptions puts them well beyond the expressive power of predicate logic and related systems.

In the computational theory of perceptions, representation of meaning is a preliminary to reasoning with perceptions--a process that starts with a collection of perceptions that constitute the initial data set (IDS) and terminates in a proposition or a collection of propositions that play the role of an answer to a query, that is, the terminal data set (TDS).

The principal aim of the computational theory of perceptions is the development of an automated capability to reason with perception-based information. Existing theories do not have this capability and rely instead on conversion of perceptions into measurements--a process that in many cases is infeasible, unrealistic, or counterproductive. In this perspective, addition of the machinery of the computational theory of perceptions to existing theories may eventually lead to theories that have a superior capability to deal with real-world problems and make it possible to conceive and design systems with a much higher MIQ (Machine IQ) than those we have today.

1Staff

Send mail to the author : (nikravesh@cs.berkeley.edu)

## Tool for Intelligent Knowledge Management and Discovery (TIKManD); Terrorism and Criminal Activity Recognition

Masoud Nikravesh1
Berkeley Initiative in Soft Computing

In this project, we will develop and deploy an intelligent computer system is called Tool for Intelligent Knowledge Management and Discovery (TIKManD). The system can mine Internet homepages, emails, chat lines, and/or authorized wire tapping information (which may include multi-lingual information) to recognize, conceptually match, and rank potential terrorist activities (both common and unusual) by the type and seriousness of the activities. This will be done automatically or semi-automatically based on predefined linguistic formulations and rules defined by experts or based on a set of known terrorist activities given the information provided through law enforcement databases (text and voices) and huge number of "tips" received immediately after the attack.

To use the wire tapped and/or chat information, we will use state-of-the-art technology to change voice to text. Given the fact that the information to be gathered will include both Arabic and Farsi information, we will use state-of-the-art Arabic-English and Farsi-English dictionaries and thesauruses. We intent to develop multi-lingual terrorism ontology and a dictionary that will be developed both through existing documents related to terrorism activities and defined by experts and will be linked to English ontology that will be built for terrorism recognition. An intelligent technique based on integrated Latent Semantic Indexing and Self Organization Map (LSI-SOM), developed at LBNL and UC Berkeley, will be used for this aspect of the project.

In addition, a Conceptual Fuzzy Set (CFS) model will be used for intelligent information and knowledge retrieval through conceptual matching of text and voice (here defined as "concept"). The CFS can be used for constructing fuzzy ontology or terms relating to the context of the investigation (terrorism) to resolve the ambiguity. This model can be used to calculate conceptually the degree of match to the object or query. In addition, the ranking can be used for intelligently allocating resources given the degree of match between objectives and resources available.

Finally, we will use our state-of-the-art technology as a tool for effective data fusion and data mining of textual and voice information. Data fusion would attempt to collect seemingly related data to recognize: (1) increased activities in a given geographic area, (2) unusual movements and travel patterns, (3) intelligence gathered through authorized wire tapping of suspected subgroups, (4) unexplainable increase or decrease in routine communication, and (5) integration of reports and voice (text/audio) from different sources.

Data mining would use the above data to pinpoint and extract useful information from the rest. This would also apply to the "response" part of the initiative on how to fuse the huge number of "tips" received immediately after the attack, filter out the irrelevant parts, and act on the useful subset of such data in a speedy manner.

For example, given data about movement of people of interest, we can use the proposed data analysis technology to identify suspicious patterns of behavior, raise alarms and do interactive data analysis on geographical maps of the world or specific countries. For email and telephone monitoring, standard search technology would not be suitable, as they are often short and non-standard, so the system will need significant prior knowledge about what to look for. Our recent work on fuzzy query, search engines, and ranking can be leveraged here.

Finally, in phase two we intend to design a secure and distributed IT infrastructure to provide a means for secure communication between resources and to build textual, voice, and image databases online. Given the distributed nature of the information sources, a federated database framework is required to distribute storage and information access across multiple locations.

Tracks the Terrorist (TrasT): phase one of this project includes: (1) text data mining, (2) ontology related to terrorism activities, (3) a tracking system, and (4) a federated data warehouse.

Text data mining and ontology: we intend to develop ontology and a dictionary that will be developed both through existing documents related to terrorism activities and defined by experts and will be linked to English thesauruses that will be used for terrorism recognition. An Intelligent technique based on integrated Latent Semantic Indexing and Self Organization Map (LSI-SOM) will be used in this project.

Tracking system: given the information about suspicious activities such as phone calls, emails, meetings, credit card information, hotel, and airline reservations that are stored in a database containing the originator, recipient, locations, times, etc. We can use visual data mining software developed at Btexact Technologies (UC Berkeley sponsor) to find suspicious patterns in data using geographical maps. The technology developed can detect unusual patterns, raise alarms based on classification of activities, and offer explanations based on automatic learning techniques for why a certain activity is placed in a particular class such as "safe," "suspicious," "dangerous," etc. The underlying techniques can combine expert knowledge and data driven rules to continually improve its classification and adapt to dynamic changes in data and expert knowledge.

The system is Java-based, web-enabled, and uses MapInfo (commercial mapping software) and Oracle database to display and access the data. It consists of two main subsystems: modeling and visualization. The modeling subsystem deals with data analysis and explanation generation, and visualization represents extracted patterns on geographical maps and enables visual interaction with the modeling back end. For example, US authorities have announced that 5 out of 19 suspected September 11 hijackers met in Las Vegas several days before the attack. Some of the terrorists attended the same training school, had communication tracks, and flew at the same time period to the same location in the same flight. In addition, some of the suspects were on the FBI watch list. The new model intends to capture and recognize such activities and raise alarms by profiling, e.g., spatial clustering.

Data warehouse architecture: we will leverage DOE security (funded through National Collaboratory) to provide a means for secure communication between resources and to build textual databases. Given the nature of the information sources, data warehouse architecture for information access across multiple locations is required.

1Staff

Send mail to the author : (nikravesh@cs.berkeley.edu)

## Perception-based Decision Analysis (PDA)

(ARO) DAAH 04-961-0341, BISC Program of UC Berkeley, (NASA) NAC2-117, (NASA) NCC2-1006, (ONR) N00014-99-C-0298, (ONR) N00014-96-1-0556, and (ONR) FDN0014991035

Born at the beginning of the second half of the last century, decision analysis, or DA for short, was the brainchild of von Neumann, Morgenstern, Wald, and other great intellects of that period. Decision analysis was, and remains, driven by a quest for a theory that is prescriptive, rigorous, quantitative, and precise. The question is: can this aim be achieved? A contention that is advanced in the following is that the aim is unachievable by a prolongation of existing theories. What is needed in decision analysis is a significant shift in direction--a shift from computing with numbers to computing with words and from manipulation of measurements to manipulation of perceptions.

Decisions are based on information. More often than not, the decision-relevant information is a mixture of measurements and perceptions--perceptions exemplified by: it is very unlikely that there will be a significant decrease in the price of oil in the near future. The problem with perceptions is that they are intrinsically imprecise, reflecting the bounded ability of the human mind to resolve detail and store information. More specifically, perceptions are f-granular in the sense that (1) the boundaries of perceived classes are unsharp; and (2) the values of perceived attributes are granular, with a granule being a clump of values drawn together by indistinguishability, similarity, proximity, or functionality. For example, a perception of likelihood may be described as "very unlikely," and a perception of gain as "not very high."

Existing decision theories have a fundamental limitation--they lack the capability to operate on perception-based information. To add this capability to an existing theory, T, three stages of generalization are required: (1) f-generalization, which adds to T the capability to operate on fuzzy sets, leading to a generalized theory denoted as T+; (2) f.g-generalization, which leads to a theory denoted as T++, and adds to T+ the capability to compute with f-granular variables, e.g., a random variable that takes the values small, medium, and large with respective probabilities low, high, and low; and (3) nl-generalization, which leads to a theory denoted as Tp, and adds to T++ the capability to operate on perception-based information expressed in a natural language.

A concept that plays a key role in perception-based decision analysis is that of precisiated natural language, PNL. Basically, PNL is a subset of a natural language, NL, which consists of propositions that can be precisiated through translation into a generalized constraint language. A generalized constraint is expressed as X isr R, where X is the constrained variable; R is the constraining relation; and r is a discrete-valued indexing variable whose value defines the way in which R constrains X. The principal types of constrains are: possibilistic (r = blank); veristic (r = v); probabilistic (r = p); random set (r = rs); fuzzy graph (r = fg); usuality (r = u); and Pawlak set (r = ps).

More general constraints may be generated by combination, modification, and qualification. The collection of all such constraints is the Generalized Constraint Language, GCL. By construction, GCL is maximally expressive. As a consequence, PNL is the largest subset of NL which is precisiable through translation into GCL. In general, X, R, and r are implicit rather than explicit. Thus, if p is a proposition in a natural language, then its translation into GCL involves explicitation of X, R and r in its translate, X isr R.

In PDA, precisiated natural language is employed to define the goals, constraints, relations, and decision-relevant information. An important sublanguage of PNL is the language of fuzzy if-then rules, FRL. In this language, a perception of a function, Y = f(X), is described by a collection of fuzzy if-then rules, e.g., if X is small then Y is small; if X is medium then Y is large; if X is large then Y is small. Such a collection is referred to as the fuzzy graph of f, f*.

For example, if X is a random variable, then a perception of its probability distribution may be described as: if X is small then its probability is low; if X is medium then its probability is very low; and if X is large then its probability is high. More generally, employment of PNL in decision analysis adds an important capability--a capability to operate on perception-based information. This capability has the effect of substantially enhancing the ability of DA to deal with real-world problems. More fundamentally, the high expressive power of PNL opens the door to a redefinition of such basic concepts as optimality and causality. In particular, what is called into question is the validity of conventional approaches in which optimization is equated to a maximization of a scalar objective function. It is argued that to achieve a close rapport with reality, the use of PNL in optimization is a matter of necessity rather than choice.

Perception-based decision analysis represents a significant change in direction in the evolution of decision analysis. As we move farther into the age of machine intelligence and automation of reasoning, the need for a shift from computing with numbers to computing with words, and from manipulation of measurements to manipulation of perceptions, will cease to be a matter of debate. A case in point is the following example--referred to as the Robert example. These are three progressively difficult versions of the example. The simplest version is:

Suppose that I need to call Robert by phone at home. What I know is that usually Robert returns from work at about 6:00 p.m. My questions are: (1) What is the probability that Robert is home at 6:30 p.m.? (2) What is the earliest time at which the probability that Robert is home is high? and (3) At 6:30 p.m., should I call Robert person-to-person or station-to-station, assuming that the costs are, respectively, 1 and 2.

* Professor in the Graduate School and Director, Berkeley Initiative in Soft Computing (BISC), Computer Science Division and the Electronics Research Laboratory, Department of EECS, Univeristy of California, Berkeley, CA 94720-1776; Telephone: 510-642-4959; Fax: 510-642-1712; E-Mail: zadeh@cs.berkeley.edu. Research supported in part by ONR Contract N00014-99-C-0298, NASA Contract NCC2-1006, NASA Grant NAC2-117, ONR Grant N00014-96-1-0556, ONR Grant FDN0014991035, ARO Grant DAAH 04-961-0341, and the BISC Program of UC Berkeley.

Send mail to the author : (zadeh@cs.berkeley.edu)

## Precisiated Natural Language--Toward a Radical Enlargement of the Role of Natural Languages in Information Processing, Decision, and Control

(ARO) DAAH 04-961-0341, BISC Program of UC Berkeley, (NASA) NAC2-117, (NASA) NCC2-1006, (ONR) N00014-99-C-0298, (ONR) N00014-96-1-0556, and (ONR) FDN0014991035

It is a deep-seated tradition in science to view the use of natural languages in scientific theories as a manifestation of mathematical immaturity. The rationale for this tradition is that natural languages are lacking in precision. However, what is not recognized to the extent that it should, is that adherence to this tradition carries a steep price. In particular, a direct consequence is that existing scientific theories do not have the capability to operate on perception-based information exemplified by "Most Finns are honest." Such information is usually described in a natural language and is intrinsically imprecise, reflecting a fundamental limitation on the cognitive ability of humans to resolve detail and store information. Because of their imprecision, perceptions do not lend themselves to meaning-representation through the use of precise methods based on predicate logic. This is the principal reason why existing scientific theories do not have the capability to operate on perception-based information.

In a related way, the restricted expressive power of predicate-logic-based languages rules out the possibility of defining many basic concepts such as causality, resemblance, smoothness, and relevance in realistic terms. In this instance, as in many others, the price of precision is over-idealization and lack of robustness.

In a significant departure from existing methods, in the approach described in this work, the high expressive power of natural languages is harnessed by constructing what is called a precisiated natural language (PNL).

In essence, PNL is a subset of a natural language (NL)--a subset that is equipped with constraint-centered semantics (CSNL) and is translatable into what is called the Generalized Constraint Language (GCL). A concept that has a position of centrality in GCL is that of a generalized constraint expressed as X isr R, where X is the constrained variable, R is the constraining relation, and isr (pronounced as ezar) is a variable copula in which r is a discrete-valued variable whose value defines the way in which R constrains X. Among the principal types of constraints are possibilistic, veristic, probabilistic, random-set, usuality, and fuzzy-graph constraints.

With these constraints serving as basic building blocks, more complex (composite) constraints may be constructed through the use of a grammar. The collection of composite constraints forms the Generalized Constraint Language (GCL). The semantics of GCL is defined by the rules that govern combination and propagation of generalized constraints. These rules coincide with the rules of inference in fuzzy logic (FL).

A key idea in PNL is that the meaning of a proposition, p, in PNL may be represented as a generalized constraint that is an element of GCL. Thus, translation of p into GCL is viewed as an explicitation of X, R and r. In this sense, translation is equivalent to explicitation.

The concept of a precisiated natural language, the associated methodologies of computing with words, and the computational theory of perceptions open the door to a wide-ranging generalization and restructuring of existing theories, especially in the realms of information processing, decision, and control. In this perspective, what is very likely is that in coming years a number of basic concepts and techniques drawn from linguistics will be playing a much more important role in scientific theories than they do today.

* Professor in the Graduate School and Director, Berkeley Initiative in Soft Computing (BISC), Computer Science Division and the Electronics Research Laboratory, Department of EECS, Univeristy of California, Berkeley, CA 94720-1776; Telephone: 510-642-4959; Fax: 510-642-1712; E-Mail: zadeh@cs.berkeley.edu. Research supported in part by ONR Contract N00014-99-C-0298, NASA Contract NCC2-1006, NASA Grant NAC2-117, ONR Grant N00014-96-1-0556, ONR Grant FDN0014991035, ARO Grant DAAH 04-961-0341, and the BISC Program of UC Berkeley.

Send mail to the author : (zadeh@cs.berkeley.edu)

## A Syllable, Articulatory-Feature, and Stress-Accent Model of Speech Recognition and Its Applications

Shuangyu Chang1
Berkeley Initiative in Soft Computing (BISC)

Current-generation automatic speech recognition (ASR) systems assume that words are readily decomposable into constituent phonetic components ("phonemes"). A detailed linguistic dissection of state-of-the-art speech recognition systems indicates that the conventional phonemic "beads-on-a-string" approach is of limited utility, particularly with respect to informal, conversational material. The study shows that there is a significant gap between the observed data and the pronunciation models of current ASR systems. It also shows that many important factors affecting recognition performance are not modeled explicitly in these systems.

Our research analyzes spontaneous speech with respect to three important, but often neglected, components of speech (at least with respect to English ASR). These components are articulatory-acoustic features (AFs), the syllable and stress accent. Analysis results provide evidence for an alternative approach of speech modeling, one in which the syllable assumes preeminent status and is melded to the lower as well as the higher tiers of linguistic representation through the incorporation of prosodic information such as stress accent. Using concrete examples and statistics from spontaneous speech material it is shown that there exists a systematic relationship between the realization of AFs and stress accent in conjunction with syllable position. This relationship can be used to provide an accurate and parsimonious characterization of pronunciation variation in spontaneous speech. An approach to automatically extract AFs from the acoustic signal is also developed, as is a system for the automatic stress-accent labeling of spontaneous speech.

Based on the results of these studies a syllable-centric, multi-tier model of speech recognition is proposed. The model explicitly relates AFs, phonetic segments and syllable constituents to a framework for lexical representation, and incorporates stress-accent information into recognition. A test-bed implementation of the model is developed using a fuzzy-based approach for combining evidence from various AF sources and a pronunciation-variation modeling technique using AF-variation statistics extracted from data. Experiments on a limited-vocabulary speech recognition task using both automatically derived and fabricated data demonstrate the advantage of incorporating AF and stress-accent modeling within the syllable-centric, multi-tier framework, particularly with respect to pronunciation variation in spontaneous speech. The multi-tier model and the preliminary implementation are also being extended to provide speech-based interface to specific applications in areas such as telephone directory assistance and internet information retrieval.

[1]
S. Chang, "A Syllable, Articulatory-Feature, and Stress-Accent Model of Speech Recognition," PhD dissertation, UC Berkeley, 2002.
[2]
S. Chang and S. Greenberg, "Application of Fuzzy-Integration-based Multiple-Information Aggregation in Automatic Speech Recognition," Proc. IEEE Conf. Fuzzy Information Processing, Beijing, March 2003.
1Postdoctoral Researcher

Send mail to the author : (shawnc@eecs.berkeley.edu)

## Standard Probability Theory Does not Have the Capability to Deal with Perception-based Information--the Robert Example

The Robert Example is named after my colleague and good friend, Robert Wilensky. The example is intended to serve as a test of the ability of standard probability theory (PT) to deal with perception-based information, e.g., "Usually Robert returns from work at about 6pm." An unorthodox view that is articulated in the following is that to add to PT the capability to process perception-based information it is necessary to generalize PT in three stages. The first stage, f-generalization, adds to PT the capability to deal with fuzzy probabilities and fuzzy events--a capability which PT lacks. The result of generalization is denoted as PT+.

The second stage, g-generalization, adds to PT+ the capability to operate on granulated (linguistic) variables and relations. Granulation plays a key role in exploiting the tolerance for imprecision for achieving robustness, tractability and data compression. G-generalization of PT+ or, equivalently, f.g-generalization of PT, is denoted as PT++.

The third stage, nl-generalization, adds to PT++ the capability to operate on information expressed in a natural language, e.g., "It is very unlikely that there will be a significant increase in the price of oil in the near future." Such information will be referred to as perception-based, and, correspondingly, nl-generalization of PT, PTp, will be referred to as perception-based probability theory. PTp subsumes PT as a special case.

The Robert Example is a relatively simple instance of problems which call for the use of PTp. Following is its description.

I want to call Robert in the evening, at a time when he is likely to be home. The question is: At what time, t, should I call Robert? The decision-relevant information is the probability, P(t), that Robert is home at about t pm.

There are three versions, in order of increasing complexity, of perception-based information which I can use to estimate P(t).

Version l. Usually Robert returns from work at about 6 pm.

Version 2. Usually Robert leaves his office at about 5:30 pm, and usually it takes about 30 minutes to get home.

Version 3. Usually Robert leaves office at about 5:30 pm. Because of traffic, travel time depends on when he leaves. Specifically, if Robert leaves at about 5:20 or earlier, then travel time is usually about 25 min.; if Robert leaves at about 5:30 pm, then travel time is usually about 30 min; if Robert leaves at 5:40 pm or later, travel time is usually about 35 min.

The problem is to compute P(t) based on this information. Using PTp, the result of computation would be a fuzzy number which represents P(t). A related problem is: What is the earliest time for which P(t) is high?

Solution of Version l using PTp is described in the paper "Toward a Perception-Based Theory of Probabilistic Reasoning with Imprecise Probabilities," which is scheduled to appear in a forthcoming issue of the Journal of Statistical Planning and Inference.

It is of interest to note that solution of a crisp version of Version l leads to counterintuitive results. Specifically, assume that with probability 0.9 Robert returns from work at 6 pm plus/minus l5 min; and that P(t) is the probability that Robert is home at exactly t pm. Then it is easy to verify that P(t)>0.9 for t>6:l5; P(t) is between 0 and l for 5:456:l5, but becomes indeterminate for t<6:l5. This phenomenon is an instance of what may be called the dilemma of "it is possible but not probable."

* Professor in the Graduate School and director, Berkeley initiative in Soft Computing (BISC), Computer Science Division and the Electronics Research Laboratory, Department of EECS, University of California, Berkeley, CA 94720-1776; Telephone: 510-642-4959; Fax: 510-642-1712;E-Mail: zadeh@cs.berkeley.edu. Research supported in part by ONR N00014-00-1-0621, ONR N00014-00-1-0621, ONR Contract N00014-99-C-0298, NASA Contract NCC2-1006, NASA Grant NAC2-117, ONR Grant N00014-96-1-0556, ONR Grant FDN0014991035, ARO Grant DAAH 04-961-0341 and the BISC Program of UC Berkeley.

Send mail to the author : (zadeh@cs.berkeley.edu)

## Toward a Computational Theory of Perceptions

Perceptions play a central role in human cognition. There are close links between perceptions, consciousness and our ability to reason and make decisions on both conscious and subconscious levels. There is an enormous literature dealing with perceptions from every conceivable point of view. And yet, what is not in existence is a theory which makes it possible to compute with perceptions. A framework for such a theory—the computational theory of perceptions (CTP)—is described.

In CTP, the point of departure is the human ability to describe perceptions in a natural language. Thus, in CTP a perception is equated to its description, e.g., “It was very warm during much of the early part of summer.”

Perceptions are intrinsically imprecise, reflecting the bounded ability of sensory organs and ultimately the brain, to resolve detail and store information. More specifically, perceptions are f-granular in the sense that (a) the boundaries of perceived classes are imprecise; and (b) the values of attributes are granular, with a granule being a clump of values drawn together by indistinguishability, similarity, proximity or functionality. F-granularity of perceptions puts them well beyond the reach of existing meaning-representation systems based on bivalent logic.

In CTP, the meaning of a perception, p, is represented through the use of what is referred to as precisiated natural language (PNL). In PNL, the meaning of p is expressed as a generalized constraint on a variable, X, which, in general, is implicit on p. A collection of rules governing generalized constraint propagation is employed for reasoning and computing with perceptions.

CTP adds to existing theories the capability to operate on perception-based information exemplified by “It is very unlikely that there will be a significant increase in the price of oil in the near future.” This capability provides a basis for extending the ability of existing theories—and especially probability theory—to deal with real-world problems.

* Professor in the Graduate School and Director, Berkeley initiative in Soft Computing (BISC), Computer Science Division and the Electronics Resear ch Laboratory, Department of EECS, University of California, Berkeley, CA 94720- 1776; Telephone: 510-642-4959; Fax: 510-642-1712; E-Mail: zadeh@cs.berkeley.edu.

Send mail to the author : (zadeh@cs.berkeley.edu)

## Toward a Perception-based Theory of Probabilistic Reasoning with F-Granular Probability Distributions

The point of departure in perception-based probability theory, PTp, is the postulate that subjective probabilities are not numerical—as they are assumed to be in the theory of subjective probabilities. Rather, subjective probabilities are perceptions of likelihood, similar to perceptions of length, distance, speed, age, taste resemblance, etc.

Reflecting the bounded ability of human sensory organs and, ultimately, the brain, to resolve detail and store information, perceptions are intrinsically imprecise. More specifically, perceptions are f-granular in the sense that (a) the mental boundaries of perceived classes are fuzzy, and (b) the perceived values of attributes are granular, with a granule being a fuzzy clump of values drawn together by indistinguishability, similarity, proximity, or functionality. For example, perceptions of age may be described as young, middle-aged, old, very old, etc.

An immediate consequence of the postulate that subjective probabilities are perceptions of likelihood, is the postulate that subjective probabilities are f-granular. Thus, the perceived values of the subjective probability of an event may be described as low, medium, high, very high, etc.

A pivotal concept in PTp is that of a f-granular probability distribution. More specifically, if X is a real-valued random variable with granules labeled small, medium and large, then its f-granular probability distribution may be expressed symbolically as P=low\small+high\medium+low\large, with the understanding that low, for example, is a perception of the probability that X is small.

In the main, PTp is a system for computing, reasoning, and decision-making with f-granular probability distributions. Conceptually and computationally, PTp is significantly more complex than standard, measurement-based probability theory, PT. The importance of PTp derives from the fact that, as a generalization of PT, it opens the door to addressing problems and issues which are beyond the reach of PT.

* Professor in the Graduate School and Director, Berkeley initiative in Soft Computing (BISC), Computer Science Division and the Electronics Research Laboratory, Department of EECS, University of California, Berkeley, CA 94720-1776; Telephone: 510-642-4959; Fax: 510-642-1712; E-Mail: zadeh@cs.berkeley.edu.

Send mail to the author : (zadeh@cs.berkeley.edu)

## Toward a Theory of Hierarchical Definability (THD)--Causality is Undefinable

(ARO) DAAH 04-961-0341, BISC Program of UC Berkeley, (NASA) NAC2-117, (NASA) NCC2-1006, (ONR) N00014-99-C-0298, (ONR) N00014-96-1-0556, and (ONR) FDN0014991035

Attempts to formulate mathematically precise definitions of basic concepts such as causality, randomness, and probability, have a long history. The concept of hierarchical definability that is outlined in this project suggests that such definitions may not exist. Furthermore, it suggests that existing definitions of many basic concepts, among them those of linearity stability, statistical independence, and Pareto-optimality, may be in need of reformulation.

In essence, definability is concerned with whether and how a concept, X, can be defined in a way that lends itself to mathematical analysis and computation. In mathematics, definability of mathematical concepts is taken for granted. But as we move farther into the age of machine intelligence and automated reasoning, the issue of definability is certain to grow in importance and visibility, raising basic questions that are not easy to resolve.

To be more specific, let X be the concept of, say, a summary, and assume that I am instructing a machine to generate a summary of a given article or a book. To execute my instruction, the machine must be provided with a definition of what is meant by a summary. It is somewhat paradoxical that we have summarization programs that can summarize, albeit in a narrowly prescribed sense, without being able to formulate a general definition of summarization. The same applies to the concepts of causality, randomness, and probability. Indeed, it may be argued that these and many other basic concepts cannot be defined within the conceptual framework of classical logic and set theory.

The point of departure in our approach to definability is the assumption that definability has a hierarchical structure. Furthermore, it is understood that a definition must be unambiguous, precise, operational, general, and co-extensive with the concept it defines.

In THD, a definition of X is represented as a quadruple (X, L, C, D), where L is the language in which X is defined; C is the context; and D is the definition. The context, C, delimits the class of instances to which D applies, that is, C defines the domain of D. For example, if X is the concept of volume, D may not apply to a croissant or a tree.

The language, L, is assumed to be a member of hierarchy represented as (NL, BL, F, F.G, PNL). In this hierarchy, NL is a natural language—a language which is in predominant use as a definition language in social sciences and law; and BL is the binary-logic-based mathematical language used in all existing scientific theories. Informally, a concept, X, is BL-definable if it is a crisp concept, e.g., a prime number, a linear system or a Gaussian distribution. F is a mathematical language based on fuzzy logic and F.G is an extension of F in which variables and relations are allowed to have granular structure. The last member of the hierarchy is PNL—Precisiated Natural Language. PNL is a maximally-expressive language which subsumes BL, F and F.G. In particular, the language of fuzzy if-then rules is a sublanguage of PNL.

X is a fuzzy concept if its denotation, D(X), is a fuzzy set in its universe of discourse. A fuzzy concept is associated with a membership function that assigns to each point, u, in the universe of discourse of X, the degree to which u is a member of D(X). Alternatively, a fuzzy concept may be defined algorithmically in terms of other fuzzy concepts. Examples of fuzzy concepts are: small number, strong evidence and similarity. It should be noted that many concepts associated with fuzzy sets are crisp concepts. An example is the concept of a convex fuzzy set. Most fuzzy concepts are context-dependent.

The next level in the hierarchy is that of F.G-definability, with G standing for granular, and F.G denoting the conjunction of fuzzy and granular. Informally, in the case of a concept which is F.G-granular, the values of attributes are granulated, with a granule being a clump of values that are drawn together by indistinguishability, similarity, proximity, or functionality. F.G-granularity reflects the bounded ability of the human mind to resolve detail and store information. An example of an F.G-granular concept that is traditionally defined as a crisp concept, is that of statistical independence. This is a case of misdefinition--a definition that is applied to instances for which the concept is not defined, e.g., fuzzy events. In particular, a common misdefinition is to treat a concept as if it were BL-definable, whereas in fact it is not. In THD, the concept of context serves to differentiate between "defined" and "definability."

The next level is that of PNL-definability. Basically, PNL consists of propositions drawn from a natural language that can be precisiated through translation into what is called precisiation language. An example of a proposition in PNL is: It is very unlikely that there will be a significant increase in the price of oil in the near future.

In the case of PNL, the precisiation language is the Generalized Constraint Language (GCL). A generic generalized constraint is represented as Z isr R, where Z is the constrained variable, R is the constraining relation and r is a discrete-valued indexing variable whose values define the ways in which R constrains Z. The principal types of constraints are: possibilistic (r = blank); veristic (r = v); probabilistic (r = p); random set (r = rs); usuality (r = u); fuzzy graph (r = fg); and Pawlak set (r = ps). The rationale for constructing a large variety of constraints is that conventional crisp constraints are incapable of representing the meaning of propositions expressed in a natural language--most of which are intrinsically imprecise--in a form that lends itself to computation.

The elements of GCL are composite generalized constraints that are formed from generic generalized constraints by combination, modification and qualification. An example of a generalized constraint in GCL is ((Z isp R) and (Z, Y) is S) is unlikely.

By construction, the Generalized Constraint Language is maximally expressive. What this implies is that PNL is the largest subset of a natural language that admits precisiation. Informally, this implication serves as a basis for the conclusion that if a concept, X, cannot be defined in terms of PNL, then, in effect, it is undefinable or, synonymously, amorphic.

In this perspective, the highest level of definability hierarchy, which is the level above PNL-definability, is that of undefinability or amorphicity. A canonical example of an amorphic concept is that of causality. More specifically, is it not possible to construct a general definition of causality such that given any two events A and B and the question, "Did A cause B?", the question could be answered based on the definition. Equivalently, given any definition of causality, it will always be possible to construct examples to which the definition would not apply or yield counterintuitive results. In general, definitions of causality are non-operational because of infeasibility of conducting controlled experiments.

In dealing with an amorphic concept, X, what is possible--and what we generally do--is to restrict the domain of applicability of X to instances for which X is definable. For example, in the case of the concept of a summary, which is an amorphic concept, we could restrict the length, type, and other attributes of what we want to summarize. In this sense, an amorphic concept may be partially definable or, p-definable, for short. The concept of p-definability applies to all levels of the definability hierarchy.

The use of PNL as a definition language opens the door to definition of concepts of the form "approximate X," e.g., approximate linearity, approximate stability and, most importantly, approximate theorem. What we do not want to do is to use a BL-based language to define "approximate X," since such BL-based definitions may lead to counterintuitive conclusions.

The theory of hierarchical definability is not a theory in the traditional spirit. The definitions are informal and conclusions are not theorems. Nonetheless, it serves a significant purpose by raising significant questions about a basic issue--the issue of definability of concepts that lie at the center of scientific theories.

* Professor in the Graduate School and Director, Berkeley Initiative in Soft Computing (BISC), Computer Science Division and the Electronics Research Laboratory, Department of EECS, Univeristy of California, Berkeley, CA 94720-1776; Telephone: 510-642-4959; Fax: 510-642-1712; E-Mail: zadeh@cs.berkeley.edu. Research supported in part by ONR Contract N00014-99-C-0298, NASA Contract NCC2-1006, NASA Grant NAC2-117, ONR Grant N00014-96-1-0556, ONR Grant FDN0014991035, ARO Grant DAAH 04-961-0341, and the BISC Program of UC Berkeley.

Send mail to the author : (zadeh@cs.berkeley.edu)

## What Is Fuzzy Logic and What Are Its Applications?

Fuzzy logic has been—and to some extent still is—an object of controversy. Some are turned-off by its name. But, more importantly, fuzzy logic is tolerant of imprecision and partial truth. It is this tolerance that is in conflict with the deep-seated Cartesian tradition of aiming at truth which is bivalent, with no shades of gray allowed.

There are many misconceptions about fuzzy logic. In large measure, the misconceptions reflect the fact that the term “fuzzy logic” has two distinct interpretations. More specifically, in a narrow sense, fuzzy logic is the logic of approximate reasoning. But in a wider sense—which is in dominant use today—fuzzy logic, denoted as FL, is coextensive with the theory of fuzzy sets, and contains fuzzy logic in its narrow sense as one of its branches. In fact, most applications of FL involve modes of analysis which are computational rather than logical in nature.

Fuzzy logic, FL, has four principal facets. First, the logical facet, FL , which is fuzzy logic in its narrow sense. Second, the set-theoretic facet, FLs, which is concerned with classes having unsharp boundaries, that is, with fuzzy sets. Third, the relational facet, FLr, which is concerned with linguistic variables, fuzzy if-then rules and fuzzy relations. It is this facet that underlies almost all applications of fuzzy logic in control, decision analysis, industrial systems, and consumer products. And fourth, the epistemic facet, FLe, which is concerned with knowledge, meaning, and linguistics. One of the important branches of FLe is possibility theory.

A concept which has a position of centrality in FL is that of fuzzy granularity or, simply, f-granularity. F-granularity reflects the bounded ability of human sensory organs and, ultimately, the brain, to resolve detail and store information. In particular, human perceptions are, for the most part, f-granular in the sense that (a) the boundaries of perceived classes are fuzzy, and (b) the perceived attributes are granulated, with a granule being a clump of values drawn together by indistinguishability, similarity, proximity or functionality. In this perspective, the colors red, blue, green, etc., may be viewed as labels of granules of perception of color.

Precision carries a cost. This is the main reason why in most of its applications, the machinery of fuzzy logic is employed to exploit the tolerance for imprecision for achieving tractability, robustness and low solution cost. In fact, it is the tolerance for imprecision that underlies the remarkable human capability to perform a wide variety of physical and mental tasks, e.g., drive in city traffic, based solely on perceptions, without any measurements and any computations. It is this capability that motivated the development of fuzzy-logic-based computational theory of perceptions (CTP). Existing theories and, in particular, probability theory, do not have the capability to operate on perception-based information.

The computational theory of perceptions is a branch of the fuzzy-logic-based methodology of computing with words (CW). Development of the methodology of computing with words is an important event in the evolution of fuzzy logic. Eventually, it may lead to a radical enlargement of the role of natural languages in information processing, decision, and control.

* Professor in the Graduate School and Director, Berkeley initiative in Soft Computing (BISC), Computer Science Division and the Electronics Resear ch Laboratory, Department of EECS, University of California, Berkeley, CA 94720- 1776; Telephone: 510-642-4959; Fax: 510-642-1712; E-Mail: zadeh@cs.berkeley.edu.

Send mail to the author : (zadeh@cs.berkeley.edu)

Yutaka Hata1