## Toward a Theory of Hierarchical Definability (THD)--Causality is Undefinable

(ARO) DAAH 04-961-0341, BISC Program of UC Berkeley, (NASA) NAC2-117, (NASA) NCC2-1006, (ONR) N00014-99-C-0298, (ONR) N00014-96-1-0556, and (ONR) FDN0014991035

Attempts to formulate mathematically precise definitions of basic concepts such as causality, randomness, and probability, have a long history. The concept of hierarchical definability that is outlined in this project suggests that such definitions may not exist. Furthermore, it suggests that existing definitions of many basic concepts, among them those of linearity stability, statistical independence, and Pareto-optimality, may be in need of reformulation.

In essence, definability is concerned with whether and how a concept, X, can be defined in a way that lends itself to mathematical analysis and computation. In mathematics, definability of mathematical concepts is taken for granted. But as we move farther into the age of machine intelligence and automated reasoning, the issue of definability is certain to grow in importance and visibility, raising basic questions that are not easy to resolve.

To be more specific, let X be the concept of, say, a summary, and assume that I am instructing a machine to generate a summary of a given article or a book. To execute my instruction, the machine must be provided with a definition of what is meant by a summary. It is somewhat paradoxical that we have summarization programs that can summarize, albeit in a narrowly prescribed sense, without being able to formulate a general definition of summarization. The same applies to the concepts of causality, randomness, and probability. Indeed, it may be argued that these and many other basic concepts cannot be defined within the conceptual framework of classical logic and set theory.

The point of departure in our approach to definability is the assumption that definability has a hierarchical structure. Furthermore, it is understood that a definition must be unambiguous, precise, operational, general, and co-extensive with the concept it defines.

In THD, a definition of X is represented as a quadruple (X, L, C, D), where L is the language in which X is defined; C is the context; and D is the definition. The context, C, delimits the class of instances to which D applies, that is, C defines the domain of D. For example, if X is the concept of volume, D may not apply to a croissant or a tree.

The language, L, is assumed to be a member of hierarchy represented as (NL, BL, F, F.G, PNL). In this hierarchy, NL is a natural language—a language which is in predominant use as a definition language in social sciences and law; and BL is the binary-logic-based mathematical language used in all existing scientific theories. Informally, a concept, X, is BL-definable if it is a crisp concept, e.g., a prime number, a linear system or a Gaussian distribution. F is a mathematical language based on fuzzy logic and F.G is an extension of F in which variables and relations are allowed to have granular structure. The last member of the hierarchy is PNL—Precisiated Natural Language. PNL is a maximally-expressive language which subsumes BL, F and F.G. In particular, the language of fuzzy if-then rules is a sublanguage of PNL.

X is a fuzzy concept if its denotation, D(X), is a fuzzy set in its universe of discourse. A fuzzy concept is associated with a membership function that assigns to each point, u, in the universe of discourse of X, the degree to which u is a member of D(X). Alternatively, a fuzzy concept may be defined algorithmically in terms of other fuzzy concepts. Examples of fuzzy concepts are: small number, strong evidence and similarity. It should be noted that many concepts associated with fuzzy sets are crisp concepts. An example is the concept of a convex fuzzy set. Most fuzzy concepts are context-dependent.

The next level in the hierarchy is that of F.G-definability, with G standing for granular, and F.G denoting the conjunction of fuzzy and granular. Informally, in the case of a concept which is F.G-granular, the values of attributes are granulated, with a granule being a clump of values that are drawn together by indistinguishability, similarity, proximity, or functionality. F.G-granularity reflects the bounded ability of the human mind to resolve detail and store information. An example of an F.G-granular concept that is traditionally defined as a crisp concept, is that of statistical independence. This is a case of misdefinition--a definition that is applied to instances for which the concept is not defined, e.g., fuzzy events. In particular, a common misdefinition is to treat a concept as if it were BL-definable, whereas in fact it is not. In THD, the concept of context serves to differentiate between "defined" and "definability."

The next level is that of PNL-definability. Basically, PNL consists of propositions drawn from a natural language that can be precisiated through translation into what is called precisiation language. An example of a proposition in PNL is: It is very unlikely that there will be a significant increase in the price of oil in the near future.

In the case of PNL, the precisiation language is the Generalized Constraint Language (GCL). A generic generalized constraint is represented as Z isr R, where Z is the constrained variable, R is the constraining relation and r is a discrete-valued indexing variable whose values define the ways in which R constrains Z. The principal types of constraints are: possibilistic (r = blank); veristic (r = v); probabilistic (r = p); random set (r = rs); usuality (r = u); fuzzy graph (r = fg); and Pawlak set (r = ps). The rationale for constructing a large variety of constraints is that conventional crisp constraints are incapable of representing the meaning of propositions expressed in a natural language--most of which are intrinsically imprecise--in a form that lends itself to computation.

The elements of GCL are composite generalized constraints that are formed from generic generalized constraints by combination, modification and qualification. An example of a generalized constraint in GCL is ((Z isp R) and (Z, Y) is S) is unlikely.

By construction, the Generalized Constraint Language is maximally expressive. What this implies is that PNL is the largest subset of a natural language that admits precisiation. Informally, this implication serves as a basis for the conclusion that if a concept, X, cannot be defined in terms of PNL, then, in effect, it is undefinable or, synonymously, amorphic.

In this perspective, the highest level of definability hierarchy, which is the level above PNL-definability, is that of undefinability or amorphicity. A canonical example of an amorphic concept is that of causality. More specifically, is it not possible to construct a general definition of causality such that given any two events A and B and the question, "Did A cause B?", the question could be answered based on the definition. Equivalently, given any definition of causality, it will always be possible to construct examples to which the definition would not apply or yield counterintuitive results. In general, definitions of causality are non-operational because of infeasibility of conducting controlled experiments.

In dealing with an amorphic concept, X, what is possible--and what we generally do--is to restrict the domain of applicability of X to instances for which X is definable. For example, in the case of the concept of a summary, which is an amorphic concept, we could restrict the length, type, and other attributes of what we want to summarize. In this sense, an amorphic concept may be partially definable or, p-definable, for short. The concept of p-definability applies to all levels of the definability hierarchy.

The use of PNL as a definition language opens the door to definition of concepts of the form "approximate X," e.g., approximate linearity, approximate stability and, most importantly, approximate theorem. What we do not want to do is to use a BL-based language to define "approximate X," since such BL-based definitions may lead to counterintuitive conclusions.

The theory of hierarchical definability is not a theory in the traditional spirit. The definitions are informal and conclusions are not theorems. Nonetheless, it serves a significant purpose by raising significant questions about a basic issue--the issue of definability of concepts that lie at the center of scientific theories.

* Professor in the Graduate School and Director, Berkeley Initiative in Soft Computing (BISC), Computer Science Division and the Electronics Research Laboratory, Department of EECS, Univeristy of California, Berkeley, CA 94720-1776; Telephone: 510-642-4959; Fax: 510-642-1712; E-Mail: zadeh@cs.berkeley.edu. Research supported in part by ONR Contract N00014-99-C-0298, NASA Contract NCC2-1006, NASA Grant NAC2-117, ONR Grant N00014-96-1-0556, ONR Grant FDN0014991035, ARO Grant DAAH 04-961-0341, and the BISC Program of UC Berkeley.