Electrical Engineering and Computer Sciences

COLLEGE OF ENGINEERING

UC Berkeley

An On-Line Computational Model of Human Sentence Interpretation: A Theory of the Representation and Use of Linguistic Knowledge

Daniel Jurafsky

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-92-676
March 1992

http://www.eecs.berkeley.edu/Pubs/TechRpts/1992/CSD-92-676.pdf

This dissertation presents a model of the human sentence interpretation process, which attempts to meet criteria of adequacy imposed by the different paradigms of sentence interpretation. These include the need to produce a high-level interpretation, to embed a linguistically motivated grammar, and to be compatible with psycholinguistic results on sentence processing.

The model includes a theory of grammar called Construction-Based Interpretative Grammar (CIG) and an interpreter which uses the grammar to build an interpretation for single sentences. An implementation of the interpreter, called Sal, has been built.

Sal is an on-line interpreter, reading words one at a time and updating a partial interpretation of the sentence after each constituent. This constituent-by-constituent interpretation is more fine-grained and hence more on-line than most previous models. Sal is strongly interactionist in using both bottom-up and top-down knowledge in an evidential manner to access a set of constructions to build interpretations. It uses a coherence-based selection mechanism to choose among these candidate interpretations, and allows temporary limited parallelism to handle local ambiguities. Sal's architecture is consistent with a large number of psycholinguistic results.
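The processing loop described above (update after each constituent, keep a few candidates alive, prune by coherence) can be illustrated with a minimal Python sketch. This is not Sal's implementation: the `Candidate` class, the numeric evidence scores, the additive coherence measure, and the `beam_width` parameter are all assumptions made for illustration, and the abstract does not say how coherence is actually computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical partial interpretation: constructions chosen so far plus a score."""
    constructions: tuple
    coherence: float


def step(candidates, options, beam_width):
    """Extend every live candidate with every option for the new constituent,
    then keep only the most coherent few (temporary, limited parallelism)."""
    extended = [
        Candidate(c.constructions + (name,), c.coherence + score)
        for c in candidates
        for name, score in options
    ]
    extended.sort(key=lambda c: c.coherence, reverse=True)
    return extended[:beam_width]


def interpret(constituents, evidence, beam_width=2):
    """Update the candidate set after each constituent (on-line processing).

    `evidence` maps a constituent to (construction name, score) pairs; the
    scores stand in for combined bottom-up and top-down evidence (assumed).
    """
    candidates = [Candidate((), 0.0)]
    for constituent in constituents:
        candidates = step(candidates, evidence[constituent], beam_width)
    return candidates


# Toy evidence table; the names and numbers are invented for this example.
EVIDENCE = {
    "the": [("Determiner", 1.0)],
    "bank": [("Noun:river-bank", 0.4), ("Noun:financial", 0.6)],
}
```

Calling `interpret(["the", "bank"], EVIDENCE, beam_width=2)` keeps both readings of "bank" alive as a local ambiguity, while `beam_width=1` commits to the higher-scoring one immediately.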

The interpreter embodies a number of strong claims about sentence processing. One claim is uniformity, with respect to both representation and process. In the grammar, a single kind of knowledge structure, the grammatical construction, is used to represent lexical, syntactic, idiomatic, and semantic knowledge. CIG thus does not distinguish between the lexicon, the idiom dictionary, the syntactic rule base, and the semantic rule base. Uniformity in processing means that there is no distinction between the lexical analyzer, the parser, and the semantic interpreter. Because these kinds of knowledge are represented uniformly, they can be accessed, integrated, and disambiguated by a single mechanism.
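The uniformity claim can be pictured as a single record type covering lexical, idiomatic, and syntactic knowledge, queried by one access function. The Python sketch below is a hedged illustration only: the `Construction` fields, the sample entries, and the `access` function are hypothetical and are not taken from CIG's actual notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construction:
    """One knowledge structure for every kind of linguistic knowledge (assumed shape)."""
    name: str
    constituents: tuple  # form side: words or names of other constructions
    meaning: str         # semantic side, reduced to a label for this sketch


# A single store: no separate lexicon, idiom dictionary, or rule base.
GRAMMAR = [
    Construction("Bucket", ("bucket",), "container"),                   # lexical
    Construction("KickTheBucket", ("kick", "the", "bucket"), "die"),    # idiomatic
    Construction("Determination", ("Determiner", "Noun"), "referent"),  # syntactic
]


def access(trigger, grammar=GRAMMAR):
    """Single access mechanism: every construction whose form mentions the trigger."""
    return [c for c in grammar if trigger in c.constituents]
```

Here `access("bucket")` returns both the lexical entry and the idiom through the same lookup, which is the point of representing them uniformly.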

A second claim the interpreter embodies is that sentence processing is fundamentally knowledge-intensive and expectation-based. The representation and integration of constructions use many diverse types of linguistic knowledge. Similarly, the access of constructions is sensitive to top-down and bottom-up, syntactic and semantic knowledge, and the selection of constructions is based on coherence with grammatical knowledge and the interpretation.

Advisor: Robert Wilensky


BibTeX citation:

@phdthesis{Jurafsky:CSD-92-676,
    Author = {Jurafsky, Daniel},
    Title = {An On-Line Computational Model of Human Sentence Interpretation: A Theory of the Representation and Use of Linguistic Knowledge},
    School = {EECS Department, University of California, Berkeley},
    Year = {1992},
    Month = {Mar},
    URL = {http://www.eecs.berkeley.edu/Pubs/TechRpts/1992/6131.html},
    Number = {UCB/CSD-92-676},
    Abstract = {This dissertation presents a model of the human sentence interpretation process, which attempts to meet criteria of adequacy imposed by the different paradigms of sentence interpretation. These include the need to produce a high-level interpretation, to embed a linguistically motivated grammar, and to be compatible with psycholinguistic results on sentence processing.

    The model includes a theory of grammar called Construction-Based Interpretative Grammar (CIG) and an interpreter which uses the grammar to build an interpretation for single sentences. An implementation of the interpreter, called Sal, has been built.

    Sal is an on-line interpreter, reading words one at a time and updating a partial interpretation of the sentence after each constituent. This constituent-by-constituent interpretation is more fine-grained and hence more on-line than most previous models. Sal is strongly interactionist in using both bottom-up and top-down knowledge in an evidential manner to access a set of constructions to build interpretations. It uses a coherence-based selection mechanism to choose among these candidate interpretations, and allows temporary limited parallelism to handle local ambiguities. Sal's architecture is consistent with a large number of psycholinguistic results.

    The interpreter embodies a number of strong claims about sentence processing. One claim is uniformity, with respect to both representation and process. In the grammar, a single kind of knowledge structure, the grammatical construction, is used to represent lexical, syntactic, idiomatic, and semantic knowledge. CIG thus does not distinguish between the lexicon, the idiom dictionary, the syntactic rule base, and the semantic rule base. Uniformity in processing means that there is no distinction between the lexical analyzer, the parser, and the semantic interpreter. Because these kinds of knowledge are represented uniformly, they can be accessed, integrated, and disambiguated by a single mechanism.

    A second claim the interpreter embodies is that sentence processing is fundamentally knowledge-intensive and expectation-based. The representation and integration of constructions use many diverse types of linguistic knowledge. Similarly, the access of constructions is sensitive to top-down and bottom-up, syntactic and semantic knowledge, and the selection of constructions is based on coherence with grammatical knowledge and the interpretation.}
}

EndNote citation:

%0 Thesis
%A Jurafsky, Daniel
%T An On-Line Computational Model of Human Sentence Interpretation: A Theory of the Representation and Use of Linguistic Knowledge
%I EECS Department, University of California, Berkeley
%D 1992
%@ UCB/CSD-92-676
%U http://www.eecs.berkeley.edu/Pubs/TechRpts/1992/6131.html
%F Jurafsky:CSD-92-676