An Experiment in Enhancing Information Access by Natural Language Processing

Isaac Cheng and Robert Wilensky

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-97-963
July 1997

http://www2.eecs.berkeley.edu/Pubs/TechRpts/1997/CSD-97-963.pdf

We explore the hypothesis that lexical disambiguation could be applied to provide useful information access services. Specifically, we refined a lexical disambiguation method, and used it in a fully automatic categorization algorithm we developed. We also used this method more directly, to implement a service that retrieves documents by word sense.

To test these algorithms, we developed an experimental system, IAGO!, in which we applied these algorithms to accessing the World Wide Web. IAGO! comprises both a Web directory (i.e., a classification of articles by topic) and a Web search service. Unlike most other Web directories, IAGO!'s directory was generated by a fully automatic process. One experiment shows a cataloging accuracy of 97%.

To improve net searching, IAGO! enables users to refine their queries by first detecting lexical ambiguities, and then allowing users to select specific word senses by which to search. IAGO! returns only Web pages in which a given keyword occurs in the specified sense. To help evaluate these results, we derive some performance thresholds that a disambiguation algorithm needs to operate within in order to be useful for retrieval. Our experimental results suggest that the implemented algorithm is performing well above these needs.
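
The sense-based search described above can be illustrated with a minimal sketch. It is not IAGO!'s implementation, which this abstract does not detail; it assumes a disambiguator has already labeled every keyword occurrence with a sense, and the data, sense labels, and function names (`build_sense_index`, `senses_of`, `search_by_sense`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy corpus: each page is a list of (word, sense) pairs,
# as if a lexical disambiguator had already tagged every occurrence.
DOCS = {
    "page1": [("bank", "financial_institution"), ("loan", "debt")],
    "page2": [("bank", "river_edge"), ("river", "stream")],
    "page3": [("bank", "financial_institution"), ("bank", "river_edge")],
}


def build_sense_index(docs):
    """Map word -> sense -> set of page ids containing that word in that sense."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, tagged_words in docs.items():
        for word, sense in tagged_words:
            index[word][sense].add(doc_id)
    return index


def senses_of(index, word):
    """List the senses seen for a keyword, so a user can pick one (ambiguity detection)."""
    return sorted(index.get(word, {}))


def search_by_sense(index, word, sense):
    """Return only the pages in which the keyword occurs in the chosen sense."""
    return sorted(index.get(word, {}).get(sense, set()))


INDEX = build_sense_index(DOCS)
print(senses_of(INDEX, "bank"))
print(search_by_sense(INDEX, "bank", "river_edge"))
```

In this sketch a keyword counts as ambiguous when `senses_of` returns more than one sense; the user's choice is then passed to `search_by_sense`, which filters out pages that use the keyword only in other senses.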


BibTeX citation:

@techreport{Cheng:CSD-97-963,
    Author = {Cheng, Isaac and Wilensky, Robert},
    Title = {An Experiment in Enhancing Information Access by Natural Language Processing},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {1997},
    Month = {Jul},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1997/5489.html},
    Number = {UCB/CSD-97-963},
    Abstract = {We explore the hypothesis that lexical disambiguation could be applied to provide useful information access services. Specifically, we refined a lexical disambiguation method, and used it in a fully automatic categorization algorithm we developed. We also used this method more directly, to implement a service that retrieves documents by word sense. <p>To test these algorithms, we developed an experimental system, IAGO!, in which we applied these algorithms to accessing the World Wide Web. IAGO! comprises both a Web directory (i.e., a classification of articles by topic) and a Web search service. Unlike most other Web directories, IAGO!'s directory was generated by a fully automatic process. One experiment shows a cataloging accuracy of 97%. <p>To improve net searching, IAGO! enables users to refine their queries by first detecting lexical ambiguities, and then allowing users to select specific word senses by which to search. IAGO! returns only Web pages in which a given keyword occurs in the specified sense. To help evaluate these results, we derive some performance thresholds that a disambiguation algorithm needs to operate within in order to be useful for retrieval. Our experimental results suggest that the implemented algorithm is performing well above these needs.}
}

EndNote citation:

%0 Report
%A Cheng, Isaac
%A Wilensky, Robert
%T An Experiment in Enhancing Information Access by Natural Language Processing
%I EECS Department, University of California, Berkeley
%D 1997
%@ UCB/CSD-97-963
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1997/5489.html
%F Cheng:CSD-97-963