Head-Driven Phrase Structure Grammar

Precision Grammars for Human Languages

In isolation, linguistic expressions are often ambiguous, incomplete, or semantically indeterminate. But in context, these same expressions are used by speakers to encode vivid and unambiguous messages that hearers are routinely able to interpret in an incremental fashion (perhaps even syllable-by-syllable). Given the complexity of the grammatical systems that underlie communication and the myriad nonlinguistic factors that affect the resolution of ambiguous and uncertain utterances, it is quite puzzling that efficient and accurate communication is possible using human language.

Given the pervasive ambiguity and indeterminacy of human languages, how can it be that we use them with so little difficulty to convey precise messages when we talk to each other? Our contribution to the resolution of this puzzle is a psycholinguistically realistic architecture for the grammar of human languages. Our constraint-based, lexicalist grammars are designed to play a central role in highly integrative models of language processing and communication.

Instead of transformational derivations (the sequential manipulation of complete sentential structures commonly assumed in linguistic analysis), Head-Driven Phrase Structure Grammar (HPSG) is formulated in terms of order-independent constraints. These constraints provide partial grammatical information that can be flexibly consulted in a variety of language processing models based on the notion of incremental, on-line integration of heterogeneous types of information. Indeed, recent psycholinguistic evidence has confirmed this picture of human language processing (for recent surveys, see Tanenhaus and Trueswell 1995 and MacDonald et al. 1994) and points directly to the conclusion that psycholinguistically realistic grammars are systems of flexible constraints, rather than structural manipulations.
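The idea of order-independent constraints can be illustrated with a small sketch. It assumes, purely for illustration, that partial grammatical information is represented as nested Python dictionaries and combined by a simple unification function; the feature names (HEAD, POS, VFORM, SUBJ) and the helper names unify and satisfy are hypothetical and much simpler than the typed feature structures used in actual HPSG grammars.

```python
from itertools import permutations

def unify(a, b):
    """Unify two feature structures (nested dicts with atomic leaves).

    Returns the combined structure, or None if the two conflict.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:
                    return None  # conflicting information under this feature
                result[key] = merged
            else:
                result[key] = value
        return result
    # Atomic values (or an atom against a dict) unify only if identical.
    return a if a == b else None

def satisfy(constraints):
    """Fold a sequence of partial descriptions into one structure."""
    structure = {}
    for constraint in constraints:
        structure = unify(structure, constraint)
        if structure is None:
            return None
    return structure

# Three hypothetical pieces of partial information about one clause.
constraints = [
    {"HEAD": {"POS": "verb"}},
    {"HEAD": {"VFORM": "finite"}},
    {"SUBJ": {"HEAD": {"POS": "noun"}}},
]

# Every order of consulting the constraints yields the same structure.
results = [satisfy(order) for order in permutations(constraints)]
print(all(r == results[0] for r in results))
```

Because each constraint only adds compatible information, a processing model is free to consult them in whatever order the input makes them available, which is the property that incremental integration relies on.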

Also emerging from the recent psycholinguistic literature is the observation that human sentence processing has a powerful lexical basis. Put simply, words are information-rich; hence certain key words play a pivotal role in the processing of the clauses that contain them. This simple observation is central to HPSG theory, whose notion of phrase structure is built around the concept of a lexical head: a single word whose dictionary entry specifies information that determines crucial grammatical properties of the phrase it projects. This includes part of speech (P-O-S) information (nouns project noun phrases, verbs project sentences, etc.) and dependency relations (all English verbs require subjects, but verbs differ systematically as to whether they select direct object complements, clausal complements, and so forth). Lexical heads also encode key semantic information that is shared with their phrasal projections.
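A minimal sketch of how a lexical head can determine the properties of its phrase is given here. The Sign class, its fields, and the head_complement_phrase function are hypothetical names chosen for illustration, and the representation (strings for parts of speech, a tuple for selected complements) is a drastic simplification of an HPSG lexical entry; subjects and semantics are left out entirely.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sign:
    form: str           # the word or phrase as a string
    pos: str            # part of speech, projected from the head
    comps: tuple = ()   # parts of speech of complements still required

def head_complement_phrase(head, complements):
    """Combine a lexical head with the complements it selects.

    The phrase takes its part of speech from the head, and the head's
    complement requirements must be met exactly.
    """
    found = tuple(c.pos for c in complements)
    if found != head.comps:
        raise ValueError(f"'{head.form}' selects {head.comps}, but got {found}")
    form = " ".join([head.form, *(c.form for c in complements)])
    return Sign(form=form, pos=head.pos, comps=())

# Hypothetical lexical entries: both are verbs, but they differ in
# whether they select a direct object.
pizza = Sign("pizza", "noun")
devoured = Sign("devoured", "verb", comps=("noun",))
slept = Sign("slept", "verb")

vp = head_complement_phrase(devoured, [pizza])
print(vp)
```

In this sketch, calling head_complement_phrase(slept, [pizza]) raises a ValueError, since the entry for "slept" selects no complements; the difference between the two verbs lives entirely in their lexical entries rather than in the combination rule.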

The general theoretical background for much current descriptive and computational work in HPSG, both at CSLI and elsewhere, is presented in considerable detail in Pollard and Sag (1994) and Sag and Wasow (in press).

Although words in HPSG are information-rich, the detailed lexical entries of HPSG are concisely expressed within a multiple inheritance hierarchy. Such hierarchical lexicons allow cross-cutting generalizations about words to be expressed in a highly efficient and compact organization. Much of our project's research activity is at present dedicated to issues of morphological (word-internal) structure and lexical generalizations, e.g., in English (Flickinger 1987, Davis 1996, Bouma et al. 1998, Malouf 1998), German (Riehemann 1994), French (Kim and Sag 1995, Miller and Sag 1997, Abeillé et al. 1999, Abeillé et al. 1998), Korean (Kim 1995, Bratt 1996, Lee to appear, Sells to appear), Japanese (Manning, Sag, and Iida in press, Manning and Sag in press), and West Greenlandic (Malouf 1999).
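The compactness of a multiple inheritance lexicon can be sketched with ordinary Python classes. The class names and attributes below are hypothetical, and Python's attribute lookup is only a rough stand-in for the constraint inheritance used in HPSG type hierarchies; the point is simply that part of speech and complement selection are stated once each and then combined.

```python
class Word:
    """Root of a hypothetical lexical hierarchy."""
    complements = ()

    def __init__(self, form):
        self.form = form

# One dimension of classification: part of speech.
class Verb(Word):
    pos = "verb"
    needs_subject = True

class Preposition(Word):
    pos = "preposition"
    needs_subject = False

# A cross-cutting dimension: what the word selects as complements.
class Intransitive(Word):
    complements = ()

class Transitive(Word):
    complements = ("NP",)

# Leaf types inherit from both dimensions and add nothing of their own.
class IntransitiveVerb(Verb, Intransitive):
    pass

class TransitiveVerb(Verb, Transitive):
    pass

class TransitivePreposition(Preposition, Transitive):
    pass

sleep = IntransitiveVerb("sleep")
devour = TransitiveVerb("devour")
under = TransitivePreposition("under")

for word in (sleep, devour, under):
    print(word.form, word.pos, word.complements)
```

Each individual entry here states only its form and its type; everything else follows from the hierarchy, so a generalization such as "transitive words select one NP complement" is written in exactly one place.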

A new thread of HPSG research at Stanford, which has stimulated intense interaction with research at UC Berkeley, is developing the closely allied framework of Construction Grammar. Considerable research is now underway, in our project and in various ongoing collaborations, that explores the adaptation of multiple inheritance hierarchies to the grammar of phrasal constructions as well. Recent efforts in this vein include Sag (1997), Malouf (1998), Riehemann (1997), Sag et al. (forthcoming), Ginzburg and Sag (1998), and Abeillé et al. (ms).

In addition, our project has extensive interaction with the international HPSG community and is actively engaged in a wide range of linguistic investigations in diverse languages. A few of the theoretical issues currently being explored by our project are: the elimination of phonetically unrealized constituents from syntactic descriptions, the exploration of multiple inheritance construction hierarchies, the role of underspecified representations in semantic theory, and the use of default inheritance in grammatical descriptions.

Theoretical and analytic research in HPSG at CSLI and other sites has spawned international interest in language-processing technology incorporating HPSG grammars and lexicons. At present, HPSG-related system development is ongoing in numerous university and industrial settings in the U.S., Canada, Japan, Korea, Western Europe, and Australia. Perhaps the largest of these efforts, the Verbmobil Project initiated by the German Government in 1993, involves collaboration among researchers at over thirty institutions within Germany. In 1994, CSLI's Linguistic Grammars Online Project contracted to join this effort, creating a direct avenue for immediate applications of some of the HPSG Project's theoretical results.

Ivan Sag, Stanford Linguistics (Project Leader);
Ash Asudeh, Stanford Linguistics student;
Emily Bender, Stanford Linguistics student;
Brady Clark, Stanford Linguistics student;
Dan Flickinger, CSLI;
Andreas Kathol, UC Berkeley;
Rob Malouf, Research Associate;
Susanne Riehemann, Stanford Linguistics student;
Tom Wasow, Stanford Linguistics.
Last modified: 1998