distinguishing lexical entries


Gosse Bouma (gosse@let.rug.nl)
Wed, 29 Jan 1997 13:50:39 +0100 (MET)


Bob Levine writes:

> So apparently the elimination of traces in favour of a
> valence-reduction mechanism has brought us to the point where what
> would once upon a time have been a set of separate lexical entries now
> consists of a single ur-entry that can be fleshed out in accordance
> with a potentially infinite set of disjunctive constraints. But once
> you start going down that route, what *principled* reason do you have
> for ever distinguishing between two lexical entries, ever?

1. I don't think it is so strange to define the lexicon using constraints, and using `ur-entries' or whatever. In fact, we have been doing this all along. A consequence of this methodology is that the question of what constitutes a single lexical entry may be hard to decide.

2. This does not mean there can be no linguistic criteria for deciding for or against a certain method of defining lexical entries.

ad 1. HPSG is a constraint-based theory of grammar. We define rules, rule schemata, principles, constraints, sort hierarchies, and whatever else we think is necessary, which in the end DENOTE a certain set of objects. The lexicon consists of a hierarchy of (lexical) sorts, lexical rules, and (recursive) constraints, which together denote the set of objects which we believe correspond in some way to the set of words or lexical signs in a given language.

Since constraints only partially describe an object, it may be hard to determine the cardinality of the set of objects which is usually referred to as the lexicon. Of course, one can try to count the number of maximally specific objects which satisfy the constraints, but I doubt whether that is of much linguistic significance.

Alternatively, one could try to distinguish lexical entries at the description level. I.e., instead of counting objects, one counts the number of disjuncts in a formula such as

    Word = W1 or W2 or W3 or ... or Wn

or the number of leaves in a lexical inheritance hierarchy.
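The two ways of counting can be made concrete with a small sketch. It assumes a toy, finite feature space; the feature names and values, and the helpers `satisfiers` and `denotation`, are invented for illustration and are not taken from any HPSG grammar or implementation. Each disjunct is a partial description, and its denotation is the set of fully specified objects compatible with it.

```python
from itertools import product

# Hypothetical toy feature space: a fully specified object fixes all three features.
FEATURES = {
    "phon": ["kiss", "hit"],
    "comps": ["<np>", "<>"],
    "aux": ["+", "-"],
}

def satisfiers(description):
    """Return all fully specified objects compatible with a partial description."""
    names = list(FEATURES)
    objects = (dict(zip(names, values)) for values in product(*FEATURES.values()))
    return [obj for obj in objects
            if all(obj[feat] == val for feat, val in description.items())]

def denotation(disjuncts):
    """Union of the satisfiers of every disjunct, as a set of hashable objects."""
    return {tuple(sorted(obj.items())) for d in disjuncts for obj in satisfiers(d)}

# Word = W1 or W2: two disjuncts, each leaving "aux" unspecified.
lexicon = [
    {"phon": "kiss", "comps": "<np>"},
    {"phon": "hit", "comps": "<np>"},
]

print(len(lexicon))              # description-level count: 2
print(len(denotation(lexicon)))  # object-level count: 4
```

Because each disjunct leaves one feature open, the object-level count is larger than the description-level count; neither number is privileged by the formalism itself.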
But counting at the description level seems even less linguistically relevant. Using disjunction, for instance, I can in principle combine the disjuncts W1 and W2 into a single W12 = (W1 or W2). Or in an inheritance hierarchy, I might have a sort for transitive verbs, which has kiss and hit as subsorts, subcases, or whatever:

         TV
        /  \  ....
     kiss  hit

If I redefine this as:

    | phon  #1     |
    | comps < np > |  <-  ( #1 = kiss and #2 = kiss') or
    | subj  < np > |      ( #1 = hit  and #2 = hit')  or ...
    | cont  #2     |

does that mean I have only one lexical entry for TVs left? I don't think that it makes much sense to try to answer such questions.

Note also that the introduction of recursive constraints is not essential here. As soon as entries can be specified using disjunction or inheritance, or even rules, entries will be related, and the question of what constitutes a separate entry may be difficult to answer.

ad 2. This does not mean anything goes, of course. The whole idea of using inheritance, disjunction, constraints, rules, etc., is that they are useful for capturing generalizations. This is the business of linguistics, and with that purpose in mind, one can still argue for or against certain analyses.

For instance, in defining the lexicon using inheritance, one usually tries to capture subsets of the set of lexical entries which have something in common. There are many ways to carve up the lexicon into groups, classes, sorts, or whatever, but some ways of doing it will lead to more succinct descriptions than others.

Another type of argument could come from unlike constituent coordination. As Gertjan van Noord pointed out to me (the point seems to have been observed first in a paper by Richard Cooper (EACL91)), if you define copular `be' as two separate lexical entries (one selecting an NP and one selecting an AP), it might become very hard to account for sentences such as `he is a republican and proud of it', etc.
If `be' selects for an (NP or AP) complement, or a [+Pred] complement, we have the beginning of an explanation. [But perhaps not even this counts as an argument. In some versions of CG, one can probably prove that (VP/NP and VP/AP) is equivalent to VP/(NP or AP), and thus that the whole distinction is artificial to begin with.]

Here are a few other remarks loosely related to the points above.

3. The discussion so far has focussed on lexical rules/constraints for extraction and adverb selection. There is by now, however, a considerable amount of literature in HPSG on complement inheritance verbs (for German, French, Japanese, Dutch, etc.). For instance, the comps list of German auxiliaries and modals could be defined as:

    | comps  #1 (+) < v[comps #1] > |

These verbs do not have a comps list of a fixed size. Does this analysis therefore imply that there are an infinite number of different lexical entries for `haben', `wollen', etc. in German? (Note also that it is more or less accidental that these entries are defined using a constraint. In CG, for instance, the same effect is achieved using a (lexical) rule of division.)

4. The situation in the lexicon is not that much different from what most of us seem to see as perfectly reasonable for syntax. In GPSG, many different rules were used to describe all expansions of VP. In HPSG one subsumes all these different rules under one `rule schema' (`ur-rule'), which accounts for a few other constructions as well. And of course, there is no reason to stop where HPSG stops. Given a sufficiently general statement of the valency principle, and given one or two additional parameters in the rule schema, all head-complement/subject/specifier rules could be reduced to a single rule. In fact, I have seen HPSG grammars using only a single rule for head-valence as well as head-filler structures ....
Or, given a slightly different organization of the signs, one might reduce everything to functor-argument cancellation (and start doing CG). Again, one should not worry about the fact that this makes it hard to distinguish rules, but about whether it leads to interesting generalizations.

Gosse.

--
Gosse Bouma, Alfa-informatica, RUG, Postbus 716, 9700 AS Groningen
gosse@let.rug.nl    tel. +31-50-3635937    fax +31-50-3636855



This archive was generated by hypermail 2.0b3 on Fri Dec 18 1998 - 20:36:03 PST