LSA.111 HPSG: Optional Lab

Lab Instructions


Preliminary notes

This material is usually taught in a hands-on lab setting with the instructor present. Since I'm on the other side of the country, I will be present "virtually" through the bulletin board. I strongly encourage you to post questions there frequently, and to read the answers to all the questions posted. Grammar engineering can be really cool, but to get to the cool part, it's better not to spend hours chasing down a syntax error. My rule of thumb is that if you've pondered something for 10 minutes, it's time to post.

Note that the implemented system is different from the textbook grammar in many ways. The underlying approach is the same, but do not be surprised by differences in feature geometry, etc. When in doubt, ask! (Some of these differences come from the exigencies of implementation, some from the history of the resource we're working with, and some from simplifications made for this lab. An example of the last kind is the fact that we're not positing any lexical rules in the lab, but rather just creating apparently unrelated entries in many cases.)

The LKB/Grammar Engineering FAQ might also prove useful. I recommend reading the guide to TDL syntax. (TDL stands for "type description language", and it's what LKB grammars are written in.)

This lab will step you through getting a starter-grammar from the Matrix web site, and then adding case and/or agreement. If your language has neither (overt) case nor agreement, please contact me (ideally on the bulletin board, or by email: ebender at u dot washington dot edu) and we can work out something else to add.


Preparation

The relevant software has been loaded onto the PCs in the computer classroom at MIT. The course page has instructions on how to install it on your own computer (Windows, Mac OS X, Linux).

The lab preparation instructions detail the information you need to collect about the language you choose to work with in order to complete the lab.

If you don't already know how to use emacs, I strongly encourage you to spend an hour working with the emacs tutorial. Run emacs, then select the tutorial from the help menu.


Download your starter package

As discussed in class, the Matrix contains a language-independent core as well as a collection of modules which allow you to customize it for certain properties of your language. Visit the Matrix configuration page to create and download a customized version of the Matrix.


Start up the LKB, and start parsing


Add some vocabulary


Parse again


Try the batch parse facility


Add case to your grammar

In what follows, when you add or modify a type, the default location is in your language-specific types file, which I will call esperanto.tdl. If you need to modify another file, the directions will say so explicitly.

If your language marks case via distinct forms of nouns

If your language marks case via different determiners

If your language marks case via different adpositions

All languages with case

Test your grammar


Add agreement to your grammar

Case agreement was described in the previous section. In this section, we'll focus on person/number/gender agreement. (If your language has still other kinds of agreement, contact me.) Unlike in the textbook, we'll be treating agreement as selectional restrictions (i.e., we won't be using the feature AGR on verbs). Furthermore, we'll be "housing" the agreement features on the value of INDEX, inside CONT (the equivalent of SEM).
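
As a rough sketch of what this geometry could look like in TDL, the agreement features might be declared along the following lines. Only PNG, PER, NUM, GEND, third, sg, and fem come from the examples in this section; the supertype names person, number, and gender, the remaining values, and the use of :+ to add features to an existing png type are assumptions for illustration. Check what your starter grammar already defines before adding anything like this.

```tdl
; Hypothetical sketch, not part of the starter grammar as distributed.
; Assumes a type png already exists as the value of INDEX.PNG;
; :+ adds constraints to a type that is defined elsewhere.
png :+ [ PER person,
         NUM number,
         GEND gender ].

; Illustrative value hierarchies; keep only what your language needs.
person := *top*.
first := person.
second := person.
third := person.

number := *top*.
sg := number.
pl := number.

gender := *top*.
fem := gender.
masc := gender.
```

Because the features live on the INDEX, a determiner or verb can constrain them on the item it selects for, which is what the selectional-restriction approach in the rest of this section relies on.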

Determiner-noun agreement

  • Create subtypes of determiner-lex which constrain the appropriate person/number/gender values inside their SPEC feature. For example, French la, the feminine singular determiner, says:
    [ SYNSEM.LOCAL.CAT.VAL.SPEC < [ LOCAL.CONT.HOOK.INDEX.PNG [ NUM sg,
                                                                GEND fem ]] > ].
    
  • If you already created determiner subtypes for case, cross-classify these with the png subtypes:
    fem-sg-nom-det-lex := fem-sg-det-lex & nom-det-lex.
    
  • Modify your entries for determiners in lexicon.tdl to inherit from your new types.
  • Reload your grammar and correct any syntax errors that are noted in the LKB window.
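
Putting these steps together for the French example, a complete type definition and a matching lexical entry might look roughly like this. The type name fem-sg-det-lex is taken from the cross-classification example; the shape of the lexical entry (STEM with a list of strings) is an assumption based on typical starter-grammar entries, so follow whatever your own lexicon.tdl already uses.

```tdl
; In the language-specific types file (e.g. esperanto.tdl):
; the constraint for French "la" wrapped in a named subtype of determiner-lex.
fem-sg-det-lex := determiner-lex &
  [ SYNSEM.LOCAL.CAT.VAL.SPEC < [ LOCAL.CONT.HOOK.INDEX.PNG [ NUM sg,
                                                              GEND fem ] ] > ].

; In lexicon.tdl: only the supertype of the entry changes.
; Any other constraints already on your entry (assumed here to exist,
; e.g. a predicate name) stay as they are and are omitted from this sketch.
la := fem-sg-det-lex &
  [ STEM < "la" > ].
```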

Verb-subject or verb-object agreement

Verbs agreeing with their subjects and/or objects constrain the PNG values of the items on their SUBJ and COMPS lists.

  • Create subtypes of verb-lex or transitive-verb-lex and intransitive-verb-lex which state the appropriate constraints. For example, an English grammar might have:
    3sg-verb-lex := verb-lex &
       [ SYNSEM.LOCAL.CAT.VAL.SUBJ < [ LOCAL.CONT.HOOK.INDEX.PNG [ PER third,
                                                                   NUM sg ]] > ].
    
    And this would be cross-classified with the transitive/intransitive distinction:
    3sg-trans-verb-lex := 3sg-verb-lex & transitive-verb-lex.
    
    (Note: The cross-classification gets a little awkward if you have both subject and object agreement, suggesting that lexical rules really are the way to go with this kind of phenomenon.)
  • Modify your verb entries in lexicon.tdl to instantiate your new agreement types. (A verb that is underspecified for agreement, like English slept, can still instantiate a supertype, such as intransitive-verb-lex.)
  • Reload your grammar and correct any syntax errors that are noted in the LKB window.
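
The verb steps can be sketched in the same way. The names 3sg-intrans-verb-lex and 3sg-obj-trans-verb-lex are hypothetical, chosen to parallel 3sg-trans-verb-lex, and the lexical entries assume the STEM convention of a typical starter-grammar lexicon; adapt both to your own grammar.

```tdl
; Cross-classify the agreement type with the intransitive type,
; parallel to 3sg-trans-verb-lex (type name is illustrative).
3sg-intrans-verb-lex := 3sg-verb-lex & intransitive-verb-lex.

; Hypothetical object-agreement type: the same kind of constraint,
; stated on the single item of the COMPS list instead of SUBJ.
3sg-obj-trans-verb-lex := transitive-verb-lex &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS < [ LOCAL.CONT.HOOK.INDEX.PNG [ PER third,
                                                               NUM sg ] ] > ].

; In lexicon.tdl: an agreeing form instantiates the agreement type.
; Other constraints from your existing entries are omitted from this sketch.
sleeps := 3sg-intrans-verb-lex &
  [ STEM < "sleeps" > ].

; A form underspecified for agreement instantiates the supertype directly.
slept := intransitive-verb-lex &
  [ STEM < "slept" > ].
```

With both subject and object agreement, every subject type would need to be crossed with every object type, which is the awkwardness the note above mentions.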

Test your grammar


Build out your lexicon

You are now in a position to build out your lexicon to full coverage of your test suite.


Test your grammar


Write up your results

This lab is not graded, but I would be very interested to see your results. In addition, you may find that writing things up now is useful for your own future reference. If you are so inclined, please write up the following information, and submit it to me (ebender at u dot washington dot edu) along with your grammar and test suite:

The Matrix grows by being challenged by new languages. I'll endeavor to send feedback on any lab write-ups I receive.

