Exercise 9, Grammar Development
Adding a Finite-State Morphological Analyzer
Preliminaries
In this exercise, you will practice integrating a morphological
analyzer into your grammar. Up to now, you have been working with a
full-form lexicon. This gives you full control over the
lexical entries within your grammar, but it is also very tedious, as
you have to write a separate lexical entry for each inflected (or
derived) form.
In this exercise we will work with a version of the finite-state
morphological analyzer that is part of the English ParGram grammar.
It is called english.infl.patch.full.fst and can be found in
the "prelex" folder of the English ParGram grammar. The English
ParGram grammar is available to you freely via the XLE license.
Alternatively, you can also build your own finite-state
morphological analyzer and hook it up, or use a different type of
analyzer altogether. Details as to the morphology-grammar interface
can be found in the Starter Notes and in the XLE Morphology Section.
You can use grammar9.lfg as a model. We
have already integrated verbs together in this file in class.
You will also need morph-lex.lfg, morph-rules.lfg and MCONFIG.lfg.
Extending the Grammar: Nouns and Adjectives
grammar9.lfg already interacts with the morphological analyzer with
respect to verbs.
Now expand the grammar so that nouns and adjectives also come
from the morphological analyzer.
Generally proceed in the following way:
- grammar9.lfg already has the morph-lex.lfg and morph-rules.lfg
files integrated under "FILES" in the configuration file and the
headings (MORPH ENGLISH) have been added in "LEXENTRIES" and
"RULES". This needs to be done to ensure that the new files are
indeed considered part of the grammar. Note also that all the
relevant files (morph-lex.lfg, morph-rules.lfg and MCONFIG.lfg) need
to be in the same directory as the grammar (or you can specify the
paths to where they are).
- In this exercise, we are working with a morphological analyzer that
is a "black box" for us. That is, we know what the input is, but we don't know the inner workings of the morphological
analyzer. In order to see what the output of the morphological
analyzer is from within XLE, type "morphemes some-word". For example:
% morphemes bananas
analyzing {bananas}
{bananas "+Token"|banana "+Noun" "+Pl"}
- If this works, it is a sign that the morphological analyzer is
part of the grammar.
- The "morphemes" command shows you what the output of the
morphological analyzer is. Use this knowledge to integrate the
relevant information into the grammar.
- Make sure you have an entry for all the tags that are
produced as output in morph-lex.lfg, i.e., "+Noun" and "+Pl" in the
example above. If not, add the missing ones in.
- Decide what functional information you want associated with any
given tag. Where possible, use existing templates from your
grammar.
- You can also decide to have no information associated with a
tag, for example: "+Verb V-POS XLE ."
- Now write sublexical rules that can parse all the tags in the
right order (morph-rules.lfg).
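The tag bookkeeping in the steps above can be sketched outside of XLE. The following is a minimal Python sketch, not part of XLE or the course materials: it assumes the output line has exactly the shape shown in the "morphemes bananas" example (alternatives separated by "|", tags in double quotes), and the function names and the set of already-covered tags are hypothetical. It lists the tags that still need an entry in morph-lex.lfg.

```python
import re


def parse_morphemes_line(line):
    """Split one analysis line into a list of (stem, [tags]) alternatives.

    Assumes the format shown in the exercise:
    {bananas "+Token"|banana "+Noun" "+Pl"}
    """
    body = line.strip()
    if body.startswith("{") and body.endswith("}"):
        body = body[1:-1]
    analyses = []
    for alternative in body.split("|"):
        # Tags are the double-quoted pieces; the stem is whatever precedes them.
        tags = re.findall(r'"([^"]+)"', alternative)
        stem = alternative.split('"', 1)[0].strip()
        analyses.append((stem, tags))
    return analyses


def missing_tags(analyses, known_tags):
    """Return tags that occur in the analyses but not in known_tags,
    in the order in which they are first seen."""
    missing = []
    for _stem, tags in analyses:
        for tag in tags:
            if tag not in known_tags and tag not in missing:
                missing.append(tag)
    return missing


# Output line copied from the "morphemes bananas" example above.
line = '{bananas "+Token"|banana "+Noun" "+Pl"}'
analyses = parse_morphemes_line(line)

# Hypothetical: the tags you believe already have entries in morph-lex.lfg.
known_tags = {"+Token", "+Verb"}

print(analyses)
print(missing_tags(analyses, known_tags))
```

Under these assumptions, the second print statement reports "+Noun" and "+Pl" as the tags that still need entries. The order of the tags within each alternative is also the order your sublexical rules have to accept.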
You should make sure that the following sentences work, with the nouns, adjectives and verbs
coming from the morphological analyzer:
- Sophie educated the bright robot.
- Lazy leopards sleep.
- The curious leprechaun stole beer in Ireland.
Note:
- For verbs you need to specify the head word (lemma) and the
relevant subcategorization information in the lexicon.
- For nouns and adjectives, this is not necessary, as the "unknown" guesser in
the morph-lex.lfg file guesses words not in the lexicon to be either
nouns or adjectives. So, unless you want to specify extra
information for a particular lemma, you do not need to have extra
entries for nouns and adjectives in your lexicon. Try deleting (or
commenting out) all the ones you have entered and see if your
testsuite still works.
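The division of labor described in the note can be pictured with a toy model. This Python sketch only mirrors the behavior as stated above (lemmas with a lexicon entry use that entry, all other lemmas are guessed to be nouns or adjectives). It is not how XLE implements lookup, and the function name, the category labels and the lexicon contents are hypothetical.

```python
def candidate_categories(lemma, lexicon, guessed=("N", "A")):
    """Return the categories to try for a lemma.

    Toy model of the note above: a lemma listed in the lexicon uses its
    own entry; any other lemma falls back to the guessed categories.
    """
    if lemma in lexicon:
        return tuple(lexicon[lemma])
    return tuple(guessed)


# Hypothetical lexicon: only verbs are listed, since they need
# subcategorization information.
lexicon = {"educate": ["V"], "sleep": ["V"], "steal": ["V"]}

for lemma in ["educate", "robot", "lazy"]:
    print(lemma, candidate_categories(lemma, lexicon))
```

In this model, "robot" and "lazy" need no entry of their own, which is why deleting your noun and adjective entries should leave the testsuite intact.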
Projects
Remember to keep thinking about and working on your projects.
Please submit your exercises and your testsuite to Maike Müller (Uni
Konstanz address) by 31.6.2014 at 10 am.
Relevant Reading Material
The Grammar Writer's Cookbook, Ch. 12
Kaplan, Ron, John T. Maxwell III, Tracy Holloway King and Richard
Crouch. 2004. Integrating Finite-state Technology with Deep LFG
Grammars. In Proceedings of the ESSLLI04 Workshop on Combining
Shallow and Deep Processing for NLP.
Starter Notes
XLE Morphology Section