Final Exam Preparation
- The final exam is scheduled for Thursday, Dec. 12, from 4:00 to 6:00.
- Like the midterm exam, this exam will emphasize applying the ideas you have studied
to hypothetical situations, rather than, say, explaining the details of one
researcher's approach to a problem or performing lots of calculations. That doesn't
mean you shouldn't review the various approaches to issues we've covered, and there will
certainly be some calculation involved, but I'm most interested in assessing how
you have synthesized the material and how you can apply it to problems.
- The exam is "closed book"; that is, no books or other study aids are permitted.
You may use a calculator, though I will design any problems requiring calculation
so that a calculator won't be necessary.
- Any mathematical formulas that you might need, apart from the simplest, will be on
the test, so you don't need to worry about memorizing them; concentrate instead
on understanding them.
- In general, anything in the readings or lectures throughout the course is possible material
for this exam, but the emphasis will definitely be on the material from October and
November. That said, it would probably be helpful to review some of the probability-based
methods we discussed in detail in the first few weeks, because these methods are applicable
to many of the areas we've covered since.
- The midterm exam preparation may be helpful for reviewing
material from the first part of the course.
- Here's a list of some topics from October and November lectures and readings that are
potential exam material (clearly, the exam will not involve all of them!). I cannot absolutely
guarantee that this is a comprehensive list; I've created it by looking through
the lecture notes and the readings. Note that near the end of each chapter of Jurafsky
and Martin, there is a summary of the main points of the chapter, which you may find helpful.
- Linguistic concepts
- Review linguistics terminology, if you're not comfortable with it.
This includes branches of linguistics,
parts of speech, grammatical relations/roles (subject, object, complement,
modifier), types of constituents (noun phrases, verb phrases, etc.),
morphological terms (inflection, derivation, stem, etc.), selectional
restriction, word sense, collocation, idiom, anaphor, referent, discourse structure
and relations, tense, and aspect.
- As you probably sense by now, ambiguity is of central importance in
computational linguistics. Be sure you understand why it is important. What are the types of
ambiguity? What are some methods for resolving various types of ambiguity?
- Concepts and methods in computational linguistics:
- Parsing
- What is parsing? How does it differ from part-of-speech tagging?
- Context free grammars and the notations for writing grammar rules
- How is a parser different from a grammar?
- Top-down and bottom-up parsing strategies; their respective advantages
and drawbacks; recursion
- Types of syntactic (structural) ambiguity and how they can cause problems
for parsers
- How can parser performance be evaluated?
- More on syntactic ambiguity resolution
- Prepositional phrase attachment ambiguity
- Symbolic and statistical approaches to resolving PP-attachment ambiguity
(and other types of structural ambiguity)
- Why is resolving this sort of ambiguity important (usually)? When does it
matter less if ambiguities are resolved?
- Semantic interpretation
- Semantic relations in WordNet and MindNet
- Compound nouns: why do they pose problems in interpretation and what
strategies are available for inferring the relationships between their component
words?
- Referent resolution: types of expressions that can refer (pronouns and
demonstratives, definite NPs, functionally dependent NPs); lexical, syntactic,
and semantic factors in referent resolution; and how these are incorporated
into computational strategies
- Tense and aspect; Reichenbach-style representation of tense
- Abduction, and how it can be used as a strategy for interpretation; what
difficulties are involved in using abductive methods extensively?
- Discourse structure, dialogue, and coherence relations
- How is discourse structured? What are some unique features of dialogue?
- What are the main kinds of coherence relations and how are they used in
modeling discourse structure?
- How are coherence relations signalled in discourse?
- What strategies are used for determining coherence relations and discourse structure
in computational systems?
- What are dialogue acts and how are they related to the structure of dialogues?
What's the difference between locutionary and illocutionary acts?
- Computational models of dialogue; dialogue manager; models of beliefs, desires,
and intentions
- Summarization and generation
- Extracts vs. abstracts; goals of summarization
- Determining what is important: use of cue phrases, location of sentences in
documents, and statistical salience of terms
- Potential problems in extraction: missing referents, lack of coherence
- What problems arise in natural language generation that are less important in
natural language understanding?
- What are the components of a typical NLG system?
- What is meant by lexical choice, content selection, discourse planning, and
surface realization, and where do these procedures take place? Why is it not a
simple issue to order them?
- Multilingual generation: how does it differ from machine translation, and when
is it an appropriate alternative?
- Machine translation
- How does machine translation involve both natural language understanding and
natural language generation?
- Three approaches or levels of MT (the "pyramid diagram"): direct, transfer,
and interlingual; comparative advantages and disadvantages of each
- How do differences between languages make MT difficult? Lexical mismatches,
morphological and grammatical differences
- Ambiguity in MT: when should it be resolved and when can it be safely
preserved?
- How is vagueness (lack of specificity) in one language often a problem in
translating to another?
- Statistical methods in MT: alignment of bilingual corpora; tradeoffs between
faithfulness and fluency
- What characteristics of a domain and a task determine whether an MT system
is likely to be useful?
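
As a review aid for the parsing and PP-attachment topics listed above, here is a minimal sketch of a top-down (recursive-descent) parser over a toy context-free grammar. The grammar, the lexicon, and the function names (`parse`, `expand`, `all_parses`) are illustrative assumptions, not taken from the lectures or from Jurafsky and Martin. The grammar is kept separate from the parsing procedure, and it avoids left-recursive rules such as NP -> NP PP, which would send a naive top-down parser into infinite recursion.

```python
# Toy context-free grammar: each nonterminal maps to its possible right-hand sides.
# (Hypothetical grammar for illustration only.)
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Pro"], ["Det", "N"], ["Det", "N", "PP"]],
    "VP": [["V", "NP"], ["V", "NP", "PP"]],
    "PP": [["P", "NP"]],
}

# Toy lexicon: each word has exactly one part of speech.
LEXICON = {
    "I": "Pro", "saw": "V", "the": "Det",
    "man": "N", "telescope": "N", "with": "P",
}


def parse(symbol, words, start):
    """Yield (tree, end) for every way `symbol` can cover words[start:end]."""
    if symbol not in GRAMMAR:
        # Part-of-speech category: match a single word against the lexicon.
        if start < len(words) and LEXICON.get(words[start]) == symbol:
            yield (symbol, words[start]), start + 1
        return
    # Top-down step: try every rule that rewrites this nonterminal.
    for rhs in GRAMMAR[symbol]:
        yield from expand(symbol, rhs, words, start, [])


def expand(symbol, rhs, words, pos, children):
    """Match the remaining right-hand-side symbols left to right."""
    if not rhs:
        yield (symbol, *children), pos
        return
    for child, nxt in parse(rhs[0], words, pos):
        yield from expand(symbol, rhs[1:], words, nxt, children + [child])


def all_parses(sentence):
    """Return every complete parse tree of the sentence as nested tuples."""
    words = sentence.split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]


if __name__ == "__main__":
    for tree in all_parses("I saw the man with the telescope"):
        print(tree)
```

Under these assumptions, "I saw the man with the telescope" receives two parses: one attaches the prepositional phrase to the verb phrase (the seeing was done with the telescope), and the other attaches it to the noun phrase (the man has the telescope). The parser enumerates both but has no basis for choosing between them, which is the gap that symbolic and statistical disambiguation methods are meant to fill.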