Midterm Preparation
-
The midterm is scheduled for Wednesday, Oct. 16 and will take up most of the 75 minutes
of class time.
-
The midterm is "closed book"; that is, no books or other study aids are permitted.
You may use a calculator, though I will design any problems requiring calculation
so that a calculator won't be necessary.
-
Any mathematical formulas that you might need, apart from the simplest, will be on
the test, so you don't need to worry about memorizing them; concentrate instead
on understanding them.
-
In general, anything in the readings or lectures through Oct. 9 is possible material
for this exam. But you can be assured that issues pertaining to the foundations
of probability theory or other formalisms that aren't directly applicable to the
computational linguistic methods we have covered will not appear on the exam.
Also, the readings at the beginning of the course that discuss the history and
philosophy of the field will figure scantily, if at all.
-
Here's a list of some topics that are potential exam material. I cannot absolutely
guarantee that this is a comprehensive list; I've created it by looking through
the lecture notes and the readings.
- Linguistic concepts:
- You should know, as useful background information, what some of the
various subfields of linguistics are: phonetics, phonology, morphology,
syntax, semantics, and pragmatics.
- Likewise, you should feel comfortable with terminology such as the
parts of speech, grammatical relations (subject, object, complement,
modifier), types of constituents (noun phrases, verb phrases, etc.),
morphological terms (inflection, derivation, stem, etc.), selectional
restriction, word sense, collocation, and idiom.
- What is ambiguity? Why is it important in CL? What are the types of
ambiguity? What are some methods for resolving various types of ambiguity?
- Mathematical and computational concepts:
- probability
- independent events
- conditional probability
- Bayes' rule
- logarithms (you don't need to calculate them but you should understand their
properties)
- vectors
- (abstract) space
- magnitude and direction
- components
- dot product of two vectors: definition and calculation
- angle between two vectors, cosine
- algorithm (you should have some intuitive understanding of what
an algorithm is)
- Concepts and methods in computational linguistics:
- symbolic and statistical methods in CL
- natural language understanding and natural language generation
- lexicons, and the kinds of information stored in them
- finite-state transducers
- stemming
- term expansion
- taxonomy (ontology)
- methods for finding collocations
- part-of-speech (POS) tagging, parsing
- ambiguity resolution (disambiguation)
- information retrieval
- vector space representation of documents and queries
- term frequency and inverse document frequency
- precision and recall
- question-answering systems
- metadata, and the importance of various kinds of metadata
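
As a self-check on the vector and information retrieval topics listed above, here is a minimal Python sketch of the dot product, vector magnitude, cosine of the angle between two vectors, and precision and recall. It is only an illustration, not course material: the function names, the toy term-count vectors, and the document IDs are all made up for the example.

```python
import math

def dot(u, v):
    # Dot product: sum of the products of corresponding components.
    return sum(a * b for a, b in zip(u, v))

def magnitude(v):
    # Length of a vector: square root of its dot product with itself.
    return math.sqrt(dot(v, v))

def cosine(u, v):
    # Cosine of the angle between u and v (both must be non-zero vectors).
    return dot(u, v) / (magnitude(u) * magnitude(v))

def precision_recall(retrieved, relevant):
    # precision = relevant items retrieved / all items retrieved
    # recall    = relevant items retrieved / all relevant items
    hits = len(retrieved & relevant)
    return hits / len(retrieved), hits / len(relevant)

# Hypothetical term-count vectors for a document and a query.
doc = [3, 4, 0]
query = [4, 3, 0]
similarity = cosine(doc, query)  # 24 / (5 * 5) = 0.96

# Hypothetical document IDs returned by a system vs. truly relevant ones.
precision, recall = precision_recall({1, 2, 3, 4}, {2, 4, 5})  # 2/4 and 2/3
```

Working the same numbers by hand (dot product 24, both magnitudes 5) is good practice for problems of this kind, since no calculator is needed.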