This course presents an overview of the field of computational linguistics with an emphasis on its applications. Some major applications include information retrieval, question-answering systems, and machine translation; we will also touch on others.
Though no programming is required, an acquaintance with programming, algorithms, mathematics, statistical methods, or logic will be helpful at various points. Some familiarity with linguistics is assumed as well. Students will complete a small project, and there will be homework assignments, a midterm exam, and a final exam.
Each student will complete a small project drawing on the themes and techniques we will study in the course. The project might involve evaluating or comparing existing tools (for example, machine translation tools available on the web or document categorization systems), writing a research paper that extends beyond the material covered in class (for example, methods of probabilistic parsing or named entity extraction), or programming and implementing a small system.
I will meet with each of you midway through the course to discuss your project.
Towards the end of the course, you will give a presentation of your project to the class (probably about 15 to 20 minutes long).
| Week 1 | Wed., Aug. 28 | What is Computational Linguistics? What makes CL hard? | Winograd, ch. 1, and Manning and Schütze, ch. 1 |
| Week 2 | Mon., Sept. 2 | Labor Day Holiday (no class, but continue with reading) | Jurafsky and Martin, ch. 1; look at HyperStat Online (an introduction to probability), particularly the links in the Contents column |
| | Wed., Sept. 4 | Applications of CL; Symbolic and statistical approaches to CL; Some basics of probability and statistics | Jurafsky and Martin, ch. 2; first homework assignment |
| Week 3 | Mon., Sept. 9 | Regular Expressions and Finite-State Automata; Morphology and Finite-State Transducers | Jurafsky and Martin, ch. 3 |
| | Wed., Sept. 11 | Information Retrieval: Introduction | Manning and Schütze, ch. 15, sec. 1 & 2 (portion in reader); Jurafsky and Martin, ch. 17, sec. 3; interpolated homework assignment (Jurafsky and Martin, problems 2.4 and 2.8) |
| Week 4 | Mon., Sept. 16 | Information Retrieval: the vector space model; Search Engines by Tim Sibley "Precision Content Retrieval" | Ambroziak and Woods (1998), Flank (2000); first homework assignment due |
| | Wed., Sept. 18 | N-Grams and Collocations | Jurafsky and Martin, ch. 6, sec. 1 & 2 (pp. 191-206); Manning and Schütze, ch. 5, through section 5.2; second homework assignment; interpolated homework assignment due |
| Week 5 | Mon., Sept. 23 | Collocations (continued) | Manning and Schütze, ch. 5, section 5.3 to end |
| | Wed., Sept. 25 | Part of Speech Tagging | Jurafsky and Martin, ch. 8 |
| Week 6 | Mon., Sept. 30 | Context-free Grammars and Parsing | Jurafsky and Martin, ch. 10 (also look over ch. 9 if you aren't familiar with context-free grammars); second homework assignment due |
| | Wed., Oct. 2 | Question-Answering Systems | Cardie et al. (2000), Molla Aliod, Berri, and Hess (1998) |
| Week 7 | Mon., Oct. 7 | Lexical Ambiguity Resolution | Resnik and Yarowsky (1997); Pedersen (2001) |
| | Wed., Oct. 9 | More on Lexical Ambiguity Resolution; Midterm Review | (continued from previous lecture) |
| Week 8 | Mon., Oct. 14 | Columbus Day Holiday -- no class | review for midterm exam |
| | Wed., Oct. 16 | Midterm exam | review for midterm exam |
| Week 9 | Mon., Oct. 21 | Syntactic Ambiguity Resolution | Hindle and Rooth (1993); Whittemore, Ferrara, and Brunner (1990) |
| | Wed., Oct. 23 | Semantics and Interpretation: Lexical Semantics; Barrett, Davis, and Dorr PowerPoint slides | Barrett, Davis, and Dorr (2001), Richardson, Dolan, and Vanderwende (1998); third homework assignment |
| Week 10 | Mon., Oct. 28 | Referent Resolution | Jurafsky and Martin, ch. 18 |
| | Wed., Oct. 30 | Discourse and Dialog Systems | Jurafsky and Martin, ch. 19; third homework assignment due |
| Week 11 | Mon., Nov. 4 | Document Summarization | Hahn and Mani (2000), Buyukkokten, Garcia-Molina, and Paepcke (2001); project proposal due |
| | Wed., Nov. 6 | Generating Natural Language; Demo of the Nitrogen NLG system from ISI | Jurafsky and Martin, ch. 20 |
| Week 12 | Mon., Nov. 11 | Machine Translation; Bonnie Dorr's slides | Jurafsky and Martin, ch. 21; final homework assignment: exercises 18.11, 19.4, and 21.5 in Jurafsky and Martin |
| | Wed., Nov. 13 | Machine Translation (reprise); see MultiMeteo's weather forecasts in several languages | Coch and Chevreau (2001); the MultiMeteo website |
| Week 13 | Mon., Nov. 18 | Putting it all together: The Verbmobil project | Malouf and Riehemann's slides |
| | Wed., Nov. 20 | Multilingual Information Retrieval | Flank (2000b); work on projects |
| Week 14 | Mon., Nov. 25 | StreamSage demo | work on projects |
| | Wed., Nov. 27 | Class Presentations | work on projects; final homework assignment due |
| Week 15 | Mon., Dec. 2 | Class Presentations; Review | (these are optional) Multilingual Information Management: Current Trends and Future Abilities (the introduction, ch. 1 (you can skip sec. 2), ch. 2, and ch. 6, sec. 4 are probably most interesting); Nunberg (2000) |
| | Wed., Dec. 4 | Wrap-up and Review: The Future of CL | review for final exam; project write-ups due |