Linguistics 362
Intro to NLP
Autumn 2005

Course goals

This course will introduce students to the basics of natural language processing (NLP), a field that combines insights from linguistics and computer science to produce applications such as machine translation, information retrieval, and spell checking.

We will cover a range of topics that will help students understand how current NLP technology works and will provide students with a platform for future study and research. We will move from simple representations of language, such as finite-state techniques and n-gram analysis, to more advanced representations, such as those found in context-free and unification-based parsing.

Students who take this course will gain a thorough understanding of the fundamental methods used in natural language understanding, along with an ability to assess the strengths and weaknesses of natural language technologies based on these methods.

Instructor:

Markus Dickinson

Office:

Intercultural Center (ICC) 452

Phone:

(202) 687-5753

E-mail:

mad (followed by) 87 AT georgetown DOT edu

Office hours:

(at least for the first week)

M 2:00-3:00pm
R 10:30-11:30am
  or by appointment


Meeting time:

MW, 11:40-12:55

Classroom:

Car Barn (CBN) 301

Course website:

http://www9.georgetown.edu/faculty/mad87/05/nlp/

Credits:

3

Course prerequisites:

None (open to all upperclass and graduate students)

Readings:

Course requirements:

Academic Misconduct:

As signatories to the Georgetown University Honor Pledge, and simply as good scholars and citizens, you are required to uphold academic honesty in all aspects of this course. You are expected to be familiar with the letter and spirit of the Standards of Conduct outlined in the Georgetown Honor System and on the Honor Council website. As faculty, I too am obligated to uphold the Honor System, and will report all suspected cases of academic dishonesty.

Students with Disabilities:

Students who need an accommodation based on the impact of a disability should contact me to arrange an appointment as soon as possible to discuss the course format, to anticipate needs, and to explore potential accommodations.

I rely on the Academic Resource Center for assistance in verifying the need for accommodations and developing accommodation strategies. Students who have not previously contacted the Academic Resource Center are encouraged to do so (202-687-8354; http://ldss.georgetown.edu/index.html).

Schedule:

Please note: To save paper, please do the following when printing PowerPoint (.ppt) slides:

Month  Week  Day  Date  Topic                                                Reading  Assignments
Aug.   1     W    31    Intro to class
Sep.   2     M    5     LABOR DAY, NO CLASS
             W    7     Overview (.ppt)                                      ch. 1
       3     M    12    Regular expressions & Automata (.ppt)                ch. 2
             W    14    Regular expressions & Automata
       4     M    19    Morphology & Finite-State Transducers (FSTs) (.ppt)  ch. 3
             W    21    Morphology & FSTs                                             HW1 due
       5     M    26    Spelling checking (.pdf)                             ch. 5
             W    28    Spelling/N-grams (.pdf)                              ch. 6
Oct.   6     M    3     N-grams
             W    5     Part-of-speech (POS) tagging (.ppt)                  ch. 8    HW2 due
       7     M    10    COLUMBUS DAY, NO CLASS
             W    12    POS tagging                                          app. D
       8     M    17    POS tagging
             W    19    POS tagging                                                   HW3 due
       9     M    24    Context-Free Grammars (CFGs) (.pdf)                  ch. 9
             W    26    CFGs
       10    M    31    CFGs & Parsing (.ppt)                                ch. 10
Nov.         W    2     CFGs & Parsing                                                HW4 (Perl) due
       11    M    7     Unification-based parsing (.ppt)                     ch. 11
             W    9     Unification-based parsing
       12    M    14    Probabilistic parsing (.ppt)                         ch. 12
             W    16    Probabilistic parsing                                         HW5 due
       13    M    21    Semantics/meaning (.pdf)                             ch. 14
             W    23    Semantic analysis                                    ch. 15
       14    M    28    Word Sense Disambiguation (.pdf)                     ch. 17
             W    30    Word Sense Disambiguation                                     HW6 due
Dec.   15    M    5     Wrap-up
             W    7     Project Presentations (Description)
       16    F    16    Written projects due

Disclaimer

This syllabus is subject to change. All important changes will be made in writing, with ample time for adjustment.

Markus Dickinson 2005-08-30