Ling 361, Intro to Computational Linguistics
fall 2002

Homework assignment 1

due at beginning of class, 11:40 am, Monday, Sept. 16

If you wish, you may e-mail this assignment to me as plain text or as an attachment in Microsoft Word or Rich Text Format (RTF).

1. Manning and Schütze, in section 1.4.4, discuss some very simple methods for spotting collocations in texts. Briefly propose two ways in which you might improve on their methods. If you like, you may propose methods that would work for texts in a language other than English, and give reasons why Manning and Schütze’s methods might not work well for that language.
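
As a starting point for question 1, the simplest method of this kind, counting adjacent word pairs and ranking them by raw frequency, can be sketched in Python. This is a minimal illustration and not Manning and Schütze's own code; the function name `top_bigrams`, the regular-expression tokenizer, and the sample text are all assumptions made for the example.

```python
import re
from collections import Counter


def top_bigrams(text, n=5):
    """Return the n most frequent adjacent word pairs in text.

    Hypothetical helper for illustration. Tokenization is a crude
    lowercase regex match, and pairs are counted across sentence
    boundaries; both are simplifying assumptions of this sketch.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    bigrams = zip(tokens, tokens[1:])  # each token paired with its successor
    return Counter(bigrams).most_common(n)


# Made-up sample text, not taken from any corpus.
sample = "New York is big. I like New York. New York is in the east of the country."
print(top_bigrams(sample, n=3))
```

On a tiny sample like this one the ranking looks reasonable, but it says little about how raw counts behave on a large corpus, which is where the question of improving the method arises.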

2. The Baltimore Orioles have won about 47% of their games this year, while the Boston Red Sox have won about 57%.

a. Without knowing anything else about these teams or the rest of the baseball season, what's your estimate of the probability that both teams will win their next games?

b. What assumptions did you need to make to answer part a? What other information would permit you to make a better estimate of the probability?

3. As part of an NLP system you are designing, you are interested in determining the proper attachment of prepositional phrases (PPs). For example, in sentence (1), the prepositional phrase is attached to (i.e., modifies) the verb, while in sentence (2) it is attached to the direct object:

(1) The college supplied free tickets to the students.
(2) The college supplied free tickets to the performance.

In a sample collection of documents, you have examined occurrences of 'supply' with a direct object and a following 'to' PP. You have discovered that the probability of each kind of attachment depends on the animacy of the object of the 'to' PP. Specifically, you have tabulated the following results:

Attachment          Animate         Inanimate
                    object of PP    object of PP
to verb             45              15
to direct object    5               15
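
Before working through the parts below, it may help to hold these counts in a small data structure and check the row and column totals. The Python sketch below does only that and computes no probabilities; the dictionary layout and the names `counts` and `marginal_totals` are illustrative assumptions, not part of the assignment.

```python
from collections import defaultdict

# Counts from the table, keyed by (attachment, animacy of the PP object).
counts = {
    ("verb", "animate"): 45,
    ("verb", "inanimate"): 15,
    ("direct object", "animate"): 5,
    ("direct object", "inanimate"): 15,
}


def marginal_totals(table, axis):
    """Sum counts over one key position: 0 = attachment, 1 = animacy."""
    totals = defaultdict(int)
    for key, count in table.items():
        totals[key[axis]] += count
    return dict(totals)


attachment_totals = marginal_totals(counts, 0)  # row totals
animacy_totals = marginal_totals(counts, 1)     # column totals
grand_total = sum(counts.values())
print(attachment_totals, animacy_totals, grand_total)
```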

a. What is the conditional probability that a 'to' PP is attached to the verb, given that the object of the PP is animate?

b. In looking through a similar set of documents, you discover sentence (3):

(3) The clinic supplied the missing parts to the walkers.

in which 'walkers' has two possible senses, one animate, one inanimate. Which attachment of the PP is semantically plausible for each sense?

c. If we elect to exclude the semantically implausible interpretations of sentence (3), what is your estimate of the probability that 'walkers' has the inanimate sense here?
