Interpretation of compound nominals using WordNet
Leslie Barrett, Anthony R. Davis
Semantic Data Systems
3029 Woodland Dr. NW
Washington, DC 20008 USA
lbarrett29@hotmail.com, tdavis@onebox.com
Bonnie J. Dorr
Dept. of Computer Science
Univ. of Maryland
College Park, MD 20740 USA
Abstract. We describe an approach to interpreting noun-noun compounds within a question answering system. The system's lexicon, based on WordNet, provides the basis for heuristics that group noun-noun compounds with semantically similar words. The semantic relationship between the nouns in a compound is determined by the choice of heuristic for the compound. We discuss procedures for selecting one heuristic in cases where several can apply to a compound, the effects of lexical ambiguity, and some initial results of our methods.
1. Introduction
We describe an approach to interpreting noun-noun compounds (henceforth NNC) under development for use within a question-answering system. Because there is no explicit linguistic indication of the semantic relationship between the nouns in a compound, this is both a difficult problem and a task better suited to computational semantic methods than purely grammatical ones. Our approach uses a set of heuristics for classifying compounds according to the nouns in them, and associates each heuristic with a type of relationship between two entities. We determine which heuristics apply to a given compound by the classes of nouns specified in each heuristic and by a comparison of heuristics to choose the one that fits best according to criteria of semantic similarity. The WordNet system (Fellbaum (1998), http://www.cogsci.princeton.edu/~wn/) furnishes both a lexical database and a basis for computing measures of semantic similarity among nouns.
We believe that this work is important because it draws on a widely used lexical resource and combines it with a pragmatic and implementable approach to interpreting NNCs. Apart from its usefulness in estimating semantic similarity, WordNet offers some direct advantages as a basis for interpretation, such as its hyponymy links, which mirror the relationships between the nouns in some compounds (e.g., elm tree, turtleneck sweater). We assign a heuristic, and then a semantic relationship, to an NNC by choosing the heuristic that is semantically closest to the compound. This two-stage process is desirable because most of the heuristics are designed to cover a specific semantic domain, but the same relationship may hold in NNCs from quite different domains (e.g., both wine glass and bird cage would be assigned the same semantic relationship, but would be covered by different heuristics). We have tested different measures of semantic distance, supplemented with frequency information for different senses of ambiguous nouns. The latter is important because many nouns are ambiguous and this interacts with the heuristic choice. We have tested our ideas on a set of several hundred NNCs found in text, using a set of over a hundred heuristics and about 25 semantic relationships. Though our implementation remains incomplete, we report some preliminary results that indicate promise.
Following a brief discussion of our goals and the problems we confront in reaching them, we compare our work with two other approaches to interpreting NNCs. One, due to Vanderwende (1995), posits a relatively small number of relationships that can be assigned to the nouns in a compound. We have found that, in the context of our system, this set is smaller and less precise than what we require. On the other end of the spectrum is the abduction-based view of interpretation advocated in Hobbs et al. (1993), in which the range of possible relationships is potentially limitless and complex, but which depends on extensive knowledge representation and inference mechanisms that we strive to avoid. We take an intermediate position on the number and extensibility of relationships that are useful for interpreting most NNCs (again, in the context of our question-answering system), and then explain how our WordNet-based heuristics are formulated. We close the paper with a discussion of our results and suggestions for further research.
2. Challenges of the task
Interpretation of NNCs is known to be difficult. As noted by Finin (1980), there can be arbitrarily many possible relationships between the nouns, each appropriate for a particular context. In brief, there is little explicit linguistic information to be exploited, and there is no clearly predefined target set of relationships that we can assume. Also, compounds range from completely lexicalized to novel, on-the-fly, context-dependent formations, and from highly idiosyncratic semantically to more or less compositional; compounds may contain several nouns, bracketed ambiguously, and the individual lexical items in them are also frequently ambiguous. In this paper we do not address all of these matters. We confine ourselves to NNCs containing just two nouns, and we are concerned with semantically transparent ones, rather than the lexicalized, noncompositional ones (which we have lexicalized in numerous cases).
Some work on the problem has supposed a fairly small set of relations, somewhat like thematic roles or broad relationships of time and location. We have found this unsatisfactory for our purposes, however. There is no obvious set of discrete, distinct relations out there in the world, and we prefer not to define a set of relationships from abstract principles, but to be guided by potential usefulness in the domain of our system. At one extreme we could create a unique semantic relation between each pair of noun senses, but not all differences between relations are important. Deciding which ones are is a task that tends to be domain- and application-dependent, and we have let our intuitions guide us in developing the set of relations we use. Inevitably, this will result in cases that are hard to classify. Should shortwave broadcast be regarded as a relationship of instrument, medium, or something else between shortwave and broadcast? We leave open the possibility of defining our relations so that more than one applies. Sometimes, as in this case, a unique interpretation isn't clear (and we may not want to be forced to choose one). Because our goal is to improve a question-answering and document retrieval system, we wish to find relationships that are similar enough to be useful. Thus a mention of radio broadcasts in a question should be interpreted using a relation that is identical to, or allows us to infer, the relation between, say, program and radio, in a document mentioning radio programs or programs on the radio. In contrast, the relationship between the nouns in shortwave frequencies is sufficiently different from that between radio and broadcast that we would not wish to assign the same relation to it.
Aside from making choices in individual cases, and deciding what degree of overlap can be allowed, we also are concerned with the inventory of heuristics as a whole. The challenge involved in creating an optimal inventory of heuristics is to satisfy the criteria of accuracy and generality simultaneously. That is, each NNC encountered by the system needs to be classified into a semantic relationship that is both narrow enough to be informative, and broad enough to extend beyond the single encountered example.
3. Some recent approaches to noun-noun compound interpretation
Recent classification proposals for NN compounds address these issues from two perspectives. Vanderwende (1995) bases a set of diagnostics for NNCs on WH-word classifications. That is, the function of the relation between two nouns in a compound is to match question types. Each class, under this analysis, corresponds to a WH-word. For example, the NNC garden party is assigned to the Where class, because it answers the question: Where is the party? Effectively, then, this is a way of saying that the relationship between the members of the compound is that the location of the head (party) is given by the modifier (garden). Vanderwende's classifications are applied according to the following criteria:
Rule Type: this can be one of the following:
    Modifier-based, if the semantic feature of the attribute (i.e. the defining feature of the compound) is on the modifier N
    Head-based, if the semantic feature of the attribute is on the head N
    Head-deverbal, if the semantic feature of the attribute is found on the verb corresponding to the noun
Modifier feature: the semantic features that are tested for the modifier noun
Head feature: the semantic features that are tested for the head noun
For example, the NNC pacifist vote has the following analysis under Vanderwende's system:
Class: Who/What?
Rule Type: Head-based (because the semantic focus, being the class-correspondent in this case, is vote)
Modifier feature: +HUMAN (because pacifist represents a human)
Head Feature: +SPEECH ACT (because vote represents a speech act)
The descriptive apparatus used here basically says that the NNC pacifist vote is a speech-act type N modified by a human type N which can be used as an answer to the question Who or What. The advantage of such an approach is the ability to provide a complete semantic classification of each noun in the compound, and, in turn, to associate such classification with question types. Such an approach could be considered a holistic approach to NNC classification in that the external relations of the compound as a whole are the focus rather than the internal relations (i.e. the relationship which N1 has to N2).
Vanderwende's classification relies on the varieties of WH-words in the language in order to determine relationships assigned to NNCs. It is unclear that there is a necessary, deep, and tight connection between these two domains (except in the broadest sense, in which any modifier can be considered an answer to the question: What kind?). We have furthermore found in our work on question answering systems that more specific domain-dependent relationships between nouns prove useful in interpreting NNCs. Thus, restricting the set of relationships to the fairly broad ones associated with WH-words is not ideal for our purposes, though it provides a good starting point.
The analysis presented in Hobbs et al. (1993) takes a different approach. This analysis starts with the assumption that the salient problem involved in compound nominal resolution is defining the relationship between N1 and N2.[1] The representational apparatus chosen to represent NN relations is a first-order logic simulation whereby variables are mapped to noun-compounds. For the NNC oil sample, for example, the representation is the following:
$(\forall x, y)\ \mathit{sample}(y, x) \rightarrow \mathit{nn}(x, y)$
This handles the case where the head noun (x) is a relational noun, and the modifying noun fills one of its roles. Where the head noun is not inherently relational, three propositions are used, one for each noun, and one for the relation between them. For example, in the NNC turpentine jar, Hobbs et al. give the following formula:
$(\exists x, y)\ \mathit{turpentine}(y) \wedge \mathit{jar}(x) \wedge \mathit{nn}(y, x)$
The authors argue that proving the statement nn(y,x) constitutes determining the implicit relation between the nouns. This determination, in turn, is dependent upon knowing and representing the salient features of the conjuncts turpentine and jar. In addition, our real-world knowledge, which tells us that jars can contain turpentine, is related to the fact that jars can contain liquid. The problem of representing this connection, however, is based upon the fact that the pragmatic inference relating turpentine to liquid in this case goes the wrong way. That is, we know that jars can contain liquids, and one of the ways that a liquid can manifest itself in the world is to be turpentine. What we would like, in order to describe the pragmatic effect of turpentine in the NNC turpentine jar, is therefore the following:
$(\forall x)\ \mathit{liquid}(x) \supset \mathit{turpentine}(x)$
This, however, would be saying that all liquids in the world are turpentine. To say that all liquids with certain properties (not specified in the context given) can be turpentine, Hobbs et al. use the predicate etc:
$(\forall x)\ \mathit{liquid}(x) \wedge \mathit{etc}(x) \supset \mathit{turpentine}(x)$
One way that the authors express the meaning of the (etc) predicate is to apply an inference that they refer to as weighted abduction. This amounts to saying that the cost of interpreting elements with a low semantic contribution is high. By the same token, high semantic-contributors have correspondingly high worth for their cost. Thus, saying that something is a liquid only helps us slightly in assuming that it may be turpentine, and therefore the term liquid in the above proof would be given a lower value than the term etc., which represents the particular properties of such a liquid.
Since we also know that liquids are contained in jars, this same formula can be applied to the NNC to express the idea that a containment relation holds in NNCs:
$(\forall e, x, y)\ \mathit{contain}(e, x, y) \supset \mathit{nn}(y, x)$
The system proposed in Hobbs et al. (1993) differs from the analysis in Vanderwende (1995) in that the primary concern is to represent the relation between N1 and N2 in NNCs, as is ours. It shares with Vanderwende's system the exhaustive semantic representation of each individual NN component. While this makes the system highly extensible and appropriate for knowledge representation, it is not well suited for systems of a practical size and speed. Some of the same ideas of class inclusion are captured in the WordNet hierarchy and representable in our system without the need to state them as proofs. Thus, for example, we represent an NNC like elm tree with the relation Hyponym, because elm is a type of tree. The relation of hyponymy is already encoded in WordNet. Furthermore, all the other types of things that share properties of elms, like oak or pine, will be listed in a WordNet synset and therefore will fall under the same relation.
4. Using WordNet in determining NN relationships
Our operational definition of interpreting an NNC consists of two things: first, choosing the correct senses of each noun, which is but a part of the general lexical disambiguation problem, and second, selecting among our set of predefined relationships the one (or possibly more) that best captures how the two senses should be related.
Which relationships we choose to represent depends on how deep and detailed our system's analysis needs to be. Using only some sort of uniform mystery relation (like Hobbs et al.'s high-cost nn relation) between the nouns in any compound is unlikely to be useful, as is assuming a unique relationship for every distinct compound. Vanderwende suggests that the number of useful relationships can be fairly small, but again this question has application- and domain-dependent answers. For a medical domain, a system might make use of many relationships between disease conditions or disease-causing organisms and their host, while a single umbrella relationship might suffice for these in another domain. We have approached this issue in a less a priori fashion, seeking to find and distinguish relations where it seems important to do so, and leaving the inventory of relationships open-ended. Our inventory of relationships can be extended and enriched for specific domains, though once we have decided on a set it cannot be changed at run time. The relationships are not necessarily mutually exclusive, and one might imply others. In the latter case, we would consider a correct result to be selecting the most specific relationship that applies. For example, a relationship of EquipmentBrand (e.g., Trek bicycle) implies the broader relationship MadeBy (covering artisan jewelry and beaver dam as well as Trek bicycle). EquipmentBrand would then be the correct relationship for Trek bicycle. Note also that these relationships are intended to cover not only compound nominals, but also prepositional modifiers of noun phrases and other constructions where two entities are related. Thus our system should yield an interpretation of bicycle made by Trek that will match Trek bicycle.
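The preference for the most specific applicable relationship can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's implementation: the `IMPLIES` table and the helpers `implied_closure` and `most_specific` are hypothetical, and only the EquipmentBrand/MadeBy pair is taken from the text.

```python
# Hypothetical implication table: each relationship maps to the broader
# relationships it implies. Only EquipmentBrand -> MadeBy comes from the text.
IMPLIES = {
    "EquipmentBrand": {"MadeBy"},
    "MadeBy": set(),
}


def implied_closure(rel):
    """Return every relationship implied by rel, directly or transitively."""
    seen = set()
    stack = list(IMPLIES.get(rel, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(IMPLIES.get(current, ()))
    return seen


def most_specific(candidates):
    """Drop every candidate that is implied by another candidate."""
    implied = set()
    for rel in candidates:
        implied |= implied_closure(rel)
    return sorted(set(candidates) - implied)


# Both relationships hold for "Trek bicycle"; only the narrower one is kept.
print(most_specific(["MadeBy", "EquipmentBrand"]))
```

When only MadeBy applies (as for beaver dam), the same function returns it unchanged.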
Some additional examples of noun-noun relationships and compounds they are intended to apply to may be useful at this point. Much like Vanderwende, we have relations signifying that N1 is the Location of N2 (bedroom window or, more abstractly, network traffic) and the rarer inverse (account domain), that N1 is the Time of N2 (afternoon meeting, Renaissance art), that N1 is the Purpose of N2 (user interface, adjustment screw), that N1 is the Means or Instrument of N2 (dialup connection, radio broadcast), and so on. We also have relations that are tied to syntactic relations of verbs, which are useful for NNCs headed by nominalizations: Subject (insect flight, computer crash), Object (waste disposal), and others. Additional relations, not so obviously corresponding to Vanderwende's inventory, are Copular (target zone, child actor) and Hyponymic (elm tree, tuna fish). This is not intended to be a complete list; in all, we have about thirty such relationships.
Our lexicon is based on WordNet, but has been considerably expanded and revised. For the purposes of this paper, though, most of these modifications can be ignored. We use WordNet as a classification of noun senses[2] and as a rough measure of the semantic distance between senses. The latter is particularly important here, as our strategy for hypothesizing noun-noun relationships relies on the following assumption:
Suppose we have an NNC in which the first noun has the sense S1 and the second has the sense S2. If S1 bears the relationship R to S2 and S1' is a sense semantically close to S1 (via links in WordNet), then the same relationship may well hold between the nouns in a compound with S1' as the sense of its first noun and S2 for the second. Similarly, the same relationship may well hold for a compound in which an S2' semantically close to S2 is substituted. A clear set of examples is plant names; alongside elm tree we have oak tree, pine tree, palm tree, and, changing both nouns: berry bush, cherry tree, grape vine, and tomato plant. Troublesome counterexamples to our assumption exist, naturally (compare mosquito net and butterfly net), but it is a good initial strategy.
Starting with a prototype NNC, we can allow for similar compounds, with semantically close nouns, to be assigned the same relationship by specifying a region of the WordNet hierarchy around each noun sense. Our rule will then be that if we have another compound with first and second noun senses in the same region as the prototype's, the relationship in that compound (between N1 and N2) will be the same. We allow any subtree of WordNet noun senses to be designated as a region. We treat a heuristic for assigning a relationship as an ordered pair of sense IDs (or as a pair of WordNet synsets). A heuristic applies to an NNC iff there exists a usage of the first noun in the same synset as the heuristic's first sense ID or in a hyponym of that synset, and a usage of the second noun in the same synset as the heuristic's second sense ID or in a hyponym of that synset. Every heuristic is associated with one of the noun-noun relationships we posit. In general, there will be several heuristics associated with a noun-noun relation, because there are different portions of the WordNet hierarchy from which the nouns in semantically similar NNCs are found.
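The applicability condition just stated can be sketched in a few lines. This is only an illustration: the hierarchy, the sense inventory, the synset names, and the heuristic `tree_name` are toy stand-ins rather than actual WordNet sense IDs or heuristics from our inventory, and each toy synset is given a single hypernym for simplicity.

```python
# Toy hypernym links (child -> parent). Names are illustrative stand-ins,
# not actual WordNet sense IDs.
HYPERNYM = {
    "elm": "tree", "oak": "tree", "tree": "plant",
    "plant": "entity", "net": "artifact", "artifact": "entity",
}

# Toy sense inventory: each noun maps to the synsets of its usages.
SENSES = {
    "elm": ["elm"], "oak": ["oak"],
    "tree": ["tree"], "net": ["net"],
}


def is_within(synset, region_root):
    """True if synset is region_root itself or one of its hyponyms."""
    while synset is not None:
        if synset == region_root:
            return True
        synset = HYPERNYM.get(synset)
    return False


def applies(heuristic, n1, n2):
    """A heuristic (s1, s2) applies iff some usage of n1 lies in the region
    under s1 and some usage of n2 lies in the region under s2."""
    s1, s2 = heuristic
    return (any(is_within(s, s1) for s in SENSES[n1])
            and any(is_within(s, s2) for s in SENSES[n2]))


tree_name = ("tree", "tree")  # hypothetical heuristic for tree names
print(applies(tree_name, "oak", "tree"))  # True
print(applies(tree_name, "oak", "net"))   # False
```

A heuristic whose synsets sit at the root of the toy hierarchy, such as `("entity", "entity")`, applies to every pair of nouns here, which is why broad heuristics need to be ranked below narrow ones.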
Fig. 1. Two possible NN candidates to match the heuristic relCategoryOf : product category and font family. Word senses with direct hyponym and synonym paths shown in bold.
In Figure 1, the possible applications of the heuristic relCategoryOf to the NNCs product category and font family are shown. For the first case, product category, at least one path from each sense of each word in the compound will reach the correct heuristic. For font family, there are several senses of both font and family to consider, but we simplify here by taking only one sense of family; the three senses of font all meet the criterion of the heuristic, which requires only that the first noun of an NNC be a hyponym of entity%1:03:00::, the root of the noun hierarchy. Naturally, the chance of a match is inversely proportional to the specificity of the heuristic.
Each of our heuristics is associated with one of the noun-noun relationships discussed earlier. There may be compounds containing nouns from several areas of the WordNet hierarchy, which we need to cover with different heuristics, but which we wish to assign the same relationship. For example, the Copular relation plausibly applies to compounds in which the first word refers to some kind of indicator or target (target zone, goal area, indicator light) and to tree names (elm tree, ponderosa pine). We have two heuristics to handle these two sets of compounds, but both are associated with the same Copular relation. Since the coverage of individual heuristics is governed not so much by principles as by quirks of WordNet's organization and of English, it makes sense to keep them separate from the relationships used in interpretations.
The strategy of treating compounds with semantically close words alike, though obviously not foolproof, is flexible because it allows for any number of heuristics and noun-noun relationships to be defined, for more than one relationship to hold between the nouns in a compound, and for incremental changes in the lexicon and the heuristics that are used for determining which relationship(s) are applicable.
We will now address two questions that arise from this approach. When more than one relationship might be postulated for a compound, how do we decide which one(s) to choose? And how does lexical ambiguity of the nouns in a compound affect and interact with choosing an interpretation for the compound?
In practice, we have addressed the first issue by checking which heuristics apply, and then choosing one of those according to metrics (computed from the WordNet hierarchy) of how specific a heuristic is (the more specific, the better) or how close the synsets in the heuristic are to those in the compound (the closer, the better). Other algorithms are possible; for example, the set of heuristics might simply be applied in a fixed order, and the first one that applies would be chosen. We will discuss below some of the refinements we have tested with the specificity and closeness metrics.
The specificity metric we have used is to add the number of synsets that are descendants of the two synsets in the heuristic. While this is a crude measure of how specific the synsets in a heuristic are, it seems preferable to, say, summing the path lengths from the root to the two synsets, as these are more dependent on individual arbitrary decisions about the arrangement of noun senses in the hierarchy. Using our specificity metric, more specific heuristics will have smaller numbers, because they have fewer descendant synsets. For a given synset, we consider its descendant nodes to be dominated by it. Often, a single parent node will directly dominate multiple child nodes, creating a bushy tree. We consider such a parent node to be less specific than one that directly dominates fewer nodes.[3] Again, the essential motivation behind this measure is to choose specialized heuristics with narrow applicability over broader ones, because we can provide a more accurate noun-noun relationship for a narrower range of compounds than we could for a broader one.
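The descendant-count version of the specificity metric can be illustrated with a small sketch. The hierarchy below is a toy example with invented synset names, and the function names are hypothetical; the only thing taken from the text is the definition itself (sum of the descendant counts of the two heuristic synsets, smaller meaning more specific).

```python
# Toy hyponym links (parent -> children); names are illustrative only.
HYPONYMS = {
    "entity": ["plant", "artifact"],
    "plant": ["tree", "vine"],
    "tree": ["elm", "oak"],
    "artifact": ["net"],
}


def count_descendants(synset):
    """Count distinct synsets dominated by synset (direct or indirect)."""
    seen = set()
    stack = list(HYPONYMS.get(synset, ()))
    while stack:
        node = stack.pop()
        if node not in seen:  # guards against double counting shared nodes
            seen.add(node)
            stack.extend(HYPONYMS.get(node, ()))
    return len(seen)


def specificity(heuristic):
    """Sum of descendant counts for the two synsets; lower is more specific."""
    s1, s2 = heuristic
    return count_descendants(s1) + count_descendants(s2)


narrow = ("tree", "tree")
broad = ("entity", "plant")
print(specificity(narrow), specificity(broad))  # 4 11
```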
Table 1 shows the calculation of a rough specificity measure for heuristics, determined by summing the distances between the root node of the noun hierarchy and the two synsets specified by the heuristic. (In practice, we measure specificity by calculating the number of descendant synsets.) The closeness metric (not shown in Table 1) we define by counting the number of hyponym links on the shortest path from a synset containing a noun usage to the synset in the heuristic, for both the first and second nouns and heuristic synsets. Smaller numbers therefore indicate heuristics that are closer to the compound. Here, the motivation is that we prefer heuristics with synset pairs that are close to the senses of the nouns in the compound, again because we can provide a more accurate guess at the noun-noun relationship than we can with a more distant heuristic.
| Heuristic | N1 synset | N2 synset | Dist. root to N1 synset | Dist. root to N2 synset | Sum of dist. |
|---|---|---|---|---|---|
| relAttrOfBcgd | display%1:06:01 | background%1:06:01 | 7 | 7 | 14 |
| relConnTo_2 | entity%1:03:00 | connection%1:24:00 | 1 | 2 | 3 |
| relCommForFnctOf | act%1:03:00 | publication%1:10:00 | 1 | 4 | 5 |
| relMemberOf | group%1:03:00 | member%1:18:00 | 1 | 5 | 6 |
| relPartOf_15B | program%1:10:02 | area%1:07:00 | 5 | 4 | 9 |

Table 1. Sample set of five heuristics showing senseIDs for N1 and N2, the distance between N1, N2 and their respective unique beginners (i.e. root nodes) in the WordNet hierarchy, and the sum of the distance of N1 and N2 to their respective unique beginners. This measure of specificity is an alternative to the one discussed in the text, which totals the number of synset nodes dominated by the two synsets in the heuristic.
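The closeness metric defined in the paragraph preceding Table 1 can be sketched as follows. The hierarchy and names are again toy assumptions. Because each toy synset has exactly one hypernym, the path from a sense up to a heuristic synset is unique, so no shortest-path search is needed here; a hierarchy with multiple hypernyms per synset would require one.

```python
# Toy hypernym links (child -> parent); names are illustrative only.
HYPERNYM = {
    "elm": "tree", "tree": "plant", "plant": "entity",
    "net": "artifact", "artifact": "entity",
}


def links_up(sense, target):
    """Number of hyponym links from sense up to target, or None if target
    does not dominate sense."""
    steps = 0
    while sense is not None:
        if sense == target:
            return steps
        sense = HYPERNYM.get(sense)
        steps += 1
    return None


def closeness(heuristic, sense1, sense2):
    """Sum of link counts for both nouns; lower means a closer heuristic.
    Returns None when the heuristic does not apply to this sense pair."""
    d1 = links_up(sense1, heuristic[0])
    d2 = links_up(sense2, heuristic[1])
    if d1 is None or d2 is None:
        return None
    return d1 + d2


print(closeness(("tree", "tree"), "elm", "tree"))    # 1
print(closeness(("plant", "entity"), "elm", "net"))  # 4
```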
Figure 2 below shows a scenario in which there are two senses of two nouns in a compound. In this example, the closeness and specificity measures yield different results, as shown in Table 2.
Fig. 2. Two heuristics and their application to an NNC, in which each noun has two senses.
| N1 sense | N2 sense | Specificity h | Specificity k | Closeness h | Closeness k |
|---|---|---|---|---|---|
| N1S1 | N2S1 | * | n.a. | 6 | n.a. |
| N1S1 | N2S2 | n.a. | n.a. | n.a. | n.a. |
| N1S2 | N2S1 | n.a. | * | n.a. | 7 |
| N1S2 | N2S2 | n.a. | * | n.a. | 5 |

Table 2. Choice of heuristics for an NNC in which each noun has two senses. The two heuristics h and k apply to the NNC for some, but not all, of the possible combinations of senses. When the specificity metric is used to select a sense, heuristic h is chosen, but when the closeness metric is used, k is chosen. Here, specificity is calculated using the number of synsets dominated by the synsets in the heuristic.
In the diagram and table above, the results of applying the specificity measure are compared to the results of applying the closeness measure with both senses of each noun in the compound (i.e. for both senses of N1 and N2). The specificity measure chooses heuristic h over heuristic k, for any sense combination that is applicable, since the synsets of h dominate fewer nodes than those of k for the combination (and since h dominates both Ns, a necessary precondition for applying the heuristic). The closeness measure, on the other hand, computes a path of length 6 for h when considering sense 1 of N1 and sense 1 of N2 (i.e. 3 nodes for each N) and will not calculate k, since k does not dominate N1S1 (note: the table abbreviates sense as S). But for the combination of N1 sense 2 and N2 sense 2, h does not apply, and k has a path length of 4, less than the length for either heuristic with any other combination of senses. This means that, under the closeness metric, heuristic k is selected, along with sense 2 for N1 and sense 2 for N2.
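Selection under the closeness metric amounts to taking the minimum over all applicable combinations of heuristic and senses. The sketch below uses the closeness values listed in Table 2 and omits the combinations marked n.a.; the dictionary layout and the function name `best_by_closeness` are assumptions made for illustration. The selected combination is the same if the path length of 4 given in the text is used for k with N1S2 and N2S2.

```python
# Closeness values per (heuristic, N1 sense, N2 sense), as listed in Table 2.
# Combinations to which a heuristic does not apply (n.a.) are left out.
CLOSENESS = {
    ("h", "N1S1", "N2S1"): 6,
    ("k", "N1S2", "N2S1"): 7,
    ("k", "N1S2", "N2S2"): 5,
}


def best_by_closeness(scores):
    """Return the (combination, score) pair with the lowest closeness."""
    return min(scores.items(), key=lambda item: item[1])


(heuristic, n1_sense, n2_sense), score = best_by_closeness(CLOSENESS)
print(heuristic, n1_sense, n2_sense, score)  # k N1S2 N2S2 5
```

Choosing the heuristic in this way also fixes a sense for each noun, so heuristic selection and sense selection happen in one step.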
Combining these two measures is also a possibility. They are not truly comparable in the form we have just stated them, but we can modify the specificity metric so that it is combinable with the closeness metric. Rather than using the raw number of descendent synsets, we can use the log of this number, which is related to the average length of a path to a leaf node in the hierarchy from that point. Since the branching factor in the WordNet noun hierarchy is approximately 4, we use the log base 4 of the number of descendant synsets as the revised specificity measure (summing the two numbers, one for each element of the heuristic). This can then be added to the number obtained for the closeness metric. Weighting of the two metrics forming the combined measure can of course be varied.
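A sketch of the combined measure follows. The function names and the weight parameters `w_spec` and `w_close` are hypothetical, and the descendant counts in the example are invented. One assumption needs flagging: a synset with no descendants would give an undefined logarithm, and since the treatment of that case is not specified above, the sketch simply lets such a synset contribute zero.

```python
import math

BRANCHING = 4  # approximate branching factor of the noun hierarchy


def log_descendants(count):
    """Log base 4 of a descendant count.
    Assumption: a count of 0 contributes 0 instead of an undefined log(0)."""
    return math.log(max(count, 1), BRANCHING)


def revised_specificity(desc_n1, desc_n2):
    """Sum of the log-scaled descendant counts of the two heuristic synsets."""
    return log_descendants(desc_n1) + log_descendants(desc_n2)


def combined_score(desc_n1, desc_n2, closeness, w_spec=1.0, w_close=1.0):
    """Weighted sum of revised specificity and closeness; lower is better."""
    return (w_spec * revised_specificity(desc_n1, desc_n2)
            + w_close * closeness)


# Invented example: synsets with 16 and 64 descendants, closeness of 5.
print(round(combined_score(16, 64, 5), 6))  # 10.0
```

With equal weights, the 16- and 64-descendant synsets contribute 2 and 3 respectively, so the log-scaled specificity is on the same order as a path length and can reasonably be added to it.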
Another parameter that we will explore is the relative importance of the specificity or closeness measures for the head and modifier nouns in a compound. Summing these numbers weights them equally, but this too could be varied, so that if a particular heuristic applies, its closeness measure might depend heavily on the head, rather than the modifier.
Now we need to address how lexical ambiguity interacts with these metrics. Many of the nouns we encounter in compounds are multiply ambiguous. This is sometimes an artifact of WordNet, but most often it reflects genuine lexical ambiguity. The problem this poses should be clear; when we consider all the senses of the nouns in a compound, rather than just the intended ones, various heuristics may apply to these irrelevant senses, and some of them may have better measures than the correct heuristic does applying to the correct sense. We can circumvent this problem to the extent that we can disambiguate the nouns successfully before attempting to interpret the compound they form, but this is not always feasible. A simple strategy for weeding out rare senses of nouns that are unlikely to be the ones used in the compound is to boost the score of the most common sense (we have at our disposal usage counts for each sense from which we can determine the most common sense). Since lower scores are better ones for specificity and closeness, this translates into subtracting some points from the score of a heuristic when it applies to a compound and one or both of the nouns is assumed to be used in the most common sense. Other, more sophisticated uses of the usage count information can be imagined, but this is the one we have explored so far.
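The frequency adjustment can be sketched as below. Everything concrete here is an assumption made for illustration: the usage counts are invented, the noun and sense names are placeholders, the size of `bonus` is arbitrary, and the choice to subtract the bonus once per noun (so twice when both nouns are in their most common sense) is only one possible reading of the strategy.

```python
# Hypothetical usage counts per noun sense, invented for illustration.
USAGE_COUNTS = {
    "noun_a": {"sense_1": 40, "sense_2": 3},
    "noun_b": {"sense_1": 5, "sense_2": 12},
}


def most_common_sense(noun):
    """Sense with the highest usage count for this noun."""
    counts = USAGE_COUNTS[noun]
    return max(counts, key=counts.get)


def adjusted_score(score, n1, sense1, n2, sense2, bonus=1.0):
    """Lower scores are better, so subtract bonus for each noun whose
    candidate sense is its most common one (assumed per-noun bonus)."""
    hits = 0
    if sense1 == most_common_sense(n1):
        hits += 1
    if sense2 == most_common_sense(n2):
        hits += 1
    return score - bonus * hits


# Both candidate senses are the most common ones, so 6 drops to 4.0.
print(adjusted_score(6, "noun_a", "sense_1", "noun_b", "sense_2"))
```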
5. Results
We have tested our methods on a set of 388 NNCs, for the most part selected from source documents used in implementations of our question-answering system. We developed our heuristics in tandem with this set, so there is admittedly a danger of cheating; that is, of tailoring our heuristics to account for the compounds in our test set but failing to ensure that they handle new data well. Preliminary results on a test set of 1,219 additional NNCs are promising, however. Approximately 85% of the compounds were assigned a heuristic. While this is not as high as the assignment percentage of the run discussed here (which is above 93%), this corpus is over three times as large, so some decline in assignment percentage is expected. Further evaluation of these results is currently underway. In this section we report some early results we have obtained using the specificity and closeness metrics and the enhanced version of WordNet that our lexicon is based on. For each compound we decided which senses of the nouns and which heuristic are correct (and hence which noun-noun relationship is the correct one). We then applied the specificity and closeness metrics for ordering the heuristics (there are about 150 of them) to our set of NNCs. For 27 of the NNCs, no heuristic was applicable; for the remaining 361 at least one heuristic applies.
Generally, the results favor the closeness metric over the specificity metric (we have not yet examined the effects of combined metrics). Results with the specificity and closeness metrics appear in the following table (percentages are of the total 388, not the 361 assigned a heuristic, so they do not include the 27 cases (6.95%) in which no heuristic applied):
| Specificity correct | Specificity incorrect | Closeness correct | Closeness incorrect |
|---|---|---|---|
| 113 (29.1%) | 248 (63.9%) | 161 (41.5%) | 200 (51.5%) |

Table 3. Comparison of Specificity and Closeness Metrics in Assigning Correct Heuristics to NNCs
These numbers are lower than we wish them to be, of course, but it is worth bearing in mind that even when an erroneous heuristic is chosen, it may be associated with the same noun-noun relationship as the correct one. For example, connection period was incorrectly assigned the heuristic relTimeTakenBy under the closeness metric, instead of relActionPeriod, but both of these heuristics are associated with the relationship MeasureOf, so no harm is done. Similarly, for newsgroup discussion, the incorrect relActionInLocation_1 is closer than relInLocation_2, but both are associated with the relationship Location. Thus the effective performance is somewhat higher than these figures indicate.
There was considerable overlap in the behavior of the specificity and closeness metrics, but also noticeable differences. For 98 of the compounds in our sample, both metrics chose the correct heuristic, while in 185 cases neither did. There were 63 cases in which the closeness metric chose the correct heuristic and the specificity metric did not, but only 15 in which the specificity metric chose correctly and the closeness metric did not. One case of the latter kind is human eye, for which the heuristic relAbilityOf won out on the closeness measure over the correct relOrganismBodyPart. This happened because there is a sense of eye meaning discernment (as in "She has an eye for fresh talent.") that is just three links away in WordNet from the sense of ability used in the relAbilityOf heuristic, while the intended sense of eye (organ of sight) is likewise three links away from body part, so the second noun does not distinguish the two heuristics. The first noun does: there is a sense of human that happens to have a short path to entity (the first synset of the pair used in relAbilityOf), but a rather long path to organism (the first synset of the pair used in relOrganismBodyPart), because each larger taxonomic category (hominid, primate, mammal, vertebrate, etc.) adds a link to this path. For organisms not classed as sentient entities in our version of WordNet, the short path to entity will not exist, and the correct heuristic would likely be chosen under both metrics. This example shows some of the challenges faced when estimating semantic distance from a structure like WordNet; certain hyponymy links, such as those relating biological taxonomic categories, should perhaps be weighted less than others, but tuning the weights would involve a large amount of effort. Another possibility is to penalize the less common discernment sense of eye involved in the erroneous heuristic assignment. This method can be at least partly implemented using existing information on relative frequencies of word senses, but whether it would compensate for the long path from human to organism is questionable.
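The human eye case, and the doubt about a sense-frequency penalty, can be illustrated on a toy hierarchy. The sketch below is not our lexicon or our closeness metric: it uses simple link counting, the intermediate nodes are placeholders, and the path lengths from human (two links to entity, six to organism), the sense ranks for eye, and the per-rank penalty are all assumptions chosen for illustration. Only the three-link distances for the two senses of eye come from the discussion above.

```python
from collections import deque

# Toy hypernym links (child -> parents). Intermediate nodes are placeholders
# chosen so that path lengths resemble those described in the text.
HYPERNYMS = {
    "human": ["hominid", "sentient_entity"],
    "hominid": ["primate"],
    "primate": ["mammal"],
    "mammal": ["vertebrate"],
    "vertebrate": ["animal"],
    "animal": ["organism"],
    "sentient_entity": ["entity"],
    "eye#organ": ["sense_organ"],
    "sense_organ": ["organ"],
    "organ": ["body_part"],
    "eye#discernment": ["perception"],
    "perception": ["cognition"],
    "cognition": ["ability"],
}

# Heuristic -> (synset for the first noun, synset for the second noun).
HEURISTICS = {
    "relAbilityOf": ("entity", "ability"),
    "relOrganismBodyPart": ("organism", "body_part"),
}

# Hypothetical frequency ranks for the senses of eye (1 = most frequent).
EYE_SENSES = {"eye#organ": 1, "eye#discernment": 2}


def links_up(sense, target):
    """Fewest hypernym links from sense up to target, or None if unreachable."""
    queue = deque([(sense, 0)])
    seen = {sense}
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for parent in HYPERNYMS.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, dist + 1))
    return None


def best_cost(heuristic, penalty_per_rank=0.0):
    """Lowest link count for 'human eye' under a heuristic, over senses of eye."""
    first, second = HEURISTICS[heuristic]
    d1 = links_up("human", first)
    costs = []
    for sense, rank in EYE_SENSES.items():
        d2 = links_up(sense, second)
        if d1 is not None and d2 is not None:
            # Less frequent senses pay a penalty proportional to their rank.
            costs.append(d1 + d2 + penalty_per_rank * (rank - 1))
    return min(costs) if costs else None


def choose(penalty_per_rank=0.0):
    """Heuristic with the lowest cost; heuristics that cannot apply are skipped."""
    scored = {h: best_cost(h, penalty_per_rank) for h in HEURISTICS}
    applicable = [h for h in scored if scored[h] is not None]
    return min(applicable, key=scored.get)


for penalty in (0.0, 2.0, 5.0):
    print(penalty, choose(penalty))
```

In this toy setting the erroneous relAbilityOf costs 5 links against 9 for relOrganismBodyPart, so a penalty on the rarer sense of eye reverses the choice only when it exceeds 4 links, more than the entire three-link path from either sense of eye to its heuristic synset. This is the sense in which compensation by sense frequency alone is questionable.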
6. Conclusion
We have explored methods for interpreting NNCs based on the structure of WordNet. We have formulated heuristics to capture the likely similarity among compounds with nouns that lie in the same WordNet region. Each heuristic is associated with broader noun-noun relationships that are employed throughout the rest of our system. We have examined two metrics for choosing the best heuristic in situations where multiple heuristics can apply to an NNC, and presented some preliminary results.
We believe that this approach captures many of the benefits of the very robust system presented in Hobbs et al. (1993) without depending on intractable amounts of knowledge representation and inferencing. Systems with such dependencies prove difficult to apply in practice (see Hobbs et al.'s discussion of Rieger 1974). At the same time, we are able to provide more coverage than the system argued for in Vanderwende (1995).
Acknowledgements
We would like to thank Marti Hearst for extensive discussions, George Krupka and Stephan Greene for crucial assistance in implementing the ideas presented here, and Jamie Hamilton and Paul Jacobs for their support of these efforts.
References
Finin, T.W. (1980). The Semantic Interpretation of Compound Nominals, Ph.D. dissertation, University of Illinois, Urbana-Champaign.
Fellbaum, Christiane (ed.) (1998). WordNet: An Electronic Lexical Database, The MIT Press, Cambridge, MA.
Hobbs, Jerry R., Mark E. Stickel, Douglas E. Appelt, and Paul Martin (1993). Interpretation as Abduction, Artificial Intelligence, 63:1-2, pp. 69-142.
Rieger, C.J. III (1974). Conceptual Memory: A Theory and Computer Program for Processing the Meaning Content of Natural Language Utterances. Memo AIM-233, Stanford Artificial Intelligence Laboratory, Stanford University.
Vanderwende, Lucretia H. (1995). The Analysis of Noun Sequences using Semantic Information Extracted from Online Dictionaries, Ph.D. dissertation, Dept. of Linguistics, Georgetown University.
WordNet website: http://www.cogsci.princeton.edu/~wn/
1 Since Hobbs et al. (1993) discuss NN compounds with two members or more, their relation holds between N(n) and N(n-1).
2 WordNet distinguishes separate senses of nouns, organizing synonymous senses into synsets, which are connected to one another with various types of links. For example, woodchuck and groundhog have senses that are in the same synset. In this paper, we discuss only one type of link, that expressing hyponymy/hypernymy (woodchuck is a hyponym of rodent). Other types of WordNet links, such as part-whole links, might also prove useful in interpreting NNCs.
3 For simplicity, the distance relations shown in the chart below simply count the distance in links (the less desirable measure mentioned in the text), rather than the sum of the number of descendent synset nodes of the two heuristic synsets.