Question-Answering Systems: Answer Search and Retrieval
-
How are candidate answers found, ranked, and presented to the user?
-
Some systems use modifications of the vector-space model familiar from IR,
but there are many more sophisticated strategies.
-
Mollá Aliod, et al. use a semantically based approach, in which they
convert queries and sentences in the documents into statements in
first-order logic, and use resolution theorem-proving to try to "prove" the
query using the logical assertions derived from a document.
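To make the idea concrete, here is a minimal sketch of "proving" a query against assertions derived from a document. It is a much-simplified stand-in for resolution theorem-proving: it only matches query literals against ground facts (no rules, no negation), which corresponds to resolving a goal against unit clauses. The facts, the predicate names, and the "?x" variable convention are all assumptions made for illustration, not the representation used by Mollá Aliod, et al.

```python
# Hypothetical ground facts, as might be derived from the sentence
# "cp copies files." (predicate names are invented for this sketch).
FACTS = [
    ("copy", "e1"),             # there is a copying event e1
    ("agent", "e1", "cp"),      # the agent of e1 is cp
    ("object", "e1", "files"),  # the object of e1 is files
]

def is_var(term):
    # Convention for this sketch only: variables start with "?".
    return term.startswith("?")

def unify(literal, fact, bindings):
    """Match one query literal against one ground fact.

    Returns the extended bindings, or None if they do not match."""
    if len(literal) != len(fact) or literal[0] != fact[0]:
        return None
    new = dict(bindings)
    for q, f in zip(literal[1:], fact[1:]):
        if is_var(q):
            # Bind the variable, or check it against an earlier binding.
            if new.setdefault(q, f) != f:
                return None
        elif q != f:
            return None
    return new

def prove(query, facts, bindings=None):
    """Yield every set of bindings under which all query literals hold."""
    bindings = {} if bindings is None else bindings
    if not query:
        yield bindings
        return
    first, rest = query[0], query[1:]
    for fact in facts:
        extended = unify(first, fact, bindings)
        if extended is not None:
            yield from prove(rest, facts, extended)

# "Which command copies files?" as a conjunction of literals.
QUERY = [("copy", "?e"), ("agent", "?e", "?x"), ("object", "?e", "files")]
answers = [b["?x"] for b in prove(QUERY, FACTS)]
print(answers)
```

The binding found for ?x is the candidate answer; a query that cannot be matched yields no bindings, which is the sketch's analogue of a failed proof.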
-
Cardie, et al. add syntactic and semantic components to the vector-space approach.
(Note that their system, designed for the "50-byte answer" track of TREC, doesn't
handle how/why questions.)
- They first divide the documents into small chunks, and generate a vector-space
representation of each chunk. The chunks most similar to the query are retained
for further consideration (these are the "query-dependent summary extracts").
- Because answers to the kinds of questions the system is designed for are usually
NPs, the system looks for NPs in the summary extracts as possible answers (this is
a very simple filter, but it improved performance somewhat).
- Semantic type checking, by analyzing the question and positing restrictions on
the kinds of entities that could be answers, also improved performance.
- Finally, Cardie, et al. used a syntactic similarity measure (how many "base NPs"
appear in a summary extract) to reorder the document chunks returned and ranked by
the vector-space similarity measure. Performance improved slightly.
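The first and third steps above can be sketched in a few lines. This is an illustrative toy, not the system of Cardie, et al.: the chunk size, the raw term-frequency weighting, the question-word table, and the tiny entity lexicon are all assumptions, and the candidate NPs are supplied by hand because NP detection would need a tagger or parser.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def chunks(document, size=2):
    """Split a document into chunks of `size` sentences (size is assumed)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    return [" ".join(sentences[i:i + size]) for i in range(0, len(sentences), size)]

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_chunks(question, document, k=2, size=2):
    """Return up to k chunks most similar to the question (the summary extracts)."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(c))), c) for c in chunks(document, size)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]

# Hypothetical mapping from question word to expected answer type.
EXPECTED_TYPE = {"who": "PERSON", "where": "LOCATION", "when": "DATE"}
# Hypothetical lexicon standing in for real semantic typing of entities.
ENTITY_TYPE = {"Marie Curie": "PERSON", "Warsaw": "LOCATION",
               "Paris": "LOCATION", "1891": "DATE"}

def type_check(question, candidates):
    """Keep only candidates whose type fits the question word."""
    expected = EXPECTED_TYPE.get(tokenize(question)[0])
    if expected is None:
        return list(candidates)  # no restriction can be posited
    return [c for c in candidates if ENTITY_TYPE.get(c) == expected]

DOC = ("Marie Curie was born in Warsaw. She moved to Paris in 1891. "
       "The weather in Lisbon is mild. Many tourists visit in spring.")
QUESTION = "Where was Marie Curie born?"

extracts = top_chunks(QUESTION, DOC, k=2, size=1)
# Candidate NPs from the best extract, listed by hand for this sketch.
answers = type_check(QUESTION, ["Marie Curie", "Warsaw"])
print(extracts, answers)
```

Only the first sentence shares any terms with the question, so it is the sole extract retained, and the type check then discards the person name as a possible answer to a "where" question.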
-
Typically, Q & A systems either present an answer extracted from a document, or highlight
a passage containing the answer in a displayed document.
(3) (5)