Open domain question answering

In information retrieval, an open domain question answering system aims to return an answer in response to the user’s question. The returned answer takes the form of a short text rather than a list of relevant documents. To find answers, the system combines techniques from computational linguistics, information retrieval and knowledge representation.

The system takes a natural language question as input rather than a set of keywords, for example, “When is the national day of China?” The question is then transformed into a query through its logical form. Accepting natural language input makes the system more user-friendly but harder to implement, because there are many question types and the system has to identify the correct one in order to give a sensible answer. Assigning a question type is a crucial task: the entire answer extraction process relies on finding the correct question type and hence the correct answer type.

Keyword extraction is the first step in identifying the type of the input question. In some cases, certain words indicate the question type directly. For example, “Who”, “Where” and “How many” tell the system that the answer should be of type “Person”, “Location” and “Number” respectively. In the example above, the word “When” indicates that the answer should be of type “Date”. POS tagging and syntactic parsing can also be used to determine the answer type. In the example, the subject is “the national day of China”, the predicate is “is” and the adverbial modifier is “when”, so the answer type is “Date”. Unfortunately, some interrogative words such as “Which”, “What” and “How” do not give a clear answer type, because each of them can stand for more than one type. In such situations, other words in the question need to be considered. The first step is to find the words that indicate the meaning of the question. A lexical dictionary such as WordNet can then be used to understand the context.
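The cue-word step described above can be sketched as a simple lookup. This is a minimal illustration, not the method of any particular system: the table CUE_TO_TYPE, the list AMBIGUOUS_CUES and the function expected_answer_type are hypothetical names, and the labels “Ambiguous” and “Unknown” are assumptions added here to mark questions that need further analysis.

```python
import re

# Hypothetical table of interrogative cues and the answer type each implies.
# "how many" is listed so that it is tried before the bare, ambiguous "how".
CUE_TO_TYPE = [
    ("how many", "Number"),
    ("who", "Person"),
    ("where", "Location"),
    ("when", "Date"),
]

# Cues that can stand for more than one answer type.
AMBIGUOUS_CUES = ("which", "what", "how")


def _contains(cue, text):
    """Return True if the cue occurs in the text as a whole word or phrase."""
    return re.search(r"\b" + re.escape(cue) + r"\b", text) is not None


def expected_answer_type(question):
    """Guess the expected answer type from interrogative cue words alone."""
    text = question.lower()
    for cue, answer_type in CUE_TO_TYPE:
        if _contains(cue, text):
            return answer_type
    for cue in AMBIGUOUS_CUES:
        if _contains(cue, text):
            # Other words in the question would have to be examined here.
            return "Ambiguous"
    return "Unknown"


print(expected_answer_type("When is the national day of China?"))
```

A lookup like this ignores where the cue appears, so a question containing a relative clause (“What did the man who ...”) would be misclassified. Handling such cases is where parsing and lexical resources come in.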

Once the question type has been identified, an information retrieval system is used to find a set of documents containing the relevant keywords. A tagger and an NP/verb group chunker can be used to verify whether the correct entities and relations are mentioned in the retrieved documents. For questions such as “Who” or “Where”, a named entity recogniser is used to find relevant “Person” and “Location” names in the retrieved documents. Only the relevant paragraphs are selected for ranking.
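The paragraph selection step can be illustrated with a plain keyword-overlap filter. This is only a sketch under stated assumptions: it stands in for a real retrieval system and does no tagging, chunking or named entity recognition. The stopword list, the min_overlap threshold and the function names keywords and select_paragraphs are hypothetical choices made for the example.

```python
import re

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {
    "the", "of", "is", "a", "an", "in", "on",
    "when", "who", "where", "what", "which", "how",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keywords(question):
    """Return the content words of the question as a set."""
    return {token for token in tokenize(question) if token not in STOPWORDS}


def select_paragraphs(question, paragraphs, min_overlap=2):
    """Keep the paragraphs that mention at least min_overlap question keywords."""
    wanted = keywords(question)
    selected = []
    for paragraph in paragraphs:
        found = wanted & set(tokenize(paragraph))
        if len(found) >= min_overlap:
            selected.append(paragraph)
    return selected


paragraphs = [
    "The National Day of China is celebrated on 1 October.",
    "China is a country in East Asia.",
    "Bastille Day is the national day of France.",
]
question = "When is the national day of China?"
for paragraph in select_paragraphs(question, paragraphs, min_overlap=3):
    print(paragraph)
```

Raising min_overlap makes the filter stricter. With the default of 2, the paragraph about France would also pass, which shows why a later ranking stage is still needed.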

A vector space model can be used as a strategy for classifying the candidate answers. Each candidate is checked against the answer type determined in the question type analysis stage. Inference techniques can also be used to validate the candidate answers. Each candidate is then given a score according to how many question words appear near it and how close these words are to the candidate: the more and the closer, the better. The answer is then translated into a compact and meaningful representation by parsing. In the previous example, the expected output answer is “1st Oct.”
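The scoring idea (more question words, and closer ones, give a higher score) can be sketched as follows. The formula is an assumption chosen for illustration, since the text does not specify one: each question word found in the passage contributes 1 / (1 + distance), where distance is the number of tokens between the candidate and the nearest occurrence of that word. The names tokenize and score_candidate are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_candidate(question_words, passage_tokens, candidate_index):
    """Score a candidate answer located at candidate_index in the passage.

    Every question word that occurs in the passage adds 1 / (1 + distance),
    where distance is measured in tokens to its nearest occurrence.
    Words that do not occur in the passage add nothing.
    """
    score = 0.0
    for word in question_words:
        positions = [i for i, token in enumerate(passage_tokens) if token == word]
        if positions:
            distance = min(abs(i - candidate_index) for i in positions)
            score += 1.0 / (1 + distance)
    return score


passage = tokenize(
    "The national day of China is 1 October and the flag was adopted in 1949"
)
question_words = {"national", "day", "china"}

near = score_candidate(question_words, passage, passage.index("1"))
far = score_candidate(question_words, passage, passage.index("1949"))
print(near > far)
```

In this passage both “1 October” and “1949” could pass a “Date” type check, and the proximity score is what separates them. A real system would combine such a score with the type check and any inference-based validation.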

See also

Question answering

Categories:
Natural language processing
Computational linguistics
Information retrieval
