MontyLingua

MontyLingua is a natural language processing (NLP) toolkit: a suite of libraries and programs for symbolic and statistical NLP, available for both the Python and Java programming languages. It is enriched with common sense knowledge about the everyday world from Open Mind Common Sense. From English sentences, it extracts subject/verb/object tuples; adjectives, noun phrases and verb phrases; and people's names, places, events, dates and times, and other semantic information. It does not require training. It was written by Hugo Liu at MIT in 2003.
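Because it is enriched with common sense knowledge, it can avoid many tagging mistakes. For example, a tagger that reads "bit" as a noun yields:

  • "(NX the/DT mosquito/NN bit/NN NX) (NX the/DT boy/NN NX)"

whereas tagging "bit" as a past-tense verb yields: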

  • "(NX the/DT mosquito/NN NX) (VX bit/VBD VX) (NX the/DT boy/NN NX)"[1]
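The bracketed chunk notation in these two examples is regular enough to read mechanically. The following is a minimal sketch and not part of MontyLingua's own API: the names parse_chunks and naive_svo are hypothetical, and the subject/verb/object rule (nearest NX before a VX, nearest NX after it) is a simplifying assumption that only covers simple declarative sentences.

```python
import re

# One chunk looks like "(NX the/DT boy/NN NX)": label, body, same label again.
CHUNK_RE = re.compile(r"\((\w+) (.*?) \1\)")


def parse_chunks(chunked):
    """Return a list of (label, [(word, tag), ...]) pairs from chunked text."""
    chunks = []
    for label, body in CHUNK_RE.findall(chunked):
        # Split each "word/TAG" token on its last slash.
        tokens = [tuple(tok.rsplit("/", 1)) for tok in body.split()]
        chunks.append((label, tokens))
    return chunks


def naive_svo(chunks):
    """Hypothetical rule: NX just before the first usable VX is the subject,
    NX just after it is the object. Returns None if no such triple exists."""
    def words(tokens):
        return " ".join(word for word, _tag in tokens)

    for i, (label, tokens) in enumerate(chunks):
        if label != "VX":
            continue
        before = [c for c in chunks[:i] if c[0] == "NX"]
        after = [c for c in chunks[i + 1:] if c[0] == "NX"]
        if before and after:
            return (words(before[-1][1]), words(tokens), words(after[0][1]))
    return None
```

Under this rule the second analysis yields a subject/verb/object tuple, while the first yields none, because with "bit" tagged as a noun there is no verb chunk left to anchor the tuple.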

Non-commercial use is free. If you intend to use the software for non-commercial, non-proprietary purposes, such as academic research, it is free and covered by the GNU GPL. It has since also been forked into a version that is fully under the GPL.

Abilities

  • MontyTokenizer: normalizes punctuation, spacing and contractions, with sensitivity to abbreviations.
  • MontyTagger: part-of-speech tagging using the Penn Treebank tagset, enriched with "Common Sense" from the Open Mind Common Sense project. Exceeds the accuracy of the Brill94 TBL tagger using default training files.
  • MontyREChunker: chunks tagged text into verb, noun, and adjective chunks (VX, NX, and AX, respectively).
  • MontyExtractor: extracts verb-argument structures, phrases, and other semantically valuable information from sentences and returns sentences as "digests".
  • MontyLemmatiser: part-of-speech sensitive lemmatisation. Strips plurals (geese-->goose) and tense (were-->be, had-->have). Includes regexps from Humphreys and Carroll's morph.lex and UPenn's XTAG corpus.
  • MontyNLGenerator: generates summaries and surface-form sentences, determines and numbers noun phrases, tenses verbs, and accounts for sentence_type.
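To make the chunking step concrete, here is a minimal sketch of one common way to chunk tagged text with regular expressions. It is not MontyLingua's implementation: the function names tag_code and toy_chunk, the one-letter tag codes, and the three patterns are all assumptions chosen for illustration, and they cover far fewer cases than a real chunker would.

```python
import re

# Patterns over one-letter tag codes (one code per token), an assumed toy grammar:
#   NX = optional determiner, any adjectives, one or more nouns
#   VX = one or more verbs or modals
#   AX = one or more adjectives not absorbed into a noun chunk
PATTERN = re.compile(r"(?P<NX>D?J*N+)|(?P<VX>V+)|(?P<AX>J+)")


def tag_code(tag):
    """Collapse a Penn Treebank tag into a single-letter code."""
    if tag == "DT":
        return "D"
    if tag.startswith("JJ"):
        return "J"
    if tag.startswith("NN"):
        return "N"
    if tag.startswith("VB") or tag == "MD":
        return "V"
    return "O"  # everything else stays outside any chunk


def toy_chunk(tagged):
    """Wrap runs of "word/TAG" tokens in (NX ... NX), (VX ... VX), (AX ... AX)."""
    tokens = tagged.split()
    codes = "".join(tag_code(tok.rpartition("/")[2]) for tok in tokens)
    out, pos = [], 0
    for match in PATTERN.finditer(codes):
        out.extend(tokens[pos:match.start()])  # unchunked tokens pass through
        label = match.lastgroup
        body = " ".join(tokens[match.start():match.end()])
        out.append("(%s %s %s)" % (label, body, label))
        pos = match.end()
    out.extend(tokens[pos:])
    return " ".join(out)
```

Because the chunker sees only tags, its output depends entirely on the tagger: toy_chunk("the/DT mosquito/NN bit/VBD the/DT boy/NN") produces three chunks with "bit" in a verb chunk, while the same sentence with "bit" tagged NN collapses into two noun chunks.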

Wikimedia Foundation. 2010.
