National Centre for Text Mining
The National Centre for Text Mining (NaCTeM) was the world’s first publicly funded text mining (TM) centre.[citation needed] It was established to provide support, advice, and information on TM technologies and to disseminate information from the larger TM community, while also providing tailored services and tools in response to the requirements of the United Kingdom academic community.
The software tools and services that NaCTeM supplies allow researchers to apply text mining techniques to problems within their specific areas of interest; examples of these tools are highlighted below. In addition to providing services, the Centre contributes to the text mining research community, both nationally and internationally, through initiatives such as UK PubMed Central.
The Centre is located in the Manchester Interdisciplinary Biocentre and is operated and organized by the University of Manchester School of Computer Science in close collaboration with the Tsujii Lab, University of Tokyo. NaCTeM contributes expertise in information extraction, natural language processing, and parallel and distributed data mining systems.
Services
TerMine is a domain-independent method for automatic term recognition that identifies the most important terms in a document and ranks them automatically.
AcroMine finds all known expanded forms of an acronym as they have appeared in Medline entries. Conversely, it can find the acronyms that have previously been used in Medline for a given expanded form. In both directions it disambiguates the results.[1]
Medie is an intelligent search engine for semantic retrieval of sentences containing biomedical correlations from Medline abstracts.
Facta+ is a MEDLINE search engine for finding associations between biomedical concepts.[2]
KLEIO is a faceted semantic information retrieval system based on MEDLINE.
Info-PubMed provides information on, and graphical representations of, biomedical interactions extracted from Medline using deep semantic parsing technology. This is supplemented with a term dictionary of over 200,000 protein/gene names and with identification of disease types and organisms.
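To illustrate what ranking terms by importance can look like, the following is a minimal sketch of C-value-style scoring, in the spirit of the multi-word term recognition approach of Frantzi, Ananiadou and Mima (2000) listed in the references. It is not TerMine's actual implementation: the function names are hypothetical, the candidate terms and their frequencies are assumed to have been extracted already (a real system would use part-of-speech tagging and a linguistic filter for that step), and the example counts are invented for illustration.

```python
import math


def is_nested(short, long):
    """Return True if word tuple `short` occurs contiguously inside the longer tuple `long`."""
    n, m = len(short), len(long)
    if n >= m:
        return False
    return any(long[i:i + n] == short for i in range(m - n + 1))


def c_value_ranking(candidate_freqs):
    """Rank multi-word candidate terms with a C-value-style score.

    `candidate_freqs` maps a tuple of words to its corpus frequency
    (hypothetical input format chosen for this sketch).
    """
    scores = {}
    for term, freq in candidate_freqs.items():
        # Frequencies of longer candidates that contain this term.
        containers = [f for other, f in candidate_freqs.items()
                      if is_nested(term, other)]
        if containers:
            # Discount occurrences explained by longer terms.
            adjusted = freq - sum(containers) / len(containers)
        else:
            adjusted = freq
        # Longer terms get a higher weight; single words would score 0.
        scores[term] = math.log2(len(term)) * adjusted
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Invented counts, for illustration only.
    example = {
        ("soft", "contact", "lens"): 5,
        ("contact", "lens"): 8,
        ("text", "mining"): 6,
    }
    for term, score in c_value_ranking(example):
        print(" ".join(term), round(score, 2))
```

In this sketch a candidate that mostly appears inside longer candidates (here "contact lens" inside "soft contact lens") is pushed down the ranking, which is the intuition behind scoring nested terms separately from free-standing ones.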
Resources
BioLexicon is a large-scale terminological resource for the biomedical domain.
GENIA is a collection of reference materials for the development of biomedical text mining systems.
References
1. Okazaki N, Ananiadou S (2006). "Building an abbreviation dictionary using a term recognition approach". Bioinformatics 22 (24): 3089–95. doi:10.1093/bioinformatics/btl534. PMID 17050571. http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=17050571.
2. Tsuruoka Y, Tsujii J, Ananiadou S (2008). "FACTA: a text search engine for finding associated biomedical concepts". Bioinformatics 24 (21): 2559–60. doi:10.1093/bioinformatics/btn469. PMC 2572701. PMID 18772154. http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&tool=sumsearch.org/cite&retmode=ref&cmd=prlinks&id=18772154.
Ananiadou S (2007). "The National Centre for Text Mining: A Vision for the Future". http://www.ariadne.ac.uk/issue53/ananiadou/
Frantzi K, Ananiadou S, Mima H (2000). "Automatic recognition of multi-word terms". International Journal of Digital Libraries 3 (2): 117–132.
Wikimedia Foundation. 2010.