Latent Semantic Structure Indexing

Latent Semantic Structure Indexing (LaSSI) is a technique for calculating chemical similarity that is derived from latent semantic analysis (LSA).

LaSSI was developed at Merck & Co. and patented in 2007 (U.S. Patent 7,219,020) [http://patft.uspto.gov/netacgi/nph-Parser?patentnumber=7219020] by Richard Hull, Eugene Fluder, Suresh Singh, Robert Sheridan, Robert Nachbar and Simon Kearsley.

Overview

Like LSA, LaSSI constructs an occurrence matrix from a corpus of items and applies singular value decomposition (SVD) to that matrix to derive latent features. The difference lies in what the matrix counts: instead of natural language terms in documents, it records the frequency of two- and three-dimensional chemical descriptors found in the structures of a chemical database. The decomposition yields latent chemical structure concepts that can be used to calculate chemical similarities and structure-activity relationships for drug discovery.
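
The general idea can be illustrated with a minimal Python sketch using NumPy. It is not the published or patented implementation. The descriptor names, molecule labels and counts are invented for illustration, no descriptor weighting is applied, and the choice of two latent dimensions and of cosine similarity as the comparison measure are assumptions made here.

```python
import numpy as np

# Hypothetical toy data: rows are chemical descriptors, columns are molecules.
# All names and counts are invented purely for illustration.
descriptors = ["aromatic_ring", "carboxylic_acid", "amine", "halogen"]
molecules = ["mol_A", "mol_B", "mol_C", "mol_D"]
counts = np.array([
    [2, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 1, 1],
], dtype=float)


def latent_molecule_vectors(occurrence, k):
    """Map each molecule (column) to a k-dimensional latent vector."""
    # Thin SVD: occurrence = u @ diag(s) @ vt, singular values in descending order.
    u, s, vt = np.linalg.svd(occurrence, full_matrices=False)
    # Keep the k largest singular values and scale the right singular vectors.
    return (np.diag(s[:k]) @ vt[:k, :]).T


def cosine_similarity(a, b):
    """Cosine of the angle between two latent vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


vectors = latent_molecule_vectors(counts, k=2)
sim_ab = cosine_similarity(vectors[0], vectors[1])  # same descriptor profile
sim_ac = cosine_similarity(vectors[0], vectors[2])  # no shared descriptors

print(f"{molecules[0]} vs {molecules[1]}: {sim_ab:.3f}")
print(f"{molecules[0]} vs {molecules[2]}: {sim_ac:.3f}")
```

In this toy matrix, mol_A and mol_B have identical descriptor counts and so land on the same latent vector, while mol_A and mol_C share no descriptors and end up on different latent axes. Real descriptor matrices are far larger and sparser, and the number of retained dimensions would need to be tuned.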

References

* Hull, R.D., Fluder, E.M., Singh, S.B., Nachbar, R.B., Sheridan, R.P. and Kearsley, S.K. (2001) "Latent semantic structure indexing (LaSSI) for defining chemical similarity." J Med Chem, 2001 Apr 12;44(8):1177-84.

* Hull, R.D., Singh, S.B., Nachbar, R.B., Sheridan, R.P., Kearsley, S.K. and Fluder, E.M. (2001) "Chemical similarity searches using latent semantic structure indexing (LaSSI) and comparison to TOPOSIM." J Med Chem, 2001 Apr 12;44(8):1185-91.

* Singh, S.B., Sheridan, R.P., Fluder, E.M. and Hull, R.D. (2001) "Mining the chemical quarry with joint chemical probes: an application of latent semantic structure indexing (LaSSI) and TOPOSIM (Dice) to chemical database mining." J Med Chem, 2001 May 10;44(10):1564-75.


Wikimedia Foundation. 2010.