=The 'EB-eye' - EBI's Search Engine for biological databases=
The European Bioinformatics Institute (EBI) is a non-profit academic organisation that forms part of the European Molecular Biology Laboratory ([http://www.embl.org EMBL]).
The EBI is a centre for research and services in bioinformatics. The Institute manages databases of biological data, including nucleic acid sequences, protein sequences and macromolecular structures.
The Mission of the EBI
* To provide freely available data and bioinformatics services to all facets of the scientific community in ways that promote scientific progress
* To contribute to the advancement of biology through basic investigator-driven research in bioinformatics
* To provide advanced bioinformatics training to scientists at all levels, from PhD students to independent investigators
* To help disseminate cutting-edge technologies to industry
What is the EB-eye Search?
The EB-eye is built on top of the [http://lucene.apache.org/ Apache Lucene] project framework, an open-source, high-performance, full-featured text search engine library written entirely in Java. It uses this technology to index EBI databases in various formats (e.g. flatfiles, XML dumps, OBO format) and provides very fast access to the EBI's data resources. The system allows the user to search globally across all EBI databases, or individually in selected resources by using the Advanced Search.
Access to the EB-eye
* From the top of any page on the EBI web portal.
* From [http://www.ebi.ac.uk/ebisearch/ http://www.ebi.ac.uk/ebisearch/].
Global Search
The global search is available at the top of all [http://www.ebi.ac.uk/ EBI web pages]. Simply type query terms into the text box there and press GO (or press Enter). The system then displays a summary page with the names of the knowledge domains and the number of matches found in each. The user can expand or collapse one or all domains by clicking the relevant '+' or '-' signs on the page. When a domain is expanded, each data resource in it is shown along with the number of matches found.
Examples
* [http://www.ebi.ac.uk/ebisearch/search.ebi?db=allebi&query=insulin+receptor Insulin receptor]
* [http://www.ebi.ac.uk/ebisearch/search.ebi?db=allebi&query=p53 P53]
* [http://www.ebi.ac.uk/ebisearch/search.ebi?db=allebi&query=external+services External Services group]
* [http://www.ebi.ac.uk/ebisearch/entryreferencesearch.ebi?db=allebi&relateddb=taxonomy&relatedid=9913 Bos taurus (cow) data on the EB-eye]
* [http://www.ebi.ac.uk/ebisearch/search.ebi?query=escherichia+%21coli&db=allebi escherichia NOT coli]
* [http://www.ebi.ac.uk/ebisearch/search.ebi?db=allebi&query=C2H2+zinc+finger+family+&FormsButton3=Go C2H2 zinc finger family]
* [http://www.ebi.ac.uk/ebisearch/search.ebi?db=allebi&query=DNA+binding DNA binding]
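Most of the example links above follow the same pattern: the `search.ebi` page, a `db` parameter (`allebi` for the global search) and a URL-encoded `query`. The following is a minimal Python sketch of building such a link. It assumes that the URL pattern seen in these examples still applies, and the helper name `global_search_url` is purely illustrative; it is not part of any EBI API.

```python
from urllib.parse import urlencode

# Base address taken from the example links in this section.
EBISEARCH_URL = "http://www.ebi.ac.uk/ebisearch/search.ebi"


def global_search_url(query, db="allebi"):
    """Build an EB-eye search link (hypothetical helper, not an EBI API)."""
    # urlencode turns spaces into '+' and escapes characters such as '!'.
    return EBISEARCH_URL + "?" + urlencode({"db": db, "query": query})


print(global_search_url("insulin receptor"))
print(global_search_url("escherichia !coli"))
```

The first call reproduces the "Insulin receptor" link listed above; the second corresponds to the "escherichia NOT coli" example, with the parameters in a different order.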
Advanced Search
These searches are available from the [http://www.ebi.ac.uk/ebisearch/advancedsearch.ebi 'Advanced Search' page]. The page first displays four text boxes, one for each query type that the system supports, and defaults to searching all EBI data resources. The 'domain-specific search' on this page takes the user to an easy-to-use wizard where it is possible to select data resources individually and choose which fields to query. When multiple data resources are available in a domain, the user can select all or just one. After the databases and fields have been selected, the four 'Advanced search' text boxes appear so that the user can type the query terms of interest. See also the advanced search examples on this page.
What can the user Search for?
Many of the text fields of EBI data resources are indexed in the search engine, but some are not. As a result, a search here may return different results from the same search in another search engine. As a rule, identifiers, names, descriptions, keywords and cross-references are indexed. More specific fields will be indexed at a later date as the quality of the data feeds improves.
Using the Advanced Search and selecting a data resource brings the user to the select-fields dialog, which shows what has been indexed. For example, in UniProt the following fields are available: id, accession numbers, creation date, last modification date, sequence last modification date, description and keywords.
Examples
* Searching for description:(azurin) in UniProt KB: [http://www.ebi.ac.uk/ebisearch/search.ebi?query=description%3A%28azurin%29&db=uniprot UniProt description=azurin]
* Searching for paired box protein in all databases at the EBI: [http://www.ebi.ac.uk/ebisearch/search.ebi?query=paired+box+protein+%21fragment++%21paxillin&db=allebi AllEBI paired box protein, but not fragment or paxillin]
* description:(paired box protein) INTERPRO:(IPR001523) in UniProt KB: [http://www.ebi.ac.uk/ebisearch/search.ebi?query=description%3A%28paired+box+protein%29++INTERPRO%3A%28IPR001523%29&db=uniprot UniProt (description:paired box protein) with cross-references to InterPro:IPR001523]
* Finding environmental samples in All EBI: [http://www.ebi.ac.uk/ebisearch/search.ebi?db=allebi&query=%28bacterium+OR+organism%29+AND+%28unidentified+OR+uncultured%29 (bacterium OR organism) AND (unidentified OR uncultured)]
* Ca(2+) oscillation models: [http://www.ebi.ac.uk/ebisearch/search.ebi?db=biomodels&query=Ca+oscillation Calcium oscillation]
It is also possible to search using cross-references. The same Advanced Search select-fields dialog shows which cross-references are indexed.
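The field-scoped queries in the examples above take the form `field:(terms)`, and a `!` in front of a term excludes it. The sketch below composes such query strings in Python and URL-encodes them into search links. The helper names (`field_clause`, `exclude`, `search_url`) are illustrative assumptions, and the field and database names are only those that appear in the examples; whether a given field is indexed for a resource should be checked in the select-fields dialog.

```python
from urllib.parse import urlencode

# Base address taken from the example links in this section.
EBISEARCH_URL = "http://www.ebi.ac.uk/ebisearch/search.ebi"


def field_clause(field, terms):
    """Restrict terms to one indexed field, e.g. description:(azurin)."""
    return f"{field}:({terms})"


def exclude(term):
    """Mark a term as excluded, using the '!' prefix seen in the examples."""
    return f"!{term}"


def search_url(query, db):
    """Build a search link for one data resource (hypothetical helper)."""
    return EBISEARCH_URL + "?" + urlencode({"query": query, "db": db})


# description:(azurin) in UniProt
print(search_url(field_clause("description", "azurin"), "uniprot"))

# A description clause combined with an InterPro cross-reference clause
query = " ".join([
    field_clause("description", "paired box protein"),
    field_clause("INTERPRO", "IPR001523"),
])
print(search_url(query, "uniprot"))

# Free-text terms with exclusions, searched across all EBI resources
query = " ".join(["paired box protein", exclude("fragment"), exclude("paxillin")])
print(search_url(query, "allebi"))
```

Keeping the query composition separate from the URL encoding means the same query string can also be pasted directly into the search box.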
Help & FAQ on EB-eye
Further pages describing the syntax for querying this search engine are available on the [http://www.ebi.ac.uk/inc/help/search_help.html EBI's web site].
Programmatic access to the EB-eye
EB-eye is accessible using Web Services technologies from the EBI. Please see the [http://www.ebi.ac.uk/Tools/webservices/ main Web Services pages at the EBI]. A [http://www.ebi.ac.uk/ebisearch/service.ebi?wsdl WSDL] (Web Services Description Language) document is available.
Other Lucene-based search engines in biology/bioinformatics
Lucene has been around for a while now, and many bioinformatics centres have been experimenting with its use on biological data and databases. A pioneering development in this field is [http://www.gmod.org/lucegene LuceGene], headed by Dr. Don Gilbert at Indiana University, a part of the [http://www.gmod.org GMOD] (Generic Software Components for Model Organisms Databases) initiative. Another example is the search engine in the [http://beta.uniprot.org/ new UniProt web site], which is also based on Lucene and adds features such as sorting large data sets, subqueries across data sets and group-by queries.
External links
* http://www.embnet.org/ Read the EB-eye article in the latest issue of EMBnet.News
Wikimedia Foundation. 2010.