CiteSeer

CiteSeer was a public search engine and digital library for scientific and academic papers. It is often considered the first automated citation indexing system and a predecessor of academic search tools such as Google Scholar and Microsoft Academic Search. It has been replaced by CiteSeerx, and all queries to CiteSeer are redirected there. CiteSeer was created by researchers Steve Lawrence, Kurt Bollacker and Lee Giles while they were at the NEC Research Institute (now NEC Labs) in Princeton, New Jersey, USA. It actively crawled and harvested academic and scientific documents on the web and used autonomous citation indexing to permit querying by citation or by document, ranking results by citation impact. After NEC, it was hosted as CiteSeer.IST on the World Wide Web at the College of Information Sciences and Technology, The Pennsylvania State University, and held over 700,000 documents, primarily in the fields of computer and information science and engineering.
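The idea of querying by citation and ranking by citation impact can be illustrated with a minimal sketch. This is not CiteSeer's actual implementation: the function names and the toy `references` data are hypothetical, and raw citation counts stand in for whatever impact measure the real system used. The sketch assumes each paper's reference list has already been extracted and matched to paper identifiers.

```python
from collections import defaultdict

def build_citation_index(references):
    """Invert paper -> cited papers into cited paper -> set of citing papers."""
    cited_by = defaultdict(set)
    for paper, cited in references.items():
        for target in cited:
            cited_by[target].add(paper)
    return cited_by

def rank_by_citations(cited_by):
    """Return (paper, citation count) pairs, most cited first; ties by identifier."""
    counts = ((paper, len(citing)) for paper, citing in cited_by.items())
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical toy corpus: each paper maps to the papers it cites.
references = {
    "paper_a": ["paper_c", "paper_d"],
    "paper_b": ["paper_c"],
    "paper_c": ["paper_d"],
    "paper_e": ["paper_c"],
}

index = build_citation_index(references)
ranking = rank_by_citations(index)

# Query by citation: which papers cite paper_c?
print(sorted(index["paper_c"]))
# Rank by (raw) citation count.
print(ranking)
```

The inverted index answers "who cites this paper?" directly, and the same structure yields the ranking without a second pass over the documents.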

CiteSeer freely provided Open Archives Initiative metadata for all indexed documents and, where possible, linked indexed documents to other sources of metadata such as DBLP and the ACM Portal.
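Open Archives Initiative metadata is typically harvested through OAI-PMH requests, which are plain HTTP queries with a `verb` parameter. The following sketch only builds such a request URL. The endpoint `https://example.org/oai2` is a placeholder and not CiteSeer's real address, and the helper name `list_records_url` is hypothetical; `ListRecords`, `metadataPrefix`, `oai_dc` and `from` are standard OAI-PMH names.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): substitute the repository's real OAI-PMH base URL.
BASE_URL = "https://example.org/oai2"

def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL for the given repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Optional selective harvesting: only records changed since this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)

url = list_records_url(BASE_URL, from_date="2005-01-01")
print(url)
```

A harvester would fetch this URL, parse the returned XML records, and follow any resumption token the repository supplies to page through the full set.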

CiteSeer's goal was to improve the dissemination of and access to academic and scientific literature. As a non-profit service that anyone could use freely, it has been considered part of the open access movement, which seeks to change academic and scientific publishing to allow greater access to scientific literature.

The name can be read in at least two ways. As a pun: a 'sightseer' is a tourist who looks at the sights, so a 'cite seer' would be a researcher who looks at cited papers. Alternatively, a 'seer' is a prophet, so a 'cite seer' is a prophet of citations.

CiteSeer had not been comprehensively updated since 2005 due to limitations in its architecture. It held a representative sample of research documents in computer and information science but was limited in coverage because it only had access to papers that were publicly available, usually on an author's homepage, or that were submitted by an author. To overcome these limitations, a modular, open source architecture for CiteSeer was designed.

The new version and design of CiteSeer can be found at the Next Generation CiteSeer website, CiteSeerx. CiteSeer-like engines and archives usually harvest documents only from publicly available websites and do not crawl publisher websites. As a result, authors whose documents are freely available are more likely to be represented in the index.

Recent developments

Other CiteSeer engines

The CiteSeer model was extended to cover academic documents in business with SmealSearch and in e-business with eBizSearch. However, these were not maintained by their sponsors. An older version of both could once be found at BizSeer.IST but is no longer in service. For enhanced access and performance, similar versions of CiteSeer were supported at universities such as the Massachusetts Institute of Technology, the University of Zürich and the National University of Singapore. However, these versions of CiteSeer proved difficult to maintain and are no longer available.

Versions of CiteSeer have been or are available at the following links:

Other Seer-like search and repository systems have been built for chemistry (ChemXSeer) and for archaeology (ArchSeer). Another, BotSeer, was built for searching robots.txt files. All of these are built on the open source tool SeerSuite, which uses the open source indexer Lucene.

Next Generation CiteSeer (CiteSeerx)

The Next Generation CiteSeer project, CiteSeerx, funded by the National Science Foundation and initially by Microsoft Research, enhances CiteSeer both as a search engine and as a digital library and continues in the CiteSeer tradition. As an example of that tradition, CiteSeer's extension of the notion of "contribution" to acknowledgments, in addition to citations, made it the first automatically generated acknowledgment index.

CiteSeerx is designed differently from CiteSeer, with new algorithms for entity extraction and a modular, expandable, robust and scalable architecture based on the open source tool SeerSuite, which uses Solr and many other Apache projects. In this way, CiteSeerx hopes to promote the creation of other Seer-like systems. This design has allowed CiteSeerx to add a new table search feature and a feature for author disambiguation.

CiteSeerx regularly shares its data resources, such as document PDFs, ASCII text, databases and metadata, with other researchers and scholars. The data is currently distributed via rsync.

The Next Generation CiteSeer, CiteSeerx, is now available in beta [1], with nearly 2 million documents indexed, and its index continues to grow.

Wikimedia Foundation. 2010.
