Million Book Project

The Million Book Project (or the Universal Library) is a book digitization project led by the Carnegie Mellon University School of Computer Science and University Libraries.^[1] Working with government and research partners in India (Digital Library of India) and China, the project is scanning books in many languages, using OCR to enable full-text searching, and providing free-to-read access to the books on the web. As of 2007, the project had scanned 1 million books and made the entire database accessible at http://www.ulib.org.

Contents

1 Description

2 Partner institutions

2.1 China

2.2 India

2.3 USA

3 See also

4 References

5 External links

Description

Twenty-two scanning centers are operating in India, including four mega-centers. Eighteen centers are running in China, including a mega-center in a free-trade zone to avoid customs delays with shipments of books from the United States. Materials are also being scanned in Egypt, in Hawaii, and at Carnegie Mellon.

By December 2007 more than 1.5 million books had been scanned, in 20 languages, including 970,000 in Chinese, 360,000 in English, 50,000 in Telugu, and 40,000 in Arabic.^[2] Most of the books are in the public domain, but permission has been acquired to include over 60,000 copyrighted books (roughly 53,000 in English and 7,000 in Indian languages). The books are mirrored in part at sites in India and China and at Carnegie Mellon, the Internet Archive, and Bibliotheca Alexandrina. The books that have been scanned to date are not yet all available online, and no single site has copies of all the books that are available online.
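As a rough check on the figures quoted above, the four named languages can be tallied against the reported total. The sketch below is illustrative only: the variable and function names are hypothetical, the counts are the approximate numbers given in the text, and "more than 1.5 million" is treated as a lower bound of 1,500,000.

```python
# Approximate per-language counts quoted in the text (December 2007).
scanned_by_language = {
    "Chinese": 970_000,
    "English": 360_000,
    "Telugu": 50_000,
    "Arabic": 40_000,
}

# Assumption: "more than 1.5 million" is used here as a lower bound.
reported_total = 1_500_000


def summarize(counts, total):
    """Return the named subtotal, the remainder, and each language's share of total."""
    named = sum(counts.values())
    shares = {lang: n / total for lang, n in counts.items()}
    return named, total - named, shares


named_total, remainder, shares = summarize(scanned_by_language, reported_total)

for lang, share in shares.items():
    print(f"{lang}: {share:.1%}")
print(f"Named languages: {named_total:,}; other languages: at least {remainder:,}")
```

Under these assumptions the four named languages cover about 1.42 million books, which leaves at least 80,000 spread across the remaining 16 languages.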

The Million Book Project will provide a wide array of content, but one of its collection strengths will be agriculture. In partnership with the United Nations Food and Agriculture Organization, the United States National Agricultural Library, and university libraries with quality agriculture collections, the project is digitizing materials and developing plans for a knowledge network to improve rural community access to critical agricultural information.

Significant research is underway in the project, including OCR for Indian and Arabic languages and scripts. The research also includes developments in machine translation, automatic summarization, image processing, large-scale database management, user interface design, and strategies for acquiring copyright permission at an affordable cost. Indian partners have developed a translating and transliterating user interface. Partners in Egypt are developing an interface that supports annotation and highlighting. Partners in China have made remarkable progress on content-based image retrieval and machine analysis of calligraphic scripts. Carnegie Mellon has taken strides in machine translation and automatic summarization.

The National Science Foundation (NSF) awarded Carnegie Mellon $3.63M over four years for equipment and administrative travel for the Million Book Project. India is providing $25M annually to support language translation research projects. The Ministry of Education in China is providing $8.46M over three years. The Internet Archive has provided equipment, staff and money. The University of California Libraries at Merced funded the work to acquire copyright permission from U.S. publishers.

India, China, and the USA agreed in November 2005 to join the Open Content Alliance (OCA), initiated by Brewster Kahle and the Internet Archive, because the goals of the OCA are consistent with those of the Million Book Project and the Universal Digital Library.

Partner institutions

China

The institutions in China which are participants in this project include:^[1]

Ministry of Education of the People's Republic of China

Chinese Academy of Science

Fudan University

Nanjing University

Peking University

Tsinghua University

Zhejiang University

North-East Normal University

India

The institutions in India which are participants in this project include:^[1]

Indian Institute of Science, Bangalore

International Institute of Information Technology, Hyderabad

Indian Institute of Information Technology, Allahabad

Anna University, Chennai

Mysore University, Mysore

University of Pune, Pune

Goa University, Goa

Tirumala Tirupati Devasthanams, Tirupathi

Shanmugha Arts, Science, Technology & Research Academy, Tanjore

Arulmigu Kalasalingam College of Engineering, Srivilliputhur

Maharashtra Industrial Development Corporation, Mumbai

USA

The institutions in the U.S. which are participants include:^[1]

Indiana University

Pennsylvania State University

Stanford University

TriColleges (Swarthmore, Haverford, Bryn Mawr)

University of California, Berkeley

University of California, Merced

University of Pittsburgh

University of Washington

See also

Digital library

List of digital library projects

Digital preservation

Repository (publishing)

Universal library

Book scanning

References

1. "ULIB About Us". Carnegie Mellon University. http://www.ulib.org/ULIBAboutUs.htm.

2. "The Million Book Project - 1.5 million scanned!". London Business School Library. http://lbslibrary.typepad.com/bizresearch/2007/11/the-million-boo.html.

External links

The Universal Digital Library

The Million Book Digital Library Project

Frequently Asked Questions

The Million Book Project Database

(Chinese) Universal Library, China site

Universal Digital Library at Allahabad

Digital Library of India

Internet Archive:

the archived pilot

larger partial collection

