Open science data

Open science data is a type of open data focused on making the observations and results of scientific activities available for anyone to analyze and reuse. While the idea of open science data has been actively promoted since the 1950s, the rise of the Internet has significantly lowered the cost and time required to publish or obtain data.

History

The concept of open access to scientific data was institutionally established with the formation of the World Data Center system, in preparation for the International Geophysical Year of 1957-1958.[1] The International Council of Scientific Unions (now the International Council for Science) established several World Data Centers to minimize the risk of data loss and to maximize data accessibility, further recommending in 1955 that data be made available in machine-readable form.[2]

In 1995, GCDIS (US) stated its position clearly in On the Full and Open Exchange of Scientific Data (a publication of the Committee on Geophysical and Environmental Data, National Research Council):

"The Earth's atmosphere, oceans, and biosphere form an integrated system that transcends national boundaries. To understand the elements of the system, the way they interact, and how they have changed with time, it is necessary to collect and analyze environmental data from all parts of the world. Studies of the global environment require international collaboration for many reasons:

  • to address global issues, it is essential to have global data sets and products derived from these data sets;
  • it is more efficient and cost-effective for each nation to share its data and information than to collect everything it needs independently; and
  • the implementation of effective policies addressing issues of the global environment requires the involvement from the outset of nearly all nations of the world.
International programs for global change research and environmental monitoring crucially depend on the principle of full and open data exchange (i.e., data and information are made available without restriction, on a non-discriminatory basis, for no more than the cost of reproduction and distribution)."[3]

The last phrase highlights the traditional cost of disseminating information by print and post. The Internet has removed this cost, making data technically far easier to disseminate. It has also become correspondingly cheaper to create, sell and control many data resources, which has led to the current concerns over non-open data.

More recent uses of the term include:

  • SAFARI 2000 (South Africa, 2001) used a license informed by ICSU and NASA policies [4]
  • the human genome [5] (Kent, 2002)
  • An Open Data Consortium on geospatial data [6] (2003)
  • Manifesto for Open Chemistry [7] (Murray-Rust and Rzepa, 2004)
  • Presentations to JISC and OAI under the title "open data" [8] (Murray-Rust, 2005)
  • Science Commons launch [9] (2004)
  • First Open Knowledge Forums (London, UK), run by the Open Knowledge Foundation, on open data in relation to civic information and geodata [10] (February and April 2005)
  • The Blue Obelisk group in chemistry (mantra: Open Data, Open Source, Open Standards) (2005) doi:10.1021/ci050400b
  • The Petition for Open Data in Crystallography, launched by the Crystallography Open Database Advisory Board [11] (2005)
  • XML Conference & Exposition 2005 [12] (Connolly 2005)
  • SPARC Open Data mailing list [13] (2005)
  • First draft of the Open Knowledge Definition explicitly references "Open Data" [14] (2005)
  • XTech [15] (Dumbill, 2005),[16] (Bray and O'Reilly 2006)

In 2004, the Science Ministers of all OECD (Organisation for Economic Co-operation and Development) nations, which include most developed countries of the world, signed a declaration which essentially states that all publicly funded archive data should be made publicly available.[17] Following a request and intense discussion with data-producing institutions in member states, the OECD published in 2007 the OECD Principles and Guidelines for Access to Research Data from Public Funding as a soft-law recommendation.[18]

In 2005 Edd Dumbill introduced an "Open Data" theme in XTech.

In 2006 Science Commons [19] ran a two-day conference in Washington where the primary topic could be described as Open Data. It was reported that the amount of micro-protection of data (e.g. by license) in areas such as biotechnology was creating a tragedy of the anticommons, in which the cost of obtaining licenses from a large number of owners makes it uneconomic to do research in the area.

In 2007 SPARC and Science Commons announced a consolidation and enhancement of their author addenda.[20]

In 2010 the Panton Principles were launched,[21] advocating Open Data in science and setting out four principles with which providers must comply for their data to be Open.

Relation to open access

Much data is made available through scholarly publication, which now attracts intense debate under "Open Access". The Budapest Open Access Initiative (2001) coined this term:

By "open access" to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.

The logic of the declaration permits re-use of the data, although the term "literature" has connotations of human-readable text and can imply a scholarly publication process. In Open Access discourse the term "full-text" is often used, which does not emphasize the data contained within or accompanying the publication.

Some Open Access publishers do not require authors to assign copyright, and the data associated with these publications can normally be regarded as Open Data. Other publishers have Open Access strategies that require assignment of copyright, and it is then unclear whether the data in their publications can truly be regarded as Open Data.

The ALPSP and STM publishers have issued a statement about the desirability of making data freely available [22]:

Publishers recognise that in many disciplines data itself, in various forms, is now a key output of research. Data searching and mining tools permit increasingly sophisticated use of raw data. Of course, journal articles provide one ‘view’ of the significance and interpretation of that data – and conference presentations and informal exchanges may provide other ‘views’ – but data itself is an increasingly important community resource. Science is best advanced by allowing as many scientists as possible to have access to as much prior data as possible; this avoids costly repetition of work, and allows creative new integration and reworking of existing data.

and

We believe that, as a general principle, data sets, the raw data outputs of research, and sets or sub-sets of that data which are submitted with a paper to a journal, should wherever possible be made freely accessible to other scholars. We believe that the best practice for scholarly journal publishers is to separate supporting data from the article itself, and not to require any transfer of or ownership in such data or data sets as a condition of publication of the article in question.

However, this statement had no effect on the open availability of primary data related to publications in journals of ALPSP and STM members. Data tables provided by authors as supplements to a paper are still available to subscribers only.

Wikimedia Foundation. 2010.
