Data management plan

A data management plan is a formal document that outlines how you will handle your data both during your research and after the project is completed [1]. The goal of a data management plan is to consider the many aspects of data management, metadata generation, data preservation, and analysis before the project begins; this ensures that data are well managed in the present and prepared for preservation in the future.

Importance

Preparing a data management plan before data are collected ensures that data are in the correct format, organized well, and better annotated[2]. This saves time in the long term because there is no need to re-organize, re-format, or try to remember details about data. It also increases research efficiency since both the data collector and other researchers will be able to understand and use well-annotated data in the future. One component of a good data management plan is data archiving and preservation. By deciding on an archive ahead of time, the data collector can format data during collection to make its future submission to a database easier. If data are preserved, they are more relevant since they can be re-used by other researchers. It also allows the data collector to direct requests for data to the database, rather than address requests individually. Data that are preserved have the potential to lead to new, unanticipated discoveries, and they prevent duplication of scientific studies that have already been conducted. Data archiving also provides insurance against loss by the data collector.

Funding agencies are beginning to require data management plans as part of the proposal and evaluation process.[3]

Major Components

Information about data & data format

  • Include a description of data to be produced by the project. This might include (but is not limited to) data that are:
    • Experimental
    • Observational
    • Raw or derived
    • Physical collections
    • Models
    • Simulations
    • Curriculum materials
    • Software
    • Images
  • How will the data be acquired? When and where will they be acquired?
  • After collection, how will the data be processed? Include information about
  • Describe the file formats that will be used, justify those formats, and describe the naming conventions used.
  • Identify the quality assurance & quality control measures that will be taken during sample collection, analysis, and processing.
  • If existing data are used, what are their origins? How will the data collected be combined with existing data? What is the relationship between the data collected and existing data?
  • How will the data be managed in the short-term? Consider the following:
    • Version control for files
    • Backing up data and data products
    • Security & protection of data and data products
    • Who will be responsible for management
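
As an illustration of the naming-convention and backup points above, here is a minimal Python sketch. The naming pattern (project, site, date, version) and the function names are assumptions made for this example, not a convention prescribed by any funder or archive. The checksum comparison is one common way to confirm that a backup copy is byte-identical to the original.

```python
import hashlib
from datetime import date
from pathlib import Path


def build_filename(project: str, site: str, collected: date, version: int, ext: str) -> str:
    """Compose a name as project_site_YYYYMMDD_vNN.ext (a hypothetical convention)."""
    return f"{project}_{site}_{collected:%Y%m%d}_v{version:02d}.{ext}"


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original: Path, backup: Path) -> bool:
    """True when the backup copy has the same checksum as the original."""
    return sha256_of(original) == sha256_of(backup)
```

A date-first, zero-padded pattern like this sorts chronologically in a file listing, which is one reason such conventions are often chosen; the plan should state whichever pattern the project actually adopts.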

Metadata content and format

Metadata are the contextual details, including any information important for using data. This may include descriptions of temporal and spatial details, instruments, parameters, units, files, etc. Metadata are commonly referred to as “data about data”[4]. Consider the following:

  • What metadata are needed? Include any details that make data meaningful.
  • How will the metadata be created and/or captured? Examples include lab notebooks, hand-held GPS units, auto-saved files on instruments, etc.
  • What format will be used for the metadata? Consider the metadata standards commonly used in the scientific discipline that contains your work. There should be justification for the format chosen.
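
To make the metadata questions above concrete, here is a small Python sketch of a metadata record stored alongside a data file. The field names are hypothetical and chosen only for illustration; a real plan would normally adopt the metadata standard used in its discipline rather than an ad hoc structure like this one.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class MetadataRecord:
    """Hypothetical minimal metadata fields, not a published standard."""
    title: str
    collector: str
    start_date: str   # ISO 8601 date, e.g. "2010-06-01"
    end_date: str
    location: str
    instrument: str
    parameters: dict = field(default_factory=dict)  # parameter name -> unit


def to_json(record: MetadataRecord) -> str:
    """Serialize the record as indented JSON with keys in a stable order."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


def missing_fields(record: MetadataRecord) -> list:
    """List the names of text fields that were left empty."""
    return [name for name, value in asdict(record).items() if value == ""]
```

Checking for empty fields at collection time is one simple way to catch missing context before it is forgotten.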

Policies for access, sharing, and re-use

  • Describe any obligations that exist for sharing data collected. These may include obligations from funding agencies, institutions, other professional organizations, and legal requirements.
  • Include information about how data will be shared, including when the data will be accessible, how long the data will be available, how access can be gained, and any rights that the data collector reserves for using data.
  • Address any ethical or privacy issues with data sharing.
  • Address intellectual property & copyright issues. Who owns the copyright? What are the institutional, publisher, and/or funding agency policies associated with intellectual property? Are there embargoes for political, commercial, or patent reasons?
  • Describe the intended future uses/users for the data.
  • Indicate how the data should be cited by others. How will the issue of persistent citation be addressed? For example, if the data will be deposited in a public archive, will the dataset have a digital object identifier (DOI) assigned to it?
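
As a sketch of the citation point above, the following Python function assembles a dataset citation that ends in a resolvable DOI link. The element order (creators, year, title, archive, identifier) is an assumption modeled on common data-citation practice, and the names and DOI in the usage line are made up; the archive chosen for the project may prescribe its own format.

```python
def format_data_citation(creators, year, title, archive, doi):
    """Build a citation of the form: Creators (Year): Title. Archive. DOI link."""
    names = "; ".join(creators)
    # A DOI resolves through the https://doi.org/ proxy.
    return f"{names} ({year}): {title}. {archive}. https://doi.org/{doi}"


# Hypothetical dataset and DOI, for illustration only.
citation = format_data_citation(
    ["Doe, J.", "Roe, R."], 2010, "Example soil survey", "Example Archive", "10.1234/example"
)
```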

Long-term storage and data management

  • Researchers should identify an appropriate archive for long-term preservation of their data. By identifying the archive early in the project, the data can be formatted, transformed, and documented appropriately to meet the requirements of the archive. Researchers should consult colleagues and professional societies in their discipline to determine the most appropriate database, and include a backup archive in their data management plan in case their first choice goes out of existence.
  • Early in the project, the primary researcher should identify what data will be preserved in an archive. Usually, preserving the data in its most raw form is desirable, although data derivatives and products can also be preserved.
  • An individual should be identified as the primary contact person for archived data. That person should ensure that contact information is always kept up to date in case there are requests for data or information about data.

Budget

Data management and preservation costs may be considerable, depending on the nature of the project. By anticipating costs ahead of time, researchers can make sure that the data will be properly managed and archived. Potential expenses that should be considered are:

  • Personnel time for data preparation, management, documentation, and preservation
  • Hardware and/or software needed for data management, backing up, security, documentation, and preservation
  • Costs associated with submitting the data to an archive

The data management plan should include how these costs will be paid.

NSF Data Management Plan

All grant proposals submitted to NSF must include a Data Management Plan that is no more than two pages [5]. This is a supplement (not part of the 15-page proposal) and should describe how the proposal will conform to the Award and Administration Guide policy (see below). It may include the following:

  1. The types of data
  2. The standards to be used for data and metadata format and content
  3. Policies for access and sharing
  4. Policies and provisions for re-use
  5. Plans for archiving data

Policy summarized from the NSF Award and Administration Guide, Section 4 (Dissemination and Sharing of Research Results)[6]:

  1. Promptly publish with appropriate authorship
  2. Share data, samples, physical collections, and supporting materials with others, within a reasonable time frame
  3. Share software and inventions
  4. Investigators can keep their legal rights over their intellectual property, but they still have to make their results, data, and collections available to others
  5. Policies will be implemented via
    1. Proposal review
    2. Award negotiations and conditions
    3. Support/incentives


Wikimedia Foundation. 2010.
