=The Genomic Standards Consortium (GSC)=
The [http://gensc.org Genomic Standards Consortium] is an initiative working towards richer descriptions of our collection of genomes and metagenomes. Established in September 2005, this international community includes representatives from a range of major sequencing and bioinformatics centres (including NCBI, EMBL, DDBJ, JCVI, JGI, EBI, Sanger, FIG) and research institutions. The goal of the GSC is to promote mechanisms for standardizing the description of (meta)genomes, including the exchange and integration of (meta)genomic data. The number and pace of genomic and metagenomic sequencing projects will only increase as the use of ultra-high-throughput methods becomes commonplace, and standards are vital to scientific progress and data sharing.
==Key links for further information about the GSC==
The GSC homepage (http://gensc.org) contains information on all aspects of the GSC and details of all ongoing and proposed projects. One place to start is the [http://gensc.org/gc_wiki/index.php/GSC_FAQ GSC FAQ].
==Defining the scope of the information to be captured: the "Minimum Information about a Genome Sequence" (MIGS) specification==
As a first step, the GSC has developed the "Minimum Information about a Genome Sequence" (MIGS) specification, which has recently been extended to create the closely integrated "Minimum Information about a Metagenome Sequence" (MIMS) specification. Information compliant with MIGS/MIMS can be captured using the GSC's Genome Catalogue (or GCat) portal (http://gensc.org). The primary aim of developing these new standards is to ensure that researchers generating (meta)genome sequence data contribute to, and increase, the quality and quantity of the contextual (meta)data available. This allows meaningful comparisons to be performed across the entire public (meta)genome collection without hindrance, especially from an ecological and environmental perspective. A high-level electronic version of the checklist is available here: http://gensc.sourceforge.net/docs/migsmims/.
==Capturing and exchanging data: GCDML and the Genomic Rosetta Stone==
As part of implementing MIGS/MIMS in the Genome Catalogue, the checklist has been implemented in XML, a widely used language optimized for data exchange. Interest among many groups within the GSC in extending this mechanism of data capture beyond straightforward support for MIGS/MIMS led to an agreement to develop a shared mechanism for the exchange of any type of curated or calculated “contextual data”. This resource is called the “Genomic Contextual Data Markup Language” (GCDML). The long-term goal of the GSC is a widely adopted version of GCDML that supports MIGS/MIMS and is compliant with relevant ontologies. Local genome and metagenome identifiers are also being unified in a project called the Genomic Rosetta Stone (GRS).
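The general idea of capturing a checklist report in XML and checking it for completeness can be sketched as follows. This is only an illustration: the element names, attribute names and item names below are hypothetical and are not taken from the actual GCDML schema or the MIGS/MIMS checklist, and the helper functions <code>parse_report</code> and <code>missing_items</code> are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical report: element, attribute and item names are illustrative
# only and are NOT taken from the real GCDML schema or MIGS/MIMS checklist.
SAMPLE_REPORT = """\
<report id="example-0001" type="genome">
  <item name="organism">Example organism</item>
  <item name="geographic_location">50.37 N, 4.14 W</item>
  <item name="habitat">marine</item>
</report>
"""


def parse_report(xml_text):
    """Return (report id, {item name: value}) from the illustrative XML."""
    root = ET.fromstring(xml_text)
    items = {
        item.get("name"): (item.text or "").strip()
        for item in root.iter("item")
    }
    return root.get("id"), items


def missing_items(items, required):
    """List the required checklist items that are absent or empty."""
    return [name for name in required if not items.get(name)]


report_id, items = parse_report(SAMPLE_REPORT)
print(report_id, items)

# An assumed "minimum" list, used here only to show the completeness check.
print(missing_items(items, ["organism", "habitat", "collection_date"]))
```

A real exchange format would normally be validated against its published schema rather than by an ad hoc check like this; the sketch only shows why a machine-readable report makes such checks straightforward.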
==Towards shared vocabularies (controlled vocabularies and ontologies)==
The GSC is also supporting the development and use of a ‘shared language’ for describing genomes and metagenomes. Such term lists, in the form of controlled vocabularies and formalized ontologies (which establish relationships among terms), provide a powerful mechanism for rendering metadata reports that are ready for computational analyses. The GSC is a member community of the Ontology for Biomedical Investigations (OBI) initiative (http://obofoundry.org) and a founding community in the Environment Ontology (EnvO) project (http://environmentontology.org). It is also working on a minimum list of habitat terms based on EnvO.
==Towards a repository of MIGS/MIMS compliant reports: the Genome Catalogue==
The GSC is working with the wider community to promote the completion and publication of MIGS/MIMS compliant information. To help do this, the GSC has created the Genome Catalogue. The Genome Catalogue uses GCDML and the Genomic Rosetta Stone to import read-only information about genomes, and provides web forms for capturing MIGS/MIMS compliant information. It was originally developed to help prototype the MIGS/MIMS specification and capture some case study data, but it is now being redeveloped to capture real content. The GenCat software developed to create the Genome Catalogue is freely available. The GSC encourages submission of contextual data to the Genome Catalogue as well as harvesting of data (available in XML through web services).
==GSC linkages to other groups==
The GSC is interested in building links with other communities; see the "Related Projects" link on the main menu of the [http://gensc.org GSC homepage]. As stated above, the GSC is engaged in ontology development within the [http://obofoundry.org OBO Foundry]. The GSC is also a founding member community of the [http://mibbi.sf.net Minimum Information about a Biomedical or Biological Investigation (MIBBI)] project, an umbrella community for supporting and co-ordinating the development of checklists describing Minimum Information Standards.
==GSC Publications==
The GSC maintains a list of publications on its wiki: [http://gensc.org/gc_wiki/index.php/GSC_Publications GSC Publications]. This list includes reports from all workshops, articles from the special issue of the journal OMICS on data standards, and the publication describing the MIGS/MIMS specification in the journal "Nature Biotechnology" (May 2008).
Wikimedia Foundation. 2010.