- Protein Data Bank
The Protein Data Bank (PDB) is a repository for 3-D structural data of proteins and nucleic acids. These data, typically obtained by X-ray crystallography or NMR spectroscopy and submitted by biologists and biochemists from around the world, are released into the public domain and can be accessed for free. See also protein structure.
History
Founded in 1971 by Drs. Edgar Meyer and Walter Hamilton at Brookhaven National Laboratory, the Protein Data Bank was managed by Tom Koetzle until 1994 and then by Joel Sussman until 1999, when it was transferred to members of the [http://home.rcsb.org/ Research Collaboratory for Structural Bioinformatics (RCSB)]. Rutgers University is the lead site and is currently under the direction of Helen M. Berman. [ [http://rutchem.rutgers.edu/content_dynamic/faculty/helen_m_berman.shtml Helen M. Berman - Rutgers Chemistry and Chemical Biology Department] ]
The [http://www.wwpdb.org Worldwide Protein Data Bank] (wwPDB) consists of organizations that act as deposition, data processing and distribution centers for PDB data. The founding members are [http://www.rcsb.org/pdb RCSB PDB (USA)], [http://www.ebi.ac.uk/msd MSD-EBI (Europe)] and [http://www.pdbj.org PDBj (Japan)]. The [http://www.bmrb.wisc.edu/ BMRB (USA)] group joined the wwPDB in 2006. The mission of the wwPDB is to maintain a single Protein Data Bank Archive of macromolecular structural data that is freely and publicly available to the global community.
The PDB is a key resource in structural biology and is critical to more recent work in structural genomics. Numerous derived databases and projects have been developed to integrate and classify the PDB in terms of protein structure, protein function and protein evolution.
Growth
When the PDB was founded it contained just 7 protein structures. Since then the number of structures has grown approximately exponentially, with no sign of levelling off.
The [http://www.rcsb.org/pdb/statistics/contentGrowthChart.do?content=total&seqid=100 growth rate] of the PDB has been the subject of fairly extensive analysis.
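As a rough illustration of what approximately exponential growth means here, the sketch below estimates an average annual growth rate and doubling time from two figures given in this article: 7 structures at the founding in 1971 and 51,491 released entries in June 2008. The function name average_growth is hypothetical, and the calculation assumes a single constant rate over the whole period, which the real archive did not follow exactly.

```python
import math


def average_growth(n_start, n_end, years):
    """Return (annual growth rate, doubling time in years).

    Assumes pure exponential growth n_end = n_start * exp(k * years),
    which is only an approximation of how the archive actually grew.
    """
    k = math.log(n_end / n_start) / years  # continuous growth constant
    annual_rate = math.exp(k) - 1          # equivalent year-over-year rate
    doubling_time = math.log(2) / k        # years needed to double
    return annual_rate, doubling_time


# Figures quoted in this article: 7 structures (1971), 51,491 entries (2008).
rate, doubling = average_growth(7, 51_491, 2008 - 1971)
print(f"{rate:.1%} per year, doubling every {doubling:.1f} years")
```

With these inputs the average works out to roughly 27% per year, a doubling time of a little under three years.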
Contents
As of 24 June 2008, the database contained 51,491 released atomic coordinate entries (or "structures"), 47,526 of them proteins, the rest being nucleic acids, nucleic acid-protein complexes, and a few other molecules. About 5,000 new structures are released each year. Data are stored in the mmCIF format specifically developed for the purpose. It is estimated that the size of the PDB archive will triple to 150,000 structures by the year 2014. [ [http://www.rcsb.org/pdb/static.do?p=general_information/news_publications/news/news_2008.html#20080408 PDB Archive Contains More Than 50,000 Structures], 2008-04-08, accessed 2008-04-12 ]
Note that the database stores information about the exact location of all atoms in a large biomolecule (although usually without the hydrogen atoms, as their positions are more of a statistical estimate); if one is only interested in "sequence data", i.e., the list of amino acids making up a particular protein or the list of nucleotides making up a particular nucleic acid, the much larger databases from Swiss-Prot and the International Nucleotide Sequence Database Collaboration should be used.
Statistics
As of 19 August 2008, the "PDB Holdings List" at [http://www.rcsb.org/pdb/statistics/holdings.do RCSB] reported the following statistics: 33,017 structures in the PDB have a structure factor file, and 4,054 structures have an NMR restraint file. Note that theoretical models are no longer accepted in the PDB.
The current breakdown of holdings is updated [http://www.rcsb.org/pdb/statistics/holdings.do weekly].
File format
Over the years the PDB file format has undergone many changes and revisions. Its original format was dictated by the width of computer punch cards.
* [http://www.wwpdb.org/documentation/format23/v2.3.html PDB Format Guide - Prepared by the PDB Staff at BNL] The PDB format specification can be found here; read it before working with the raw data.
* PDB data are also available in an XML representation, the [http://pdbml.rcsb.org/ PDBML] format.
* [ftp://ftp.wwpdb.org/pub/pdb/ ftp.wwpdb.org] The raw data can be downloaded from here.
* PDB format files can be downloaded using HTTP with URLs like this: [http://www.pdb.org/pdb/files/4hhb.pdb.gz http://www.pdb.org/pdb/files/4hhb.pdb.gz]
* PDBML (XML) files can be downloaded using HTTP with URLs like this: [http://www.pdb.org/pdb/files/4hhb.xml.gz http://www.pdb.org/pdb/files/4hhb.xml.gz]
* [ftp://ftp.ebi.ac.uk/pub/databases/rcsb ftp.ebi.ac.uk/pub/databases/rcsb/] Alternate download location for the PDB archive.
* [http://www.rcsb.org/pdb/static.do?p=general_information/pdb_statistics/index.html&tb=false www.pdb.org] Statistics about the PDB can be found here.
The legacy format has caused many problems, and consequently there are 'clean-up' projects:
* [http://www.ncbi.nih.gov/Structure/MMDB/mmdb.shtml The Molecular Modeling DataBase (MMDB)] from NCBI
* [http://www.wwpdb.org wwPDB]
The MMDB uses ASN.1 (and an XML conversion of this format). The wwPDB members RCSB PDB, MSD-EBI, and PDBj are working together to make the data uniform across the archive. Some believe this to be desirable; others argue that, without a universal repository of information (i.e., a common dictionary), it is not possible to draw comparisons.
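The punch-card origin of the legacy format shows up as fixed-width records of up to 80 columns, where each field is identified by its column position rather than by a delimiter. The following is a minimal sketch of reading one ATOM record that way. The function name parse_atom_record is hypothetical, the sample values are illustrative and not taken from any particular entry, and the column positions follow the ATOM record layout in the PDB format guide linked above; check them against the version of the guide in use.

```python
def parse_atom_record(line):
    """Split one ATOM line into typed fields using fixed column slices."""
    line = line.ljust(80)  # pad short lines so every slice is in range
    return {
        "record": line[0:6].strip(),         # columns  1-6
        "serial": int(line[6:11]),           # columns  7-11
        "name": line[12:16].strip(),         # columns 13-16
        "res_name": line[17:20].strip(),     # columns 18-20
        "chain_id": line[21],                # column  22
        "res_seq": int(line[22:26]),         # columns 23-26
        "x": float(line[30:38]),             # columns 31-38
        "y": float(line[38:46]),             # columns 39-46
        "z": float(line[46:54]),             # columns 47-54
        "occupancy": float(line[54:60]),     # columns 55-60
        "temp_factor": float(line[60:66]),   # columns 61-66
        "element": line[76:78].strip(),      # columns 77-78
    }


# Illustrative record, assembled field by field so the widths are visible.
SAMPLE = "".join([
    "ATOM  ",    # record name
    "    1",     # atom serial number
    " ",         # unused column 12
    " N  ",      # atom name
    " ",         # alternate location indicator
    "VAL",       # residue name
    " ",         # unused column 21
    "A",         # chain identifier
    "   1",      # residue sequence number
    " " * 4,     # insertion code and unused columns 28-30
    "   6.204",  # x coordinate
    "  16.869",  # y coordinate
    "   4.854",  # z coordinate
    "  1.00",    # occupancy
    " 49.05",    # temperature factor
    " " * 10,    # columns 67-76, not read here
    " N",        # element symbol
])

atom = parse_atom_record(SAMPLE)
print(atom["name"], atom["res_name"], atom["chain_id"], atom["x"], atom["y"], atom["z"])
```

Slicing by position rather than splitting on whitespace matters because adjacent fields may touch when values are wide, and some fields may legitimately be blank.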
Each structure published in PDB receives a four-character alphanumeric identifier, its PDB ID. This should not be used as an identifier for biomolecules, since often several structures for the same molecule (in different environments or conformations) are contained in PDB with different PDB IDs.
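The identifier rule and the HTTP download patterns listed above can be combined in a short sketch. It checks only what this article states, namely that a PDB ID is four alphanumeric characters, and fills the ID into the two example URL patterns. The names download_urls, PDB_ID_PATTERN and BASE_URL are hypothetical, lower-casing the ID is an assumption based on the example URLs, and the code performs no network access, so it does not confirm that an entry exists or that the URLs are still served.

```python
import re

# Four alphanumeric characters, as described in this article.
PDB_ID_PATTERN = re.compile(r"[0-9A-Za-z]{4}")

# Base of the example download URLs quoted in this article.
BASE_URL = "http://www.pdb.org/pdb/files"


def download_urls(pdb_id):
    """Return the PDB-format and PDBML download URLs for one PDB ID."""
    if not PDB_ID_PATTERN.fullmatch(pdb_id):
        raise ValueError(f"not a four-character alphanumeric PDB ID: {pdb_id!r}")
    entry = pdb_id.lower()  # assumption: the example URLs use lower case
    return {
        "pdb": f"{BASE_URL}/{entry}.pdb.gz",
        "pdbml": f"{BASE_URL}/{entry}.xml.gz",
    }


urls = download_urls("4HHB")
print(urls["pdb"])
print(urls["pdbml"])
```

Because several PDB IDs can describe the same molecule, a function like this identifies a deposited structure, not a biomolecule.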
If a biologist submits structure data for a protein or nucleic acid, wwPDB staff review and annotate the entry. The data are then automatically checked for plausibility. The source code for this validation software has been released for free. The main database accepts only experimentally derived structures, not theoretically predicted ones (see protein structure prediction).
Various funding agencies and scientific journals now require scientists to submit their structure data to PDB.
Viewing the data
The structural data can be used to visualize the biomolecules with appropriate software, such as VMD, RasMol, PyMOL, Jmol, MDL Chime, QuteMol, a web browser VRML plugin, or any web-based software designed to visualize and analyse protein structures, such as STING. A recent desktop software addition is Sirius. The RCSB PDB website also contains resources for education, structural genomics, and related software.
References
Printed
* Berman H, Henrick K, Nakamura H (Dec 2003). "Announcing the worldwide Protein Data Bank". Nat. Struct. Biol. 10 (12): 980. PMID 14634627. doi:10.1038/nsb1203-980.
* Berman HM, Westbrook J, Feng Z, et al. (Jan 2000). "The Protein Data Bank". Nucleic Acids Res. 28 (1): 235–42. PMID 10592235. PMC 102472. http://nar.oxfordjournals.org/cgi/pmidlookup?view=long&pmid=10592235
* Meyer EF (Jul 1997). "The first years of the Protein Data Bank". Protein Sci. 6 (7): 1591–7. PMID 9232661. PMC 2143743. http://www.proteinscience.org/cgi/pmidlookup?view=long&pmid=9232661
* Sussman JL, Lin D, Jiang J, et al. (Nov 1998). "Protein Data Bank (PDB): database of three-dimensional structural information of biological macromolecules". Acta Crystallogr. D Biol. Crystallogr. 54 (Pt 6 Pt 1): 1078–84. PMID 10089483.
See also
Crystallographic database
External links
* [http://www.wwpdb.org/ The Worldwide Protein Data Bank (wwPDB)] - parent site to regional hosts (below)
* [http://www.pdb.org RCSB Protein Data Bank] - home page
* [http://www.ebi.ac.uk/msd MSD-EBI] - home page
* [http://www.pdbj.org Protein Data Bank Japan] - home page
* The [http://pdbwiki.org/index.php/PDB_FAQ PDB FAQ] - frequently asked questions about the PDB and working with structural models.
Related (derived) resources
* Macromolecular Structure Database - [http://www.ebi.ac.uk/msd/ MSD Home Page], a project for data about macromolecular structures from the PDB.
* PDBWiki - [http://PDBWiki.Org PDBWiki Home Page], a website for community annotation of PDB structures.
* PDBsum - [http://www.ebi.ac.uk/thornton-srv/databases/pdbsum/ PDBsum Home Page], an overview of macromolecular structures in the PDB.
* Proteopedia - [http://www.proteopedia.org Proteopedia Home Page], the collaborative 3D encyclopedia of proteins and other molecules.
Enzyme database data
* [ftp://ftp.ebi.ac.uk/pub/contrib/mcneil/pdb_sws_mapping.lst.gz EBI]. The best mapping is provided by Kim Henrick's group at EBI as part of the [http://www.ebi.ac.uk/msd MSD] [http://www.ebi.ac.uk/msd-srv/docs/sifts/ SIFTS initiative].
* [http://www.rcsb.org/pdb/pdb_news2003.html#ec_query PDB provide a mapping on their beta site], but it is at the whole-PDB level, not the chain level.
* [http://www.brenda.uni-koeln.de/index.php4?page=/php/search_result.php4?a=23 Search at BRENDA] enzyme database portal.
* [http://www.bioinf.org.uk/pdbsprotec/ PDBSProtEC]
Molecular graphic visualisation tools
* PyMOL - [http://pymol.sourceforge.net/ PyMOL Home Page]
* Sirius - [http://sirius.sdsc.edu Sirius Home Page]
* STING - [http://www.cbi.cnptia.embrapa.br/SMS STING Home Page]
* RasMol - [http://www.openrasmol.org/ RasMol Home Page]
* [http://garlic.mefos.hr/garlic/ Garlic]
* [http://www.expasy.ch/spdbv/mainpage.html Swiss-PDB Viewer]
* Jmol Viewer - [http://www.jmol.org/ Jmol Home Page] Open Source, Java based interactive molecular viewer
* QuteMol - [http://qutemol.sourceforge.net QuteMol Home Page] Open Source, Win & Mac, high quality interactive molecular viewer
* [http://web.mit.edu/star/biochem StarBiochem] Java based interactive molecular viewer with integrated search of protein databank
* [http://www.webreader.net/animations.htm S-Animations] PC based, easy to use PDB Viewer with gravity and charge simulator
Wikimedia Foundation. 2010.