Computational genomics

Computational genomics refers to the use of computational analysis to decipher biology from genome sequences and related data,[1] including both DNA and RNA sequence as well as other "post-genomic" data (i.e. experimental data obtained with technologies that require the genome sequence, such as genomic DNA microarrays). As such, computational genomics may be regarded as a subset of bioinformatics, but with a focus on using whole genomes (rather than individual genes) to understand the principles of how the DNA of a species controls its biology at the molecular level and beyond. With the current abundance of massive biological datasets, computational studies have become one of the most important means of biological discovery.[2]

History

The roots of computational genomics are shared with those of bioinformatics. During the 1960s, Margaret Dayhoff and others at the National Biomedical Research Foundation assembled databases of homologous protein sequences for evolutionary study.[3] From the underlying amino acid sequences, they built phylogenetic trees to infer the evolutionary changes required for one protein to change into another. This led them to create a scoring matrix that assesses the likelihood of one protein being related to another.

Beginning in the 1980s, databases of genome sequences began to be recorded, which presented new challenges in searching and comparing the stored gene information. Unlike the text-searching algorithms used on websites such as Google or Wikipedia, searching for regions of genetic similarity requires finding strings that are not simply identical, but similar. The Needleman-Wunsch algorithm, a dynamic programming algorithm first published in 1970, addresses this problem by computing an optimal alignment of two sequences; for amino acid sequences it can be used with scoring matrices derived from the earlier research by Dayhoff. Later, the BLAST algorithm was developed for performing fast, heuristic searches of gene sequence databases. BLAST and its derivatives are probably the most widely used algorithms for this purpose.[4]
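The dynamic programming idea behind Needleman-Wunsch can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name and the simple match, mismatch and gap scores are assumptions chosen for the example, and a real protein comparison would replace them with a substitution matrix such as those derived from Dayhoff's work.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b).

    The scoring parameters are illustrative assumptions, not values
    taken from any published substitution matrix.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap          # b[:j] aligned against gaps only

    # Fill the table: each cell is the best of substitution, deletion, insertion.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # a[i-1] against a gap
                score[i][j - 1] + gap,       # b[j-1] against a gap
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))  # (2, 'ACGT', 'A-GT')
```

The table has (n+1) by (m+1) cells, so time and memory both grow with the product of the two sequence lengths. That cost is one reason heuristic tools such as BLAST are preferred for searching large databases.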

The emergence of the phrase "computational genomics" coincides with the availability of complete sequenced genomes in the mid-to-late 1990s. The first meeting of the Annual Conference on Computational Genomics was organized by scientists from The Institute for Genomic Research (TIGR) in 1998, providing a forum for this speciality and effectively distinguishing this area of science from the more general fields of genomics or computational biology.[5][6] The first use of this term in scientific literature, according to MEDLINE abstracts, was just one year earlier in Nucleic Acids Research.[7] The final Computational Genomics conference was held in 2006, featuring a keynote talk by Nobel Laureate Barry Marshall, co-discoverer of the link between Helicobacter pylori and stomach ulcers. As of 2010, the leading conferences in the field include Intelligent Systems for Molecular Biology (ISMB), RECOMB, and the Cold Spring Harbor Laboratory and Sanger Institute's meetings titled "Biology of Genomes" and "Genome Informatics".

The development of computer-assisted mathematics (using products such as Mathematica or Matlab) has helped engineers, mathematicians and computer scientists to start working in this domain, and a public collection of case studies and demonstrations is growing, ranging from whole-genome comparisons to gene expression analysis.[8] This has brought in ideas from other fields, including concepts from systems and control, information theory, string analysis and data mining. Computational approaches are expected to become and remain a standard topic for research and teaching, as the many courses created in the past few years begin to train students fluent in both biology and computation.

Contributions of computational genomics research to biology

Contributions of computational genomics research to biology include:[2]

  • discovering subtle patterns in genomic sequences
  • proposing cellular signalling networks
  • proposing mechanisms of genome evolution
  • predicting precise locations of all human genes using comparative genomics techniques with several mammalian and vertebrate species
  • predicting conserved genomic regions that are related to early embryonic development
  • discovering potential links between repeated sequence motifs and tissue-specific gene expression
  • measuring regions of genomes that have undergone unusually rapid evolution

References

  1. ^ Koonin EV (2001) Computational Genomics, National Center for Biotechnology Information, National Library of Medicine, NIH (PubMed ID: 11267880)
  2. ^ a b Computational Genomics and Proteomics at MIT
  3. ^ David Mount (2000), Bioinformatics, Sequence and Genome Analysis, pp. 2-3, Cold Spring Harbor Laboratory Press, ISBN 0-87969-597-8
  4. ^ T.A. Brown (1999), Genomes, John Wiley & Sons, ISBN 0-471-31618-0
  5. ^ The 7th Annual Conference on Computational Genomics (2004)
  6. ^ The 9th Annual Conference on Computational Genomics (2006)
  7. ^ A. Wagner (1997), A computational genomics approach to the identification of gene networks, Nucleic Acids Res., Sep 15;25(18):3594-604, ISSN 0305-1048
  8. ^ Cristianini, N. and Hahn, M. Introduction to Computational Genomics, Cambridge University Press, 2006. (ISBN 9780521671910 | ISBN 0521671914)

Wikimedia Foundation. 2010.
