Phylogenetic profiling

Phylogenetic profiling is an important and elegant bioinformatics technique in which the joint presence or joint absence of two traits across large numbers of species is used to infer a meaningful biological connection, such as involvement of two different proteins in the same biological pathway. Along with examination of conserved synteny, conserved operon structure, or "Rosetta Stone" domain fusions, comparing phylogenetic profiles is designated a "post-homology" technique, in that the computation essential to the method begins after it is determined which proteins are homologous to which. A number of these techniques were developed by David Eisenberg and colleagues; phylogenetic profile comparison was introduced in 1999 by Pellegrini et al. [Pellegrini M, Marcotte EM, Thompson MJ, Eisenberg D, Yeates TO. Proc Natl Acad Sci U S A. 1999 Apr 13;96(8):4285-8.]


Over 500 species of Bacteria, Archaea, and Eukaryotes are now represented by complete DNA genome sequences. Typically, each gene in a genome encodes a protein that can be assigned to a particular protein family on the basis of homology. In the original formulation, the presence or absence of a given protein family in each genome is represented by 1 (present) or 0 (absent). Consequently, the phylogenetic distribution of the protein family can be represented by a long binary number with one digit for each genome; such binary representations are readily compared with each other to look for correlated phylogenetic distributions. The large number of complete genomes makes these profiles rich in information. The advantage of using only complete genomes is that the 0 values, representing the absence of a trait, tend to be reliable.
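The binary representation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the genome names, the family names, and the helper functions `build_profile` and `hamming_distance` are hypothetical, and counting mismatched digits is just one simple way to compare two profiles, not necessarily the comparison used in the original paper.

```python
# Hypothetical genome list; the order is fixed so that digit i of every
# profile always refers to the same genome.
GENOMES = ["genome_A", "genome_B", "genome_C", "genome_D", "genome_E"]

# Hypothetical data: protein family -> set of genomes in which it is found.
FAMILY_TO_GENOMES = {
    "fam1": {"genome_A", "genome_C", "genome_D"},
    "fam2": {"genome_A", "genome_C", "genome_D"},
    "fam3": {"genome_B", "genome_E"},
}


def build_profile(present_in, genomes):
    """Return a string of 1/0 digits, one per genome (1 = present, 0 = absent)."""
    return "".join("1" if genome in present_in else "0" for genome in genomes)


def hamming_distance(profile_a, profile_b):
    """Count the genomes in which exactly one of the two families is present."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    return sum(a != b for a, b in zip(profile_a, profile_b))


PROFILES = {
    family: build_profile(members, GENOMES)
    for family, members in FAMILY_TO_GENOMES.items()
}

if __name__ == "__main__":
    for family, profile in PROFILES.items():
        print(family, profile)
    print(hamming_distance(PROFILES["fam1"], PROFILES["fam2"]))
```

In this toy data set, `fam1` and `fam2` share an identical profile and so have a distance of zero, which is the kind of correlated distribution the method looks for. With real data the number of genomes would be in the hundreds, and exact matches would be less common than near matches.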


Closely related species should be expected to have very similar sets of genes. However, changes accumulate between more distantly related species by processes that include horizontal gene transfer and gene loss. Individual proteins have specific molecular functions, such as carrying out a single enzymatic reaction or serving as one subunit of a larger protein complex. A biological process such as photosynthesis, methanogenesis, or histidine biosynthesis may require the concerted action of many proteins. If some protein critical to a process is lost, other proteins dedicated to that process would become useless; natural selection makes it unlikely they will be retained over evolutionary time. Therefore, should two different protein families tend always to be either both present or both absent, a likely hypothesis is that the two proteins cooperate in some biological process.
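One way to quantify how strongly two protein families tend to be present or absent together is the mutual information between their profiles. The sketch below is illustrative only: the text does not prescribe a particular score, mutual information is swapped in here as one plausible choice, and the function name `mutual_information` and the example profiles are hypothetical.

```python
import math
from collections import Counter


def mutual_information(profile_a, profile_b):
    """Mutual information, in bits, between two equal-length 0/1 profiles."""
    if not profile_a or len(profile_a) != len(profile_b):
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(profile_a)
    joint = Counter(zip(profile_a, profile_b))  # counts of (digit_a, digit_b)
    count_a = Counter(profile_a)                # marginal counts for family A
    count_b = Counter(profile_b)                # marginal counts for family B
    score = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        # p_ab / (p_a * p_b), written with counts to limit rounding error
        ratio = (n_ab * n) / (count_a[a] * count_b[b])
        score += p_ab * math.log2(ratio)
    return score


if __name__ == "__main__":
    print(mutual_information("1100", "1100"))  # identical profiles
    print(mutual_information("1100", "1010"))  # statistically independent
```

Identical profiles with an even split of 1s and 0s score one bit, while independent profiles score zero. Note that mutual information also scores perfectly complementary profiles highly (one family present exactly where the other is absent), so a score like this would need to be combined with a check on the direction of the association before inferring that two proteins cooperate.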

Advances and Challenges

Phylogenetic profiling has led to numerous discoveries in biology, including previously unknown enzymes in metabolic pathways, transcription factors that bind to conserved regulatory sites, and explanations for the roles of certain mutations in human disease [Kensche PR, van Noort V, Dutilh BE, Huynen MA. J R Soc Interface. 2008 Feb 6;5(19):151-70.]. Improving the method is an active area of scientific research because it faces several limitations. First, co-occurrence of two protein families often reflects recent common ancestry of two species rather than a conserved functional relationship; disambiguating these two sources of correlation may require improved statistical methods. Second, proteins grouped as homologs may differ in function, or proteins conserved in function may fail to register as homologs; improved methods for tailoring the size of each protein family to reflect functional conservation should lead to improved results.


Wikimedia Foundation. 2010.
