Statistical coupling analysis

Statistical coupling analysis or SCA is a technique used in bioinformatics to measure covariation between pairs of amino acids in a protein multiple sequence alignment (MSA). More specifically, it quantifies how much the amino acid distribution at some position "i" changes upon a perturbation of the amino acid distribution at another position "j". The resulting statistical coupling energy indicates the degree of evolutionary dependence between the residues, with higher coupling energy corresponding to increased dependence. [cite web | url=http://www.hhmi.swmed.edu/Labs/rr/SCA.html | title=Supplementary Material for 'Evolutionarily conserved networks of residues mediate allosteric communication in proteins.']

Definition of statistical coupling energy

Statistical coupling energy measures how a perturbation of the amino acid distribution at one site in an MSA affects the amino acid distribution at another site. For example, consider a multiple sequence alignment with sites (or columns) "a" through "z", where each site has some distribution of amino acids. At position "i", 60% of the sequences have a valine and the remaining 40% have a leucine; at position "j", the distribution is 40% isoleucine, 40% histidine and 20% methionine; position "k" has an average distribution (the 20 amino acids are present at roughly the same frequencies seen in all proteins); and position "l" has 80% histidine and 20% valine. Since positions "i", "j" and "l" have amino acid distributions that differ from the mean distribution observed in all proteins, they are said to have some degree of conservation.

In statistical coupling analysis, the conservation ($\Delta G^{stat}$) at each site ("i") is defined as: $\Delta G_i^{stat} = \sqrt{\sum_x (\ln P_i^x)^2}$. [cite journal | title=A perturbation-based method for calculating explicit likelihood of evolutionary co-variance in multiple sequence alignments | author=Dekker et al | journal=Bioinformatics | volume=20 | issue=10 | pages=1565-1572 | year=2004 | url=http://bioinformatics.oxfordjournals.org/cgi/reprint/20/10/1565 | doi=10.1093/bioinformatics/bth128]

Here, $P_i^x$ describes the probability of finding amino acid "x" at position "i", and is defined by a function in binomial form as follows:

$$P_i^x = \frac{N!}{n_x!(N - n_x)!} p_x^{n_x} (1 - p_x)^{N - n_x},$$
where $N$ is 100, $n_x$ is the percentage of sequences with residue "x" (e.g. methionine) at position "i", and $p_x$ corresponds to the approximate distribution of amino acid "x" in all positions among all sequenced proteins. The summation runs over all 20 amino acids. After $\Delta G_i^{stat}$ is computed, the conservation for position "i" in a subalignment produced after a perturbation of amino acid distribution at "j" ($\Delta G_{i | \delta j}^{stat}$) is taken. Statistical coupling energy, denoted $\Delta\Delta G_{i, j}^{stat}$, is simply the difference between these two values. That is:
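The conservation formula can be sketched in Python as follows. This is a minimal illustration, not the published implementation. The uniform background frequencies (`BACKGROUND`) are a placeholder assumption standing in for the real amino acid frequencies observed across all proteins, and the function names are hypothetical. The binomial probability is evaluated in log space to avoid numerical underflow.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: a uniform background (5% per residue) used only for illustration.
# A real analysis would use frequencies measured over all sequenced proteins.
BACKGROUND = {aa: 1 / 20 for aa in AMINO_ACIDS}


def log_binomial_prob(n_x, p_x, N=100):
    """Return ln P for the binomial form N!/(n_x!(N-n_x)!) p_x^n_x (1-p_x)^(N-n_x)."""
    log_comb = (math.lgamma(N + 1)
                - math.lgamma(n_x + 1)
                - math.lgamma(N - n_x + 1))
    return log_comb + n_x * math.log(p_x) + (N - n_x) * math.log(1 - p_x)


def conservation(column, background=BACKGROUND, N=100):
    """Return sqrt(sum_x (ln P_i^x)^2) for one alignment column.

    `column` is a sequence of one-letter residue codes, one per aligned sequence.
    """
    counts = Counter(column)
    total = len(column)
    sum_sq = 0.0
    for aa, p_x in background.items():
        # n_x is the percentage of sequences carrying residue aa, scaled to N = 100.
        n_x = round(N * counts.get(aa, 0) / total)
        sum_sq += log_binomial_prob(n_x, p_x, N) ** 2
    return math.sqrt(sum_sq)


# A fully conserved column scores higher than one matching the background.
conserved = conservation("V" * 10)
average = conservation(AMINO_ACIDS)
print(conserved > average)
```

Working with $\ln P_i^x$ directly is a design choice: for a strongly conserved site, $p_x^{n_x}$ becomes extremely small, and summing logarithms is more stable than taking the logarithm of the product.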

$$\Delta\Delta G_{i, j}^{stat} = \Delta G_{i | \delta j}^{stat} - \Delta G_i^{stat},$$ or, more commonly, $$\Delta\Delta G_{i, j}^{stat} = \sqrt{\sum_x (\ln P_{i | \delta j}^x - \ln P_i^x)^2}$$
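As a rough sketch, the second (more common) form of the coupling energy can be computed from the residue percentages at site "i" in the full alignment and in the perturbed subalignment. The function names, the dictionary-based input format and the uniform background frequencies are assumptions made for this illustration, and the percentages used are illustrative values only.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: uniform background frequencies, used only for illustration.
BACKGROUND = {aa: 1 / 20 for aa in AMINO_ACIDS}


def log_binomial_prob(n_x, p_x, N=100):
    """Return ln of the binomial probability of n_x occurrences out of N."""
    log_comb = (math.lgamma(N + 1)
                - math.lgamma(n_x + 1)
                - math.lgamma(N - n_x + 1))
    return log_comb + n_x * math.log(p_x) + (N - n_x) * math.log(1 - p_x)


def coupling_energy(full_pct, sub_pct, background=BACKGROUND, N=100):
    """Return sqrt(sum_x (ln P_{i|dj}^x - ln P_i^x)^2).

    full_pct and sub_pct map residue -> percentage (0..100) at site i in the
    full alignment and in the subalignment; missing residues count as 0%.
    """
    sum_sq = 0.0
    for aa, p_x in background.items():
        ln_full = log_binomial_prob(full_pct.get(aa, 0), p_x, N)
        ln_sub = log_binomial_prob(sub_pct.get(aa, 0), p_x, N)
        sum_sq += (ln_sub - ln_full) ** 2
    return math.sqrt(sum_sq)


# Illustrative percentages: a site whose distribution shifts under the
# perturbation, and a site whose distribution stays the same.
shifted = coupling_energy({"V": 60, "L": 40}, {"V": 90, "L": 10})
unchanged = coupling_energy({"H": 80, "V": 20}, {"H": 80, "V": 20})
print(shifted > 0, unchanged == 0.0)
```

In this form the coupling energy is zero exactly when every residue's probability at "i" is unchanged by the perturbation, and it grows as the two distributions diverge.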

Statistical coupling energy is often systematically calculated between a fixed, perturbed position and all other positions in an MSA. Continuing with the example MSA from the beginning of the section, consider a perturbation at position "j" where the amino acid distribution changes from 40% I, 40% H, 20% M to 100% I. If, in the resulting subalignment, this changes the distribution at "i" from 60% V, 40% L to 90% V, 10% L, but does not change the distribution at position "l", then there would be some amount of statistical coupling energy between "i" and "j" but none between "l" and "j".
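The perturbation in this example can be reproduced on a toy alignment. The sketch below assumes the perturbation is carried out by keeping only the sequences that have a chosen residue at position "j". The 50-sequence, three-column alignment is invented so that its column frequencies match the percentages quoted above, and the helper names are hypothetical.

```python
from collections import Counter


def column_percentages(msa, pos):
    """Return residue -> percentage for one column of the alignment."""
    counts = Counter(seq[pos] for seq in msa)
    return {aa: 100 * n / len(msa) for aa, n in counts.items()}


def perturb(msa, pos, residue):
    """Return the subalignment of sequences with `residue` at column `pos`."""
    return [seq for seq in msa if seq[pos] == residue]


# Invented toy alignment with three columns, in the order (i, j, l).
msa = (
    ["VIH"] * 14 + ["VIV"] * 4 + ["LIH"] * 2    # 20 sequences with I at j
    + ["VHH"] * 8 + ["LHH"] * 8 + ["LHV"] * 4   # 20 sequences with H at j
    + ["VMH"] * 4 + ["LMH"] * 4 + ["LMV"] * 2   # 10 sequences with M at j
)
I_COL, J_COL, L_COL = 0, 1, 2

sub = perturb(msa, J_COL, "I")  # position j becomes 100% isoleucine

print(column_percentages(msa, I_COL))  # full alignment, site i
print(column_percentages(sub, I_COL))  # subalignment, site i
print(column_percentages(msa, L_COL))  # full alignment, site l
print(column_percentages(sub, L_COL))  # subalignment, site l
```

Site "i" shifts from 60% V, 40% L to 90% V, 10% L in the subalignment, while site "l" stays at 80% H, 20% V, so only the pair "i", "j" would show a non-zero coupling energy.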

Applications of SCA

Ranganathan and Lockless originally developed SCA to examine thermodynamic (energetic) coupling of residue pairs in proteins. [cite journal | title=Evolutionarily conserved pathways of energetic connectivity in protein families | author=Lockless SW, Ranganathan R | journal=Science | volume=286 | pages=295–299 | year=1999 | url=http://www.sciencemag.org/cgi/content/full/286/5438/295 | doi=10.1126/science.286.5438.295] Using the PDZ domain family, they were able to identify a small network of residues that were energetically coupled to a binding site residue. The network consisted of both residues spatially close to the binding site in the tertiary fold, called contact pairs, and more distant residues that participate in longer-range energetic interactions. Later applications of SCA by the [http://www.hhmi.swmed.edu/Labs/rr/world/people.html Ranganathan group] on the GPCR, serine protease and hemoglobin families also showed energetic coupling in sparse networks of residues that cooperate in allosteric communication. [cite journal | title=Evolutionarily conserved networks of residues mediate allosteric communication in proteins. | author=Suel et al. | journal=Nature Structural Biology | volume=10 | issue=1 | pages=59-69 | year=2003 | url=http://www.nature.com/nsmb/journal/v10/n1/full/nsb881.html | doi=10.1038/nsb881]

Statistical coupling analysis has also been used as a basis for computational protein design. In 2005, Socolich et al. [cite journal | title=Evolutionary information for specifying a protein fold | author=Socolich et al. | journal=Nature | volume=437 | pages=512-518 | year=2005 | url=http://www.nature.com/nature/journal/v437/n7058/full/nature03991.html | doi=10.1038/nature03991] used an SCA for the WW domain to create artificial proteins with similar thermodynamic stability and structure to natural WW domains. The fact that 12 out of the 43 designed proteins with the same SCA profile as natural WW domains properly folded provided strong evidence that little information (only coupling information) was required for specifying the protein fold. This support for the SCA hypothesis was made more compelling considering that a) the successfully folded proteins had only 36% average sequence identity to natural WW folds, and b) none of the artificial proteins designed without coupling information folded properly. An accompanying study showed that the artificial WW domains were functionally similar to natural WW domains in ligand binding affinity and specificity. [cite journal | title=Natural-like function in artificial WW domains | author=Russ et al. | journal=Nature | volume=437 | pages=579-583 | year=2005 | url=http://www.nature.com/nature/journal/v437/n7058/full/nature03990.html | doi=10.1038/nature03990]

In "de novo" protein structure prediction, it has been shown that, when combined with a simple residue-residue distance metric, SCA-based scoring can fairly accurately distinguish native from non-native protein folds. [cite journal | title=Using scores derived from statistical coupling analysis to distinguish correct and incorrect folds in de-novo protein structure prediction. | author=Bartlett GJ, Taylor WR | journal=Proteins | volume=71 | issue=1 | pages=950-959 | year=2008 | url=http://www3.interscience.wiley.com/cgi-bin/fulltext/116842426/HTMLSTART | doi=10.1002/prot.21779]

See also

Mutual information

External links

* [http://www.bork.embl-heidelberg.de/Modules/ww_summary.html What is a WW domain?]
* [http://esmane.physics.lsa.umich.edu/wl/external/ICSB/2005/20051021-umwlap001-02-ranganathan-movies/realaudio/f001.htm Ranganathan lecture on statistical coupling analysis (audio included)]
* [http://www.pandasthumb.org/archives/2005/10/protein-folding.html Protein folding — a step closer?] - A summary of the Ranganathan lab's SCA-based design of artificial yet functional WW domains.

Wikimedia Foundation. 2010.
