Microsatellite (genetics)

Microsatellites, also known as simple sequence repeats (SSRs) or short tandem repeats (STRs), are tandemly repeated DNA sequences whose repeat unit is 2–6 base pairs long.[1]

Microsatellites are typically co-dominant. They are used as molecular markers in genetics, for kinship, population and other studies. They can also be used to study gene duplication or deletion. Microsatellites are also known to be causative agents in human disease, especially neurodegenerative disorders and cancer.



One common example of a microsatellite is a (CA)n repeat, where n varies between alleles. These markers often show high levels of inter- and intra-specific polymorphism, particularly when the number of repetitions is 10 or greater.[2] The repeated sequence is often simple, consisting of two, three or four nucleotides (di-, tri-, and tetranucleotide repeats respectively), and can be repeated 3 to 100 times, with the longer loci generally having more alleles due to the greater potential for slippage (see below).[3] CA dinucleotide repeats are very frequent in human and other genomes, and are present every few thousand base pairs. As there are often many alleles present at a microsatellite locus, genotypes within pedigrees are often fully informative, in that the progenitor of a particular allele can often be identified. In this way, microsatellites are ideal for determining paternity, population genetic studies and recombination mapping. Microsatellites are also the only molecular markers that provide clues about which alleles are more closely related.[4] They are also predictors of SNP density and human–chimpanzee divergence, both of which differ from the genome-wide average in regions extending thousands of nucleotides around a microsatellite.[5]
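
The pedigree reasoning above can be sketched in a few lines of Python. This is an illustrative sketch only: the locus names and repeat counts are invented, genotypes are written as pairs of repeat counts, and the check ignores mutation and null alleles, both of which matter in real parentage analysis.

```python
def compatible_with_parents(child, mother, father):
    """Return True if the child's two alleles can be split between the parents.

    Each genotype is a pair of repeat counts, e.g. (12, 15) for (CA)12/(CA)15.
    """
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


def excluded_loci(child, mother, candidate):
    """List the loci at which the candidate father cannot have contributed an allele."""
    return [
        locus
        for locus in child
        if not compatible_with_parents(child[locus], mother[locus], candidate[locus])
    ]


# Hypothetical repeat counts at three invented loci.
child = {"locusA": (12, 15), "locusB": (9, 9), "locusC": (20, 23)}
mother = {"locusA": (12, 14), "locusB": (9, 11), "locusC": (18, 20)}
candidate_1 = {"locusA": (15, 17), "locusB": (9, 10), "locusC": (23, 25)}
candidate_2 = {"locusA": (13, 16), "locusB": (9, 12), "locusC": (21, 23)}

print(excluded_loci(child, mother, candidate_1))  # []
print(excluded_loci(child, mother, candidate_2))  # ['locusA']
```

The more alleles a locus has, the less likely it is that an unrelated candidate shares the required allele by chance, which is why highly polymorphic loci are the most informative.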

The variability of microsatellites is due to a higher rate of mutation compared to other neutral regions of DNA. These high mutation rates are most often explained by slipped strand mispairing (slippage) during DNA replication on a single DNA strand. Mutations may also arise through recombination during meiosis.[6] Some slippage errors are rectified by proofreading mechanisms within the nucleus, but some mutations escape repair. The size of the repeat unit, the number of repeats, the presence of variant repeats and the frequency of transcription in the area of the DNA repeat all influence the mutation rate. Interruption of microsatellites, perhaps due to mutation, can result in reduced polymorphism. The same slippage mechanism can also occasionally lead to incorrect amplification of microsatellites: if slippage occurs early on during PCR, microsatellites of incorrect lengths can be amplified.

Analysis of microsatellites


Microsatellites can be amplified for identification by the polymerase chain reaction (PCR), using the unique sequences of flanking regions as primers. DNA is repeatedly denatured at a high temperature to separate the double strand, then cooled to allow annealing of primers and the extension of nucleotide sequences through the microsatellite. This process produces enough DNA to be visible on agarose or polyacrylamide gels; only small amounts of DNA are needed for amplification because thermocycling creates an exponential increase in the replicated segment.[7] With the abundance of PCR technology, primers that flank microsatellite loci are simple and quick to use, but the development of correctly functioning primers is often a tedious and costly process.
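
The exponential increase can be made concrete with a small calculation. The sketch below assumes an idealized model in which a fixed fraction of templates is copied in every cycle; the function name and the efficiency parameter are illustrative, and real reactions plateau as reagents run out.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Expected copy number after a given number of thermal cycles.

    efficiency is the fraction of templates copied per cycle
    (1.0 means perfect doubling); it is an illustrative parameter.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Ten template molecules, 30 cycles, perfect doubling: 10 * 2**30 copies.
print(pcr_copies(10, 30))  # 10737418240.0

# The same reaction at 90% per-cycle efficiency yields noticeably fewer copies.
print(pcr_copies(10, 30, efficiency=0.9))
```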

Figure caption: DNA samples from specimens of Littorina plena, amplified using the polymerase chain reaction with primers targeting a variable simple sequence repeat (SSR, a.k.a. microsatellite) locus. The samples were run on a 5% polyacrylamide gel and visualized using silver staining.

Creation of microsatellite primers

If searching for microsatellite markers in specific regions of a genome, for example within a particular exon of a gene, primers can be designed manually. This involves searching the genomic DNA sequence for microsatellite repeats, which can be done by eye or by using automated tools such as RepeatMasker. Once the potentially useful microsatellites are determined (removing non-useful ones such as those with random inserts within the repeat region), the flanking sequences can be used to design oligonucleotide primers which will amplify the specific microsatellite repeat in a PCR.
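
As a rough illustration of an automated search, the following Python sketch scans a sequence for perfect tandem repeats with a regular expression. It is not how RepeatMasker works; the function name, the thresholds and the test sequence are assumptions chosen for the example, and interrupted repeats are deliberately not detected.

```python
import re


def find_microsatellites(sequence, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    start is a 0-based position. Only uninterrupted repeats are reported.
    """
    seq = sequence.upper()
    # Lazy unit match tries the shortest motif first; the backreference
    # then requires at least (min_repeats - 1) further copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Invented sequence: a (CA)12 repeat and a (GATA)6 repeat with short flanks.
example = "GGATCC" + "CA" * 12 + "TTGACC" + "GATA" * 6 + "ACGT"
print(find_microsatellites(example))  # [(6, 'CA', 12), (36, 'GATA', 6)]
```

The reported start position and repeat count give the boundaries of each repeat, so the sequence on either side can be passed to a primer-design step.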

Random microsatellite primers can be developed by cloning random segments of DNA from the focal species. These random segments are inserted into a plasmid or bacteriophage vector, which is in turn introduced into Escherichia coli bacteria. Colonies are then grown and screened with fluorescently labelled oligonucleotide sequences that will hybridize to a microsatellite repeat, if present on the DNA segment. If positive clones are obtained from this procedure, the DNA is sequenced and PCR primers are chosen from sequences flanking such regions to determine a specific locus. This process involves significant trial and error on the part of researchers, as microsatellite repeat sequences must be predicted and primers that are randomly isolated may not display significant polymorphism.[2][8] Microsatellite loci are widely distributed throughout the genome and can be isolated from semi-degraded DNA of older specimens, as all that is needed is a suitable substrate for amplification through PCR.

More recent techniques involve using oligonucleotide sequences consisting of repeats complementary to repeats in the microsatellite to "enrich" the DNA extracted (Microsatellite enrichment). The oligonucleotide probe hybridizes with the repeat in the microsatellite, and the probe/microsatellite complex is then pulled out of solution. The enriched DNA is then cloned as normal, but the proportion of successes will now be much higher, drastically reducing the time required to develop the regions for use. However, which probes to use can be a trial and error process in itself.[9]


ISSR (for inter-simple sequence repeat) is a general term for a genome region between microsatellite loci. The complementary sequences to two neighboring microsatellites are used as PCR primers; the variable region between them gets amplified. The limited length of amplification cycles during PCR prevents excessive replication of overly long contiguous DNA sequences, so the result is a mix of amplified DNA strands which are generally short but vary considerably in length.

Sequences amplified by ISSR-PCR can be used for DNA fingerprinting. Since an ISSR may be a conserved or nonconserved region, this technique is not useful for distinguishing individuals, but rather for phylogeographic analyses or possibly for delimiting species; sequence diversity is lower than in SSR-PCR, but still higher than in actual gene sequences. In addition, microsatellite sequencing and ISSR sequencing assist each other, as each produces primers for the other.

Global microsatellite content with microarrays

Using a CGH-style array manufactured by NimbleGen/Roche, the entire microsatellite content of a genome can be measured quickly, inexpensively and en masse. This approach does not evaluate the genotype of any particular locus, but instead sums the contributions for a given repeated motif from the many positions in which that motif exists across the genome. The array evaluates all 1- to 6-mer repeats (and their cyclic permutations and complement). This approach has been used to place species, sequenced or not, onto a taxonomic tree, and that tree matched precisely the currently accepted phylogenetic relationships. With this new platform technology it is possible to study genomic variation within an individual for the most variable genomic features: microsatellites.
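
Grouping motifs by cyclic permutation and complement means that, for example, (CA)n, (AC)n, (TG)n and (GT)n all describe the same double-stranded repeat. The Python sketch below picks one representative per group. It is only an illustration of the idea, not the array's actual design: the function names are invented, and reading "complement" as the reverse complement of the motif is an assumption.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Return the motif as read on the opposite strand."""
    return motif.translate(COMPLEMENT)[::-1]


def rotations(motif):
    """Return every cyclic permutation of the motif."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def canonical_motif(motif):
    """Pick the alphabetically first form among all rotations on both strands."""
    motif = motif.upper()
    return min(rotations(motif) + rotations(reverse_complement(motif)))


for m in ("CA", "AC", "TG", "GT"):
    print(m, "->", canonical_motif(m))  # every line ends in AC

print(canonical_motif("AAAG"))  # AAAG
print(canonical_motif("CTTT"))  # AAAG
```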

Using this global microsatellite content array approach, studies indicate that there are major new genomic destabilization mechanisms that globally modify microsatellites, thus potentially altering very large numbers of genes. These global-scale variations in both the tumor and germline patient samples may have important roles in the cancer process, and may be of value in diagnosis, prognosis and therapy judgments. The array revealed that, for the cancers studied, especially breast cancer, there were elevated amounts of AT-rich motifs. Pursuit of these AT-rich motifs identified an AAAG motif that was variable in the region immediately upstream of the start site of the estrogen-related receptor gamma gene, a gene that had previously been implicated in breast cancer and tamoxifen resistance. This locus was found to be a promoter for the gene. A long allele was found to be approximately 3 times more prevalent in breast cancer patients (germline) than in cancer-free patients (p<0.01) and thus may be a risk marker.


Microsatellites have proved to be versatile molecular markers, particularly for population analysis, but they are not without limitations. Microsatellites developed for particular species can often be applied to closely related species, but the percentage of loci that successfully amplify may decrease with increasing genetic distance.[8] Point mutations in the primer annealing sites in such species may lead to the occurrence of ‘null alleles’, where microsatellites fail to amplify in PCR assays.[8][10] Null alleles can be attributed to several phenomena. Sequence divergence in flanking regions can lead to poor primer annealing, especially at the 3′ end, where extension commences. Preferential amplification of particular size alleles, due to the competitive nature of PCR, can lead to heterozygous individuals being scored as homozygous (partial null). PCR failure may result when particular loci fail to amplify, whereas others amplify more efficiently and may appear homozygous on a gel assay when they are in reality heterozygous in the genome. Null alleles complicate the interpretation of microsatellite allele frequencies and thus make estimates of relatedness faulty. Furthermore, stochastic effects of sampling that occur during mating may change allele frequencies in a way that is very similar to the effect of null alleles: an excessive frequency of homozygotes, causing deviations from Hardy–Weinberg equilibrium expectations. Since null alleles are a technical problem and sampling effects that occur during mating are a real biological property of a population, it is often very important to distinguish between them if excess homozygotes are observed.
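
An excess of homozygotes is usually detected by comparing the observed proportion of heterozygotes with the proportion expected under Hardy–Weinberg equilibrium. The Python sketch below computes both for one locus; the function names and genotype data are invented for illustration. Note that the statistic only flags a deficit: it cannot by itself tell null alleles apart from a genuine biological cause.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Estimate allele frequencies from a list of diploid genotypes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def observed_heterozygosity(genotypes):
    """Fraction of individuals scored with two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def expected_heterozygosity(genotypes):
    """Hardy-Weinberg expectation: 1 minus the sum of squared allele frequencies."""
    freqs = allele_frequencies(genotypes)
    return 1.0 - sum(p * p for p in freqs.values())


def heterozygote_deficit(genotypes):
    """Return 1 - Ho/He; positive values mean fewer heterozygotes than expected."""
    expected = expected_heterozygosity(genotypes)
    if expected == 0:
        return 0.0  # monomorphic locus: no heterozygotes are expected
    return 1.0 - observed_heterozygosity(genotypes) / expected


# Hypothetical sample: genotypes are pairs of repeat counts at one locus.
genotypes = [(10, 10)] * 4 + [(10, 12)] * 2 + [(12, 12)] * 4
print(observed_heterozygosity(genotypes))  # 0.2
print(expected_heterozygosity(genotypes))  # 0.5
print(round(heterozygote_deficit(genotypes), 3))  # 0.6
```

A deficit that appears at one locus only is more suggestive of a technical problem such as null alleles, whereas a biological cause would be expected to affect all loci; this rule of thumb is offered as context rather than as a formal test.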

When using microsatellites to compare species, homologous loci may be easily amplified in related species, but the number of loci that amplify successfully during PCR may decrease with increased genetic distance between the species in question. Mutation in microsatellite alleles is biased in the sense that larger alleles contain more bases, and are therefore more likely to be miscopied during DNA replication. Smaller alleles also tend to increase in size, whereas larger alleles tend to decrease in size, as they may be subject to an upper size limit; the existence of this constraint has been established, but its possible values have not yet been specified. If there is a large size difference between individual alleles, then there may be increased instability during recombination at meiosis.[8] In tumour cells, where controls on replication may be damaged, microsatellites may be gained or lost at an especially high frequency during each round of mitosis. Hence a tumour cell line might show a different genetic fingerprint from that of the host tissue.

Mechanisms for change

The most common cause of length changes in short sequence repeats is replication slippage, caused by mispairing between DNA strands while they are being replicated during meiosis (Tautz 1994). Typically, slippage in each microsatellite occurs about once per 1,000 generations (Weber 1993). Slippage changes in repetitive DNA are therefore orders of magnitude more common than point mutations in other parts of the genome (Jarne 1996). Most slippage results in a change of just one repeat unit, and slippage rates vary for different repeat unit sizes and between species (Kruglyak 1998).
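
A rate of about one slippage event per 1,000 generations, each changing the length by one repeat unit, can be explored with a toy simulation. The Python sketch below is a simplified stepwise model under stated assumptions: gains and losses are taken to be equally likely and the rate is constant, neither of which is claimed by the sources cited here. The function names and default values are illustrative.

```python
import random


def expected_slippage_events(generations, rate=1e-3):
    """Mean number of slippage events at one locus over a number of generations."""
    return generations * rate


def simulate_slippage(start_repeats, generations, rate=1e-3, min_repeats=1, rng=None):
    """Follow one lineage; each slippage event adds or removes one repeat unit."""
    rng = rng or random.Random()
    repeats = start_repeats
    for _ in range(generations):
        if rng.random() < rate:
            # Assumption: expansion and contraction are equally likely.
            repeats = max(min_repeats, repeats + rng.choice((-1, 1)))
    return repeats


print(expected_slippage_events(10_000))  # about 10 events on average

# Ten independent lineages starting from a (CA)20 allele; results vary per run.
rng = random.Random(42)
print([simulate_slippage(20, 10_000, rng=rng) for _ in range(10)])
```

Because the lineages drift apart independently, a population quickly accumulates alleles of many different lengths, which is the polymorphism that makes these loci useful as markers.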

Short sequence repeats are distributed throughout the genome (King 1997). Presumably, their most probable means of expression will vary, depending on their location.

In proteins

In mammals, 20% to 40% of proteins contain repeating sequences of amino acids caused by short sequence repeats (Marcotte 1998). Most of the short sequence repeats within protein-coding portions of the genome have a repeating unit of three nucleotides, since that length will not cause frame-shift mutations (Sutherland 1995). Each trinucleotide repeating sequence is transcribed and then translated into a repeating series of the same amino acid. In yeasts, the most common repeated amino acids are glutamine, glutamic acid, asparagine, aspartic acid and serine. These repeating segments can affect the physical and chemical properties of proteins, with the potential for producing gradual and predictable changes in protein action (Hancock 2005).
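
The link between a trinucleotide repeat and a run of identical amino acids can be shown with a short translation sketch in Python. The codon table is deliberately partial (six codons from the standard genetic code, chosen for the example), and the function name is illustrative.

```python
# Partial standard codon table (DNA coding strand -> one-letter amino acid).
CODON_TABLE = {
    "CAG": "Q",  # glutamine
    "GAA": "E",  # glutamic acid
    "AAT": "N",  # asparagine
    "GAT": "D",  # aspartic acid
    "AGC": "S",  # serine
    "GCG": "A",  # alanine
}


def translate(dna):
    """Translate in frame 0; codons missing from the partial table become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


print(translate("CAG" * 8))  # QQQQQQQQ
print(translate("CAG" * 9))  # QQQQQQQQQ (one more repeat, one more glutamine)
print(translate("GCG" * 5))  # AAAAA
```

Adding or removing one repeat unit changes the protein by exactly one residue and leaves the downstream reading frame intact, which is why repeats with a three-nucleotide unit are tolerated in coding regions.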

For example, length changes in tandemly repeating regions in the Runx2 gene lead to differences in facial length in domesticated dogs (Canis familiaris), with an association between longer sequence lengths and longer faces (Fondon 2004). This association also applies to a wider range of Carnivora species (Sears 2007). Length changes in polyalanine tracts within the HoxA13 gene are linked to hand-foot-genital syndrome, a developmental disorder in humans (Utsch 2002). Length changes in other triplet repeats are linked to more than 40 neurological diseases in humans (Pearson 2005).

Evolutionary changes from replication slippage also occur in simpler organisms. For example, microsatellite length changes are common within surface membrane proteins in yeast, providing rapid evolution in cell properties (Bowen 2006). Specifically, length changes in the FLO1 gene control the level of adhesion to substrates (Verstrepen 2005). Short sequence repeats also provide rapid evolutionary change to surface proteins in pathogenic bacteria, perhaps so they can keep up with immunological changes in their hosts (Moxon 1994). This dynamic is known as the Red Queen hypothesis (Van Valen 1973). Length changes in short sequence repeats in a fungus (Neurospora crassa) control the duration of its circadian clock cycles (Michael 2007).

Gene regulation

Length changes of microsatellites within promoters and other cis-regulatory regions can also change gene expression quickly, between generations. The human genome contains many (>16,000) short sequence repeats in regulatory regions, which provide ‘tuning knobs’ on the expression of many genes (Rockman 2002). Length changes in bacterial SSRs can affect fimbriae formation in Haemophilus influenzae, by altering promoter spacing (Moxon 1994). Minisatellites are also linked to abundant variations in cis-regulatory control regions in the human genome (Rockman 2002). Microsatellites in control regions of the vasopressin 1a receptor gene in voles influence their social behavior and level of monogamy (Hammock 2005).

Within introns

Microsatellites within introns also influence phenotype, through means that are not currently understood. For example, a GAA triplet expansion in the first intron of the X25 gene appears to interfere with transcription, and causes Friedreich ataxia (Bidichandani 1998). Tandem repeats in the first intron of the asparagine synthetase gene are linked to acute lymphoblastic leukemia (Akagi 2008). A repeat polymorphism in the fourth intron of the NOS3 gene is linked to hypertension in a Tunisian population (Jemaa 2008). Reduced repeat lengths in the EGFR gene are linked with osteosarcomas (Kersting 2008).

Within transposons

Microsatellites are distributed throughout the genome (Richard 2008). Almost 50% of the human genome is contained in various types of transposable elements (also called transposons, or ‘jumping genes’), and many of them contain repetitive DNA (Scherer 2008). It is probable that short sequence repeats in those locations are also involved in the regulation of gene expression (Tomilin 2008).

References

  1. ^ Turnpenny, P. & Ellard, S. (2005). Emery's Elements of Medical Genetics, 12th. ed. Elsevier, London. 
  2. ^ a b Queller, D.C., Strassmann, J.E. & Hughes, C.R. (1993). "Microsatellites and Kinship". Trends in Ecology and Evolution 8: 285–288. doi:10.1016/0169-5347(93)90256-O. 
  3. ^ http://www.genetics.org/content/164/2/781.full
  4. ^ D. B. Goldstein, A. R. Linares, L. L. Cavalli-Sforza, and M. W. Feldman (1995). "An Evaluation of Genetic Distances for Use With Microsatellite Loci". Genetics 139 (1): 463–471. PMC 1206344. PMID 7705647. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=1206344. 
  5. ^ M.A. Varela and W. Amos (2010). "Heterogeneous distribution of SNPs in the human genome: Microsatellites as predictors of nucleotide diversity and divergence". Genomics 95: 151–159. doi:10.1016/j.ygeno.2009.12.003. PMID 20026267. 
  6. ^ Blouin, M.S., Parsons, M., Lacaille, V. & Lotz, S. (1996). "Use of microsatellite loci to classify individuals by relatedness". Molecular Ecology 5 (3): 393–401. doi:10.1111/j.1365-294X.1996.tb00329.x. PMID 8688959. 
  7. ^ Griffiths, A.J.F., Miller, J.F., Suzuki, D.T., Lewontin, R.C. & Gelbart, W.M. (1996). Introduction to Genetic Analysis, 5th Edition. W.H. Freeman, New York. 
  8. ^ a b c d Jarne, P. & Lagoda, P.J.L. (1996). "Microsatellites, from molecules to populations and back". Trends in Ecology and Evolution 11: 424–429. doi:10.1016/0169-5347(96)10049-5. 
  9. ^ Kaukinen KH, Supernault KJ, and Miller KM (2004). "Enrichment of tetranucleotide microsatellite loci from invertebrate species". Journal of Shellfish Research 23 (2): 621. 
  10. ^ Dakin, EE; Avise, JC (2004). "Microsatellite null alleles in parentage analysis". Heredity 93 (5): 504–509. doi:10.1038/sj.hdy.6800545. PMID 15292911. 
  • Bidichandani, S. I., et al. 1998. The GAA triplet-repeat expansion in Friedreich ataxia interferes with transcription and may be associated with an unusual DNA structure. Am. J. Hum. Genet. 62:111-121.
  • Bowen, S. and Wheals, A. E. 2006. Ser/Thr-rich domains are associated with genetic variation and morphogenesis in Saccharomyces cerevisiae. Yeast 23:633-640.
  • Caporale, L. H. 2003. Natural selection and the emergence of a mutation phenotype: an update of the evolutionary synthesis considering mechanisms that affect genome variation. Ann. Rev. Micro. 57:467-485.
  • Fondon, J. W. III, and Garner, H. R. 2004. Molecular origins of rapid and continuous morphological evolution. Proc. Natl. Acad. Sci. 101:18058-18063.
  • Hammock, E. A. D. and Young, L. J. 2005. Microsatellite instability generates diversity in brain and sociobehavioral traits. Science 308:1630-1634.
  • Hancock, J. M., and Simon, M. 2005. Simple sequence repeats in proteins and their significance for network evolution. Gene 345:113-118.
  • Jarne, P., and Lagoda, P. J. L. 1996. Microsatellites, from molecules to populations and back. Trends Ecol. Evol. 11:424-429.
  • Jemaa, R., et al. 2008. Association of a 27-bp repeat polymorphism in intron 4 of endothelial constitutive nitric oxide synthase gene with hypertension in a Tunisian population. Clin. Biochem. 42:852-856.
  • Kashi, Y., et al. 1997. Simple sequence repeats as a source of quantitative genetic variation. Trends Gen. 13:74-78.
  • Kersting, C., et al. 2008. Biological importance of a polymorphic CA sequence within intron I of the epidermal growth factor receptor gene (EGFR) in high grade central osteosarcomas. Gene Chrom. & Cancer 47:657-664.
  • King, D. G. 1997. Evolutionary tuning knobs. Endeavour 21:36-40.
  • Kinoshita, Y., et al. 2007. Control of FWA gene silencing in Arabidopsis thaliana by SINE-related direct repeats. Plant. J. 49:38-45.
  • Kruglyak, S., et al. 1998. Equilibrium distributions of microsatellite repeat length resulting from a balance between slippage events and point mutations. Proc. Natl. Acad. Sci. 95:10774-10778.
  • Li Y-C., et al. 2002. Microsatellites: genomic distribution, putative functions and mutational mechanisms: a review. Mol. Ecol. 11:2453-2465.
  • Li Y-C., et al. 2003. Microsatellites within genes: structure, function and evolution. Mol. Bio. Evol. 21:991-1007.
  • Marcotte, E. M., et al. 1998. A census of protein repeats. J. Mol. Biol. 293:151-160.
  • Mattick, J. S. 2003. Challenging the dogma: the hidden layer of non-protein-coding RNAs in complex organisms. Bioessays 25:930-939.
  • Meagher, T. and Vassiliadis, C. 2005. Phenotypic impacts of repetitive DNA in flowering plants. New Phyto. 168:71-80.
  • Michael, T. P., et al. 2008. Simple sequence repeats provide a substrate for phenotypic variation in the Neurospora crassa circadian clock. PLoS One 2:e795.
  • Moxon, E. R., et al. 1994. Adaptive evolution of highly mutable loci in pathogenic bacteria. Curr. Bio. 4:24-32.
  • Müller, K. J., et al. 1995. The barley Hooded mutation caused by a duplication in a homeobox gene intron. Nature 374:727-730.
  • Pearson, C. E., et al. 2005. Repeat instability: mechanisms of dynamic mutations. Nat. Rev. Gen. 6:729-742.
  • Pumpernik, D., et al. 2008. Replication slippage versus point mutation rates in short tandem repeats of the human genome. Mol. Genet. Genomics 279(1):53-61.
  • Richard, G-F., et al. 2008. Comparative genomics and molecular dynamics of DNA repeats in Eukaryotes. Micr. Mol. Bio. Rev. 72:686-727.
  • Rockman, M. V. and Wray, G. A. 2002. Abundant raw material for cis-regulatory evolution in humans. Mol. Biol. Evol. 19:1991-2004.
  • Scherer, S. 2008. A short guide to the human genome. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
  • Sears, K. E. et al. 2007. The correlated evolution of Runx2 tandem repeats, transcriptional activity, and facial length in Carnivora. Evol. & Dev. 9(6):555-565.
  • Streelman, J. T. and Kocher, T. D. 2002. Microsatellite variation associated with prolactin expression and growth of salt-challenged Tilapia. Phys. Genom. 9:1-4.
  • Sutherland, G. R., and Richards, R. I. 1995. Simple tandem DNA repeats and human genetic disease. Proc. Natl. Acad. Sci 92:3636-3641.
  • Tautz, D. and Schlötterer, C. 1994. Simple sequences. Curr. Opin. Genet. Dev. 4:832-837.
  • Tomilin, N. V. 2008. Regulation of mammalian gene expression by retroelements and non-coding tandem repeats. BioEssays 30:338-348.
  • Utsch, B., et al. 2002. A novel stable polyalanine [poly(A)] expansion in the HoxA13 gene associated with hand-foot-genital syndrome: proper function of poly(A)-harbouring transcription factors depends on a critical repeat length? Hum. Gen. 110:488-494.
  • Van Valen, L. 1973. A new evolutionary law. Evol. Theory 1:1-30.
  • Verstrepen, K. J., et al. 2005. Intragenic tandem repeats generate functional variability. Nat. Gen. 37:986-990.
  • Vinces, M. D., et al. 2009. Unstable tandem repeats in promoters confer transcriptional evolvability. Science 324:1213-1216.

Wikimedia Foundation. 2010.
