International HapMap Project
The International HapMap Project is an organization whose goal is to develop a haplotype map of the human genome (the HapMap), which will describe the common patterns of human genetic variation. The HapMap is expected to be a key resource for researchers to find genetic variants affecting health, disease and responses to drugs and environmental factors. The information produced by the project is made freely available to researchers around the world.
The International HapMap Project is a collaboration among researchers at academic centers, non-profit biomedical research groups and private companies in Canada, China, Japan, Nigeria, the United Kingdom, and the United States. It officially started with a meeting held from October 27 to 29, 2002, and was expected to take about three years. It comprises two phases: the complete Phase I data were published on October 27, 2005, and the analysis of the Phase II dataset was published in October 2007.
Background
Unlike with the rarer Mendelian diseases, combinations of different genes and the environment play a role in the development and progression of common diseases (such as diabetes, cancer, heart disease, stroke, depression or asthma), or in the individual response to pharmacological agents. To find the genetic factors involved in these diseases, one could in principle obtain the complete genetic sequence of several individuals, some with the disease and some without, and then search for differences between the two sets of genomes. This approach is currently infeasible because of the cost of full genome sequencing. The HapMap project proposes a shortcut.
Although any two unrelated people share about 99.5% of their DNA sequence, some people may have an A at a particular site on a chromosome while others have a G instead. Such a site is known as a single nucleotide polymorphism (SNP), and each of the two possibilities is called an allele. The HapMap project focuses only on common SNPs, those where each allele occurs in at least 1% of the population.
Each person has two copies of all chromosomes, except the sex chromosomes in males. For each SNP, the combination of alleles a person has is called a genotype. Genotyping refers to uncovering what genotype a person has at a particular site. The HapMap project chose a sample of 269 individuals and selected several million well-defined SNPs, genotyped the individuals for these SNPs, and published the results.
The alleles of nearby SNPs on a single chromosome are correlated. This means that if the allele of one SNP for a given individual is known, the alleles of nearby SNPs can often be predicted. This is because each SNP arose in evolutionary history as a single mutation, and was then passed down to descendants surrounded by other, earlier, mutations. SNPs that are separated by a large distance are typically not very well correlated, because recombination occurs in each generation, mixing the allele sequences of the two chromosomes. A sequence of consecutive alleles on a particular chromosome is known as a haplotype.
To find the genetic factors involved in a particular disease, one can proceed as follows. First, a certain region of interest in the genome is identified, possibly from earlier inheritance studies. In this region one then locates a set of "tag SNPs" from the HapMap data; these are SNPs that are very well correlated with all the other SNPs in the region, so that knowledge of the alleles of the tag SNPs in an individual will determine the individual's haplotype with high probability. Next, one determines the genotype for these tag SNPs in several individuals, some with the disease and some without. By comparing the two groups, one can then determine the likely locations and haplotypes that are involved in the disease.
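The correlation between nearby SNPs is commonly measured with the r² statistic, and one simple way to pick tag SNPs is a greedy search over those correlations. The Python sketch below illustrates both on toy data. It is not the method used by the HapMap tools: the function names, the representation of phased haplotypes as strings, the 0.8 threshold and the six example haplotypes are all assumptions made for illustration.

```python
def r_squared(haplotypes, i, j):
    """r^2 between biallelic SNPs i and j, computed from phased haplotypes.

    Each haplotype is a string with one allele character per SNP.
    The allele seen on the first haplotype is used as the reference allele.
    """
    n = len(haplotypes)
    ref_i, ref_j = haplotypes[0][i], haplotypes[0][j]
    p_i = sum(h[i] == ref_i for h in haplotypes) / n
    p_j = sum(h[j] == ref_j for h in haplotypes) / n
    p_ij = sum(h[i] == ref_i and h[j] == ref_j for h in haplotypes) / n
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        return 0.0  # at least one SNP does not vary in this sample
    d = p_ij - p_i * p_j  # linkage disequilibrium coefficient D
    return d * d / denom


def greedy_tag_snps(haplotypes, threshold=0.8):
    """Greedily choose SNP indices so that every SNP is either chosen
    or has r^2 >= threshold with a chosen SNP (threshold is an assumption)."""
    m = len(haplotypes[0])
    covers = {
        i: {j for j in range(m)
            if i == j or r_squared(haplotypes, i, j) >= threshold}
        for i in range(m)
    }
    untagged = set(range(m))
    tags = []
    while untagged:
        # Pick the SNP that covers the most still-untagged SNPs
        # (ties go to the lowest index).
        best = max(range(m), key=lambda i: len(covers[i] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Hypothetical toy data: six chromosomes, four SNPs each.
haplotypes = ["AGCT", "AGCC", "CATT", "CATC", "AGCT", "CATC"]

print(r_squared(haplotypes, 0, 1))   # 1.0
print(greedy_tag_snps(haplotypes))   # [0, 3]
```

In this toy sample SNPs 0, 1 and 2 always travel together, so typing SNP 0 alone predicts the other two, while SNP 3 is only weakly correlated with them (r² = 1/9) and has to be typed itself.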
Samples used
Haplotypes are generally shared between populations, but their frequency can differ widely. Four populations were selected for inclusion in the HapMap: 30 adult-and-both-parents trios of Yoruba people from Ibadan, Nigeria (YRI), 30 trios of U.S. residents of northern and western European ancestry (CEU), 44 unrelated individuals from Tokyo, Japan (JPT) and 45 unrelated Han Chinese individuals from Beijing, China (CHB). Although the haplotypes revealed from these populations should be useful for studying many other populations, parallel studies are currently examining the usefulness of including additional populations in the project.
All samples were collected through a community engagement process with appropriate informed consent. The community engagement process was designed to identify and attempt to respond to culturally specific concerns and give participating communities input into the informed consent and sample collection processes.
Scientific strategy
In Phase I, one common SNP was genotyped every 5,000 bases. Overall, more than one million SNPs were genotyped. The genotyping was carried out by 10 centres using five different genotyping technologies. Genotyping quality was assessed by using duplicate or related samples and through periodic quality checks in which centres had to genotype common sets of SNPs.
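Quality checks based on duplicate and related samples can be pictured roughly as follows. This Python sketch is only one plausible reading of such checks, not the consortium's actual procedure: the function names, the two-letter genotype strings, the use of `None` for a missing call and the example calls are all hypothetical, and the inheritance check applies to autosomal SNPs only.

```python
def concordance(calls_a, calls_b):
    """Fraction of SNPs where two runs of the same sample agree.

    Genotypes are two-letter strings such as "AG"; allele order is ignored.
    SNPs where either call is missing (None) are skipped.
    """
    compared = [(a, b) for a, b in zip(calls_a, calls_b)
                if a is not None and b is not None]
    if not compared:
        return 0.0
    agree = sum(sorted(a) == sorted(b) for a, b in compared)
    return agree / len(compared)


def mendel_consistent(father, mother, child):
    """True if the child's genotype can be built from one allele of each parent."""
    c1, c2 = child
    return (c1 in father and c2 in mother) or (c2 in father and c1 in mother)


# Hypothetical duplicate genotyping runs of one sample at five SNPs.
run_1 = ["AA", "AG", "GG", None, "CT"]
run_2 = ["AA", "GA", "GG", "CC", "CC"]
print(concordance(run_1, run_2))            # 0.75

# Hypothetical trio genotypes at a single SNP.
print(mendel_consistent("AA", "GG", "AG"))  # True
print(mendel_consistent("AA", "AA", "AG"))  # False
```

A low concordance between duplicates, or a trio genotype that violates inheritance, points to a likely genotyping error at that SNP.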
The Canadian team was led by Thomas J. Hudson at McGill University in Montreal and focused on chromosomes 2 and 4p. The Chinese team was led by Huanming Yang with centres in Beijing, Shanghai and Hong Kong and focused on chromosomes 3, 8p and 21. The Japanese team was led by Yusuke Nakamura at the University of Tokyo and focused on chromosomes 5, 11, 14, 15, 16, 17 and 19. The British team was led by David R. Bentley at the Sanger Institute and focused on chromosomes 1, 6, 10, 13 and 20. There were four American genotyping centres: a team led by Mark Chee and Arnold Oliphant located at Illumina Inc. in San Diego (chromosomes 8q, 9, 18q, 22 and X), a team led by David Altshuler at the Broad Institute in Cambridge, USA (chromosomes 4q, 7q, 18p, Y and the mitochondrion), a team led by Richard A. Gibbs at the Baylor College of Medicine in Houston (chromosome 12) and a team led by Pui-Yan Kwok at the University of California, San Francisco (chromosome 7p).
To obtain enough SNPs to create the Map, the Consortium had to fund a large re-sequencing project to discover millions of additional SNPs. These were submitted to the public dbSNP database. As a result, by August 2006 there were more than ten million SNPs in the database, more than 40% of which were known to be polymorphic. By comparison, at the start of the project fewer than 3 million SNPs were known, and no more than 10% of them were known to be polymorphic.
During Phase II, more than two million additional SNPs were genotyped throughout the genome by the company Perlegen Sciences and 500,000 by the company Affymetrix.
Data access
All of the data generated by the project, including SNP frequencies, genotypes and haplotypes, were placed in the public domain and can be downloaded from http://www.hapmap.org/. This website also contains a genome browser that allows users to find SNPs in any region of interest, together with their allele frequencies and their association with nearby SNPs. A tool that can determine tag SNPs for a given region of interest is also provided.
Related publications
* International HapMap Consortium. (2003) [http://www.hapmap.org/downloads/nature02168.pdf The International HapMap Project.] "Nature" 426(6968):789-96.
* International HapMap Consortium. (2004) [http://www.hapmap.org/downloads/HapMapEthics.pdf Integrating ethics and science in the International HapMap Project.] "Nat Rev Genet." 5(6):467-75.
* International HapMap Consortium. (2005) [http://www.hapmap.org/downloads/presentations/Nature_HapMap_phaseI.pdf A haplotype map of the human genome.] "Nature" 437(7063):1299-320.
* International HapMap Consortium. (2007) [http://hapmap.org/downloads/presentations/nature_hapmap3.pdf A second generation human haplotype map of over 3.1 million SNPs.] "Nature" 449(7164):851-861.
* Deloukas P, Bentley D. (2004) [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=14676823 The HapMap project and its application to genetic studies of drug response.] "Pharmacogenomics J." 4(2):88-90.
* Secko D. (2005) [http://www.the-scientist.com/news/20051026/01 Phase I of the HapMap Complete.] "The Scientist" (October 2005).
* Thorisson GA, Smith AV, Krishnan L, Stein LD. (2005) [http://www.hapmap.org/downloads/presentations/Genome_research_art10.pdf The International HapMap Project Web site.] "Genome Res." 15(11):1592-3.
* Terwilliger JD, Hiekkalinna T. (2006) [http://linkage.cpmc.columbia.edu/2006/EJHG_Terwilliger_2006.pdf An utter refutation of the 'Fundamental Theorem of the HapMap'.] "European Journal of Human Genetics" 14:426–437.
See also
* Genealogical DNA test
* Tag SNP
* The 1000 Genomes Project
External links
* [http://www.hapmap.org/ International HapMap Project (HapMap Homepage)]
* [http://www.genome.gov/10001688 National Human Genome Research Institute (NHGRI) HapMap Page]
* [http://www.cshprotocols.org/cgi/content/full/2008/8/pdb.prot5023 Browsing HapMap Data Using the Genome Browser]
Wikimedia Foundation. 2010.