International HapMap Project

The International HapMap Project is an organization whose goal is to develop a haplotype map (HapMap) of the human genome, which will describe the common patterns of human genetic variation. The HapMap is expected to be a key resource for researchers to find genetic variants affecting health, disease and responses to drugs and environmental factors. The information produced by the project is made freely available to researchers around the world.

The International HapMap Project is a collaboration among researchers at academic centres, biomedical research groups and private companies in Canada, China, Japan, Nigeria, the United Kingdom, and the United States. It officially started with a meeting on October 27 to 29, 2002, and was expected to take about three years. It comprises two phases; the complete data obtained in Phase I were published on 27 October 2005. The analysis of the Phase II dataset was published in October 2007. The Phase III dataset was released in spring 2009.

Samples used

Haplotypes are generally shared between populations, but their frequency can differ widely. Four populations were selected for inclusion in the HapMap: 30 adult-and-both-parents trios from Ibadan, Nigeria (YRI), 30 trios of U.S. residents of northern and western European ancestry (CEU), 44 unrelated individuals from Tokyo, Japan (JPT) and 45 unrelated Han Chinese individuals from Beijing, China (CHB). Although, the haplotypes revealed from these populations should be useful for studying many other populations, parallel studies are currently examining the usefulness of including additional populations in the project.

All samples were collected through a community engagement process with appropriate informed consent. The community engagement process was designed to identify and attempt to respond to culturally specific concerns and give participating communities input into the informed consent and sample collection processes.

Scientific strategy

For Phase I, one common SNP was genotyped every 5,000 bases. Overall, more than one million SNPs were genotyped. The genotyping was carried out by 10 centres using five different genotyping technologies. Genotyping quality was assessed by using duplicate or related samples and by having periodic quality checks where centres had to genotype common sets of SNPs.

To obtain enough SNPs to create the Map, the Consortium had to fund a large re-sequencing project to discover millions of additional SNPs. These were submitted to the public dbSNP database. As a result, by August 2006, there were more than ten million SNPs in the database with more than 40% of them that were known to be polymorphic. By comparison, at the start of the project, fewer than three million SNPs were known and no more than 10% of them were known to be polymorphic.

During Phase II more than two million additional SNPs have been genotyped throughout the genome by the company Perlegen Sciences and 500,000 by the company Affymetrix.

Data access

All of the data generated by the project, including SNP frequencies, genotypes and haplotypes, were placed in the public domain and are available for download. This website also contains a genome browser which allows researchers to find SNPs in any region of interest, their allele frequencies and their association to nearby SNPs. A tool that can determine tag SNPs for a given region of interest is also provided. These data can also be directly accessed from the widely used Haploview program.

External links

International HapMap Project

From ISOGG Wiki

Contents

Samples used

Scientific strategy

Data access

External links