CentiMorgan

From ISOGG Wiki

In genetic genealogy, a centiMorgan (cM) or map unit (m.u.) is a unit of recombination frequency which is used to measure genetic distance. It is often used to infer distance along a chromosome, and takes into account how often recombination occurs in a region. A region with few cMs undergoes relatively less recombination. The number of base pairs to which one centiMorgan corresponds varies widely across the genome, because different regions of a chromosome have different propensities towards crossover. On average, one centiMorgan corresponds to about 1 million base pairs in humans. One centiMorgan is equal to a 1% chance that a marker at one genetic locus on a chromosome will be separated from a marker at a second locus due to crossing over in a single generation.

The genetic genealogy testing companies 23andMe, AncestryDNA, Family Tree DNA and MyHeritage DNA use centiMorgans to denote the size of matching DNA segments in autosomal DNA tests. Matching segments that span a large number of centiMorgans are more likely to be significant and to indicate a common ancestor within a genealogical timeframe.

The centiMorgan was named in honor of geneticist Thomas Hunt Morgan by his student Alfred Henry Sturtevant. Note that the parent unit of the centiMorgan, the Morgan, is rarely used today.

23andMe and Family Tree DNA both use HapMap to infer their centiMorgans.

CentiMorgans vs megabases

CentiMorgans are interpolated numbers that take into consideration each area of a chromosome and its propensity to recombine. This means that if two cousins share 40 cM on chromosome 1, and two different cousins share 40 cM on chromosome 5, both pairs can be predicted statistically to share a similar degree of relationship. The amount of recombination per megabase, by contrast, varies from one location to another. In the same scenario, if both pairs shared 40 Mb, it would be more difficult to be sure that they are of a similar degree of relation without further accounting for location, chromosome and other factors.[1]

Ann Turner provides a useful explanation: "I think of the cM as being a unit of 'effective' distance. As an analogy, a mile is a fixed quantity (5280 feet), and so are megabases. But the probability that a person can walk a mile in 20 minutes is more fluid. If the terrain is very rough, the "effective" distance of a literal mile might be more like two miles if you're trying to arrive at a certain time. We're more interested in the probability that a segment will be passed on intact than the size of the segment in Mb".[2]

As the cM is an empirical measure, based on recombination events in a particular dataset of parents and offspring, it can vary somewhat from study to study. This set of maps for each chromosome shows that the general shape of the centiMorgan vs megabase curve is similar for two datasets, but the absolute values are not quite the same:

http://web.archive.org/web/20070113005025/http://compgen.rutgers.edu/maps/compare.pdf
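The idea that cM values are interpolated from a genetic map can be sketched as follows. This is only an illustration: the map points below are invented, the function names are hypothetical, and real maps (and the testing companies' own methods) may differ.

```python
from bisect import bisect_right

# Hypothetical genetic map for one chromosome:
# (position in base pairs, cumulative genetic distance in cM).
# These values are invented for illustration and are not real data.
GENETIC_MAP = [
    (0, 0.0),
    (1_000_000, 0.5),
    (2_000_000, 3.0),
    (3_000_000, 3.2),
    (4_000_000, 5.0),
]


def position_to_cm(bp, genetic_map=GENETIC_MAP):
    """Linearly interpolate the cumulative cM value at a base-pair position."""
    positions = [pos for pos, _ in genetic_map]
    # Clamp positions that fall outside the mapped range.
    if bp <= positions[0]:
        return genetic_map[0][1]
    if bp >= positions[-1]:
        return genetic_map[-1][1]
    # Find the two map points that bracket the position.
    i = bisect_right(positions, bp)
    (p0, c0), (p1, c1) = genetic_map[i - 1], genetic_map[i]
    return c0 + (c1 - c0) * (bp - p0) / (p1 - p0)


def segment_length_cm(start_bp, end_bp, genetic_map=GENETIC_MAP):
    """Genetic length of a segment: difference of the interpolated cM values."""
    return position_to_cm(end_bp, genetic_map) - position_to_cm(start_bp, genetic_map)


# Two segments of identical physical size (1 Mb) but different genetic length.
print(segment_length_cm(1_000_000, 2_000_000))
print(segment_length_cm(2_000_000, 3_000_000))
```

In this made-up map the first 1 Mb segment lies in a region with a lot of recombination and the second in a region with very little, so their cM lengths differ by more than a factor of ten even though their size in megabases is the same.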

cM values per chromosome

The following table compares cM values per chromosome at Family Tree DNA, GEDmatch, and 23andMe. AncestryDNA uses 3475 as the total cM according to the help screen for confidence level in a DNA match. This presumably excludes the X chromosome.

[Image: table of cM values per chromosome at Family Tree DNA, GEDmatch and 23andMe]

Probability of crossover

The following chart shows the estimated probability that a segment will be affected by a crossover. The chart does not take into account some variables such as inversions and different recombination rates for males and females.

[Image: chart of crossover probability in centiMorgans]
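As a rough illustration of how such a probability can be estimated, the sketch below assumes that crossovers along a segment follow a Poisson process with a mean of one crossover per 100 cM per generation. This is a common simplification and not necessarily the model behind the chart; like the chart, it ignores inversions and sex-specific recombination rates. The function name is hypothetical.

```python
import math


def crossover_probability(segment_cm):
    """Estimated chance that a segment is hit by at least one crossover
    in a single generation.

    Assumption: crossovers follow a Poisson process with an expected
    count of segment_cm / 100 (no crossover interference).
    """
    if segment_cm < 0:
        raise ValueError("segment length in cM must be non-negative")
    return 1.0 - math.exp(-segment_cm / 100.0)


for cm in (1, 10, 50, 100):
    print(f"{cm:>4} cM: {crossover_probability(cm):.3f}")
```

For short segments the result stays close to the definition of the unit (about a 1% chance for 1 cM), while for long segments the probability rises more slowly than the cM value because several crossovers can fall in the same segment.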

Converting centiMorgans to percentages

To obtain an approximate percentage of shared DNA from the Family Tree DNA Family Finder test, take all segments over 5 cM, add them together, and then divide by 68.

The calculation works because your total genome in the Family Finder test is 6770 cM. A half-identical match (such as a parent/child) is 3385 cM. This number must be doubled to represent both the maternal and paternal sides, giving a total of 6770 cM. Matt Dexter explains: "The reason the number is not 6770 or 6800, but rather 68, is that it saves an extra step of doing the math to convert an answer into a percentage. For example, 3385 / 6770 = .5, then in a second step, .5 times 100 = 50%. By using 68 from the start, it saves adding math steps. (3385 / 6800) * 100 is the same as 3385 / 68, which gives a result of = 50%."[3]
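The rule of thumb above can be sketched in a few lines. The function name and the example segment list are hypothetical; the 5 cM threshold and the divisor of 68 come from the text.

```python
def percent_shared(segments_cm, min_segment_cm=5.0, divisor=68.0):
    """Approximate percentage of shared DNA from a list of segment sizes in cM.

    Segments of min_segment_cm or less are ignored; the remaining total is
    divided by 68, which is shorthand for (total / 6800) * 100.
    """
    qualifying = [s for s in segments_cm if s > min_segment_cm]
    return sum(qualifying) / divisor


# Hypothetical match: the 4.3 cM segment falls below the threshold.
segments = [45.2, 30.1, 12.7, 4.3, 8.9]
print(f"{percent_shared(segments):.2f}%")

# Parent/child example from the text: 3385 cM gives approximately 50%.
print(f"{percent_shared([3385.0]):.1f}%")
```

Because 68 is a rounded divisor (6800 rather than 6770), the result is an approximation: 3385 cM comes out slightly under 50%.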

Human reference genome

The centiMorgan totals per chromosome are based on the Human Reference Genome. 23andMe and AncestryDNA use Build 37. Family Tree DNA uses Build 37 for matching but Build 36 for segment boundaries in the Chromosome Browser. Raw data files are provided in both formats. Build 37 filled in quite a few gaps, so each chromosome contains more base pairs in Build 37 than in Build 36. Consequently the cM totals per chromosome are lower for Family Finder than they are for 23andMe. GEDmatch uses Build 36, and converts AncestryDNA and 23andMe data from Build 37 to Build 36 for backward compatibility.

The latest version of the Human Reference Genome, Build 38, was released in December 2013. However, none of the companies have as yet adopted Build 38 and there is a “gentleman’s agreement” in place to stick with Build 37 for the present time.

References

  1. Matt Dexter. Megabases versus centiMorgans. Message posted on the ISOGG Group Administrators' mailing list, 21 June 2014.
  2. Ann Turner. centiMorgans vs megabases. Message posted on the ISOGG Group Administrators' mailing list, 22 June 2014.
  3. Matt Dexter. Message posted on the ISOGG DNA-NEWBIE mailing list in the thread entitled "DNA Conference - Thank You", 13 November 2013.