Autosomal DNA match thresholds
From ISOGG Wiki
Each company sets its own autosomal DNA match thresholds. These criteria must be met before the company will report that two individuals very likely inherited their half-identical matching segments from a recent common ancestor.
In general, a long consecutive run of half-identical SNP results (typically about 7 centiMorgans and 700 SNPs, depending on the test's error rate and other factors) is required before one can infer that two matching DNA segments are probably identical by descent (IBD). Additional analysis, usually based on haplotype frequency, is required to determine whether the segment is of genealogical relevance.
IBD detection is improved by phasing – assigning the alleles to the maternal and paternal chromosomes. Without phasing there is an increase in the number of false positive matches, particularly for smaller segments under 15 cMs in size. AncestryDNA is currently the only company which uses phasing.
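The effect of phasing can be illustrated with a toy example. The sketch below uses made-up four-SNP data and hypothetical helper names; it is not any company's algorithm. It shows two people who look half-identical when unphased genotypes are compared, even though none of their phased haplotypes are the same.

```python
from typing import List, Tuple

Genotype = Tuple[str, str]  # the two alleles observed at one SNP, order unknown


def to_genotypes(haplotypes: Tuple[str, str]) -> List[Genotype]:
    """Collapse two phased haplotypes into unphased per-SNP genotypes."""
    return [(x, y) for x, y in zip(*haplotypes)]


def half_identical(a: List[Genotype], b: List[Genotype]) -> bool:
    """True if the two people share at least one allele at every SNP."""
    return all(set(ga) & set(gb) for ga, gb in zip(a, b))


def share_haplotype(a_haps: Tuple[str, str], b_haps: Tuple[str, str]) -> bool:
    """True if any phased haplotype of A is identical to one of B's."""
    return any(ha == hb for ha in a_haps for hb in b_haps)


# Illustrative data only: A is heterozygous at every SNP,
# B alternates between the two homozygous genotypes.
a_haps = ("AAAA", "GGGG")
b_haps = ("AGAG", "AGAG")

print(half_identical(to_genotypes(a_haps), to_genotypes(b_haps)))  # True
print(share_haplotype(a_haps, b_haps))                             # False
```

Real segments span hundreds of SNPs, but the same mechanism (heterozygous genotypes being compatible with anything) is one reason short unphased matches are more prone to false positives.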
23andMe
For half-identical regions the autosomal threshold is 7 cMs and at least 700 SNPs for the first segment; 5 cMs and 700 SNPs for additional segments and for people you are sharing with.
The maximum amount of error tolerated in a half-IBD segment would be roughly 1 opposite homozygote per 300 SNPs; furthermore, each such opposite homozygote in a half-IBD segment must be separated by roughly 300 SNPs (i.e., the length of a sub-segment). The criteria roughly correspond to allowing an error rate of 1%.
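The spacing rule described above can be sketched as follows. This is a minimal illustration, assuming that genotypes are given as two-letter strings and that mismatch positions are SNP indices within the candidate segment; the function names and the exact handling of "roughly 300" are assumptions, not the company's implementation.

```python
from typing import Iterable


def opposite_homozygote(g1: str, g2: str) -> bool:
    """True when both genotypes are homozygous for different alleles,
    e.g. 'AA' versus 'GG'. Any heterozygous genotype is compatible."""
    return g1[0] == g1[1] and g2[0] == g2[1] and g1[0] != g2[0]


def mismatches_tolerated(mismatch_indices: Iterable[int], min_gap: int = 300) -> bool:
    """True if every pair of neighbouring opposite homozygotes is separated
    by at least min_gap SNPs (an assumed reading of 'roughly 300')."""
    idx = sorted(mismatch_indices)
    return all(later - earlier >= min_gap for earlier, later in zip(idx, idx[1:]))


print(opposite_homozygote("AA", "GG"))       # True
print(opposite_homozygote("AG", "GG"))       # False
print(mismatches_tolerated([100, 450, 900])) # True
print(mismatches_tolerated([100, 250]))      # False
```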
For fully identical regions the threshold is 5 cMs and 500 SNPs.
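The segment thresholds listed above can be written as a single check. The function and parameter names below are hypothetical, and treating the limits as inclusive is an assumption.

```python
def meets_autosomal_threshold(cm: float, snps: int,
                              first_segment: bool = True,
                              fully_identical: bool = False) -> bool:
    """Apply the thresholds quoted in the text (limits assumed inclusive)."""
    if fully_identical:
        return cm >= 5 and snps >= 500
    # Half-identical: 7 cM for the first segment, 5 cM for additional ones.
    min_cm = 7 if first_segment else 5
    return cm >= min_cm and snps >= 700


print(meets_autosomal_threshold(6.2, 900))                       # False
print(meets_autosomal_threshold(6.2, 900, first_segment=False))  # True
```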
In addition, 23andMe has set a cap of 2000 matches in DNA Relatives. The cap is still set at 1000 matches for people who have not yet been transitioned to the new 23andMe website. For most Americans this cap excludes many valid matches. You can overcome the cap to a certain extent by sending an introduction to your matches and/or inviting them to share genomes at the basic level. For people with Colonial American ancestry the cap excludes new matches below about 17 cMs. Without the cap, Colonial Americans would probably have between 3000 and 5000 matches.
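A cap on the number of matches acts as a second, person-specific threshold: the more matches someone has, the higher the smallest match that still fits in the list. The sketch below illustrates this with made-up numbers and assumes, for illustration only, that the list is ranked by total shared cM.

```python
from typing import Optional, Sequence


def effective_threshold(shared_cms: Sequence[float], cap: int = 2000) -> Optional[float]:
    """Smallest shared-cM value that still appears once the list is capped.
    Ranking by total shared cM is an assumption made for this sketch."""
    kept = sorted(shared_cms, reverse=True)[:cap]
    return kept[-1] if kept else None


# Illustrative values only, with a tiny cap so the effect is visible.
matches = [30.0, 7.0, 12.0, 9.0, 20.0]
print(effective_threshold(matches, cap=3))   # 12.0
print(effective_threshold(matches, cap=10))  # 7.0
```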
23andMe makes adjustments to downweight matches for customers with Ashkenazi ancestry.
Criteria for X-chromosome matches
For half-identical regions the thresholds are:
- X (male vs male): 200 SNPs, 1 cM
- X (male vs female): 600 SNPs, 6 cM
- X (female vs female): 1200 SNPs, 6 cM
For fully identical regions, the threshold is 5 cMs and 500 SNPs.
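The half-identical X-chromosome thresholds above depend on the sexes of the two people compared, which lends itself to a lookup table. This is a minimal sketch with hypothetical names, and the limits are assumed to be inclusive.

```python
# (minimum SNPs, minimum cM) for half-identical X segments, keyed by the
# alphabetically sorted pair of sexes.
X_HALF_IDENTICAL = {
    ("male", "male"): (200, 1.0),
    ("female", "male"): (600, 6.0),
    ("female", "female"): (1200, 6.0),
}


def x_segment_qualifies(sex_a: str, sex_b: str, snps: int, cm: float) -> bool:
    """Check one half-identical X segment against the thresholds in the text."""
    key = tuple(sorted((sex_a, sex_b)))
    min_snps, min_cm = X_HALF_IDENTICAL[key]
    return snps >= min_snps and cm >= min_cm


print(x_segment_qualifies("male", "male", 250, 1.5))      # True
print(x_segment_qualifies("male", "female", 250, 1.5))    # False
print(x_segment_qualifies("female", "female", 1300, 6.5)) # True
```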
Family Tree DNA
The Family Finder match thresholds were updated on 26 May 2016. A match is declared if two people share a segment of 9 cM or more, regardless of the number of total shared cM. However, if there’s not a block that’s 9 cM or greater, the minimum of 20 shared cM with a longest block of 7.69 cM applies.
FTDNA also altered some proprietary portions of the matching algorithm that affect block sizes and total shared cMs. The existing proprietary formula is still applied to matches for people with Ashkenazi heritage. The X-chromosome match thresholds were not affected by this change.
Prior to 26 May 2016, two people were shown as a match only if they shared a minimum of 20 cM in total with a longest block of at least 7.69 cM (for 99% of testers) or 5.5 cM (for the remaining 1%). The matching algorithms were modified with effect from 21 April 2011 to downweight matches between Ashkenazi Jews in order to provide more accurate relationship predictions.
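The published Family Finder rules, before and after the May 2016 update, can be compared side by side. This is a simplified sketch with hypothetical function names: it takes a list of shared segment lengths in cM and ignores the proprietary adjustments mentioned above, so it will not reproduce FTDNA's actual match lists.

```python
from typing import Sequence


def is_match_2016(segments_cm: Sequence[float]) -> bool:
    """Rule from 26 May 2016: one block of 9 cM or more, or else
    20 cM in total with a longest block of at least 7.69 cM."""
    if not segments_cm:
        return False
    longest = max(segments_cm)
    return longest >= 9.0 or (sum(segments_cm) >= 20.0 and longest >= 7.69)


def is_match_pre_2016(segments_cm: Sequence[float], min_longest: float = 7.69) -> bool:
    """Earlier rule: 20 cM in total and a minimum longest block
    (7.69 cM for most testers, 5.5 cM for the rest)."""
    if not segments_cm:
        return False
    return sum(segments_cm) >= 20.0 and max(segments_cm) >= min_longest


single_block = [9.2]  # illustrative values only
print(is_match_2016(single_block))      # True
print(is_match_pre_2016(single_block))  # False
```

A single 9.2 cM block qualifies under the updated rule but not under the earlier one, because the earlier rule also required 20 cM in total.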
Criteria for X-chromosome matches
- 1 cM and 500 SNPs for both males and females; matches must already meet the autosomal DNA matching criteria
AncestryDNA
AncestryDNA introduced a new matching system in November 2014 which was improved and updated in May 2016.
AncestryDNA phases the data before matching, using a proprietary phasing engine known as Underdog. It also uses a proprietary algorithm known as Timber to filter out high-frequency IBD haplotypes which are not indicative of recent common ancestry. The minimum threshold for a match is set at 6 cMs of IBD sharing across the genome.
The following blog posts from AncestryDNA provide further details of the matching process:
- DNA matching just got better. Ancestry blog. 19 November 2014.
- AncestryDNA’s cutting edge science gets even sharper. AncestryDNA help document, 3 May 2016.
- The science behind a more precise DNA matching algorithm by Anna Swayne, Ancestry Tech Roots blog, 3 May 2016. This article includes a chart showing the cM range for the predicted relationships.
- Filtering DNA matches at AncestryDNA with Timber by Anna Swayne. Ancestry blog, 8 June 2015. This article provides an overview of AncestryDNA's Timber filtering algorithms. It also includes a chart showing the size in centiMorgans of the segments that are filtered out by Timber. While Timber removes a large percentage of smaller segments between 5 and 15 cMs, it is notable that some larger segments up to 40 cMs are also removed by the filtering process.
For technical details see the AncestryDNA White Papers.
Detailed FAQs can be viewed by AncestryDNA testees via the help menu.
At the end of 2015 AncestryDNA introduced a new feature called Amount of Shared DNA which allowed customers to see the number of shared DNA segments and the number of shared centiMorgans. The methodology is explained in a blog post by Anna Swayne, "Behind the new AncestryDNA feature: amount of shared DNA", published on 6 January 2016.
AncestryDNA assigns confidence levels depending on the approximate amount of shared centiMorgans. The guidelines in the table below are included in the AncestryDNA Matching White Paper. These guidelines are based on phased haplotypes and will not necessarily apply to matches at 23andMe and Family Tree DNA where haplotypes are not phased prior to performing the matching process.
|Confidence score|Approximate amount of shared centiMorgans|Likelihood of a single recent common ancestor|
|---|---|---|
|Extremely high|More than 60 cMs|Virtually 100%|
|Very high|45-60 cMs|About 99%|
|High|30-45 cMs|About 95%|
|Good|16-30 cMs|About 50%|
The full chart, which includes descriptions of the relationships, can be found in the FAQ “What does the match confidence score mean” in the AncestryDNA help menu.
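The rows shown in the table can be turned into a simple lookup. The sketch below covers only those rows: the table's ranges overlap at 30 and 45 cMs, so assigning each boundary value to the higher band is an assumption, and amounts below 16 cMs return None because the excerpt above gives no guidance for them.

```python
from typing import Optional, Tuple


def confidence_for(shared_cm: float) -> Optional[Tuple[str, str]]:
    """Return (confidence score, likelihood of a single recent common
    ancestor) for the rows quoted in the table, or None below 16 cM."""
    if shared_cm > 60:
        return ("Extremely high", "Virtually 100%")
    if shared_cm >= 45:  # boundary handling is an assumption
        return ("Very high", "About 99%")
    if shared_cm >= 30:
        return ("High", "About 95%")
    if shared_cm >= 16:
        return ("Good", "About 50%")
    return None


print(confidence_for(52))  # ('Very high', 'About 99%')
print(confidence_for(10))  # None
```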
Before the new matching system was introduced in November 2014, AncestryDNA initially set its threshold for matches at 5 megabases. Around January 2014 it changed to using centiMorgans and set the threshold at 5 cM, but the earlier matches were not rerun. The previous thresholds for other relationships at AncestryDNA are given here.
GedMatch
GedMatch is a free third-party tool which allows users to search for and compare matches with people who have tested with different companies. The site accepts raw data from 23andMe, AncestryDNA and Family Tree DNA's Family Finder test. The one-to-many tool returns a list of the user's top 2000 matches, with a default threshold of 7 cMs. The one-to-one comparison tool compares two kits on the autosomes or the X chromosome and allows users to set their own thresholds; the default is 7 cMs and 700 SNPs.
DNA.Land
DNA.Land is a free tool provided by scientists at the New York Genome Center. Users can contribute their autosomal DNA results to research while at the same time benefiting from the free ancestry reports. The Relative Finder feature uses statistical algorithms to detect IBD segments and to divide them into recent and ancient IBD. The methodology is adapted from ERSA (estimation of recent shared ancestry), a technique developed by Huff et al.
References
- Janzen T. DNA Relatives list. Message posted to the ISOGG Project Administrators mailing list (closed group), 6 January 2015.
- Are you categorizing cousins differently for people with Ashkenazi ancestry? 23andMe FAQ, accessed 7 January 2015.
- Bettinger B. Family Tree DNA updates match thresholds. The Genetic Genealogist, 24 May 2016.
- Gleeson M. FTDNA update Family Finder. Gleeson Clan Gathering 2016, 25 May 2016.
- Canada RA. How does the nature of Jewish genealogy make autosomal DNA research more challenging. FTDNA Learning Center, 1 January 2014.
- Huff CD, Witherspoon DJ, Simonson TS, et al. (2011). Maximum-likelihood estimation of recent shared ancestry (ERSA). Genome Research 21(5): 768-774.