Cousin statistics

From ISOGG Wiki

It is helpful to have an understanding of the probabilities of matching cousins at different degrees of relationship when contemplating autosomal DNA testing or when interpreting the results of an existing test. This article reports some relevant cousin statistics.

Company statistics

The following table shows the information provided by the three main testing companies on the probability that two cousins will share enough DNA for the relationship to be detected. AncestryDNA claims to be able to detect more relatives in the third to sixth cousin range, which possibly relates to its use of phasing. Note that each company sets its own autosomal DNA match thresholds, so some more distant matches might not appear in match lists.

Relationship | 23andMe[1] | AncestryDNA[2] | Family Tree DNA Family Finder[3]
First cousins | 100% | 100% | 100%
Second cousins | 100% | 100% | >99%
Third cousins | 89.7% | 98% | >90%
Fourth cousins | 45.9% | 71% | >50%
Fifth cousins | 14.9% | 32% | >10%
Sixth cousins | 4.1% | 11% | Remote (typically less than 2%)[4]
Seventh cousins | 1.1% | 3.2% |
Eighth cousins | 0.24% | 0.91% |
Ninth cousins | 0.06% | |
Tenth cousins | 0.002% | |

Theoretical probabilities

Amy Williams and colleagues at Cornell University have provided two useful charts, based on simulations, which show the likelihood of matching with cousins of different degrees of relationship and the number of segments shared.

Simulations in a 1983 paper by Kevin Donnelly showed the theoretical probability of having no detectable DNA inherited from a specific ancestor and the probability of sharing DNA with cousins of varying degrees of relationship. The content of the following two tables is derived from Table 1 in the paper "The probability that related individuals share some section of genome identical by descent" by Kevin P. Donnelly, Statistical Laboratory, Cambridge University, Cambridge, England (Theoretical Population Biology 1983; 23: 34-63). A copy of the paper is available online.

Cousin relationships.jpg

Ancestor relationships.jpg

SNP testing and relatedness

In research presented in the February 2016 issue of Theoretical Population Biology[5], M. Sun, N.A. Sheehan, et al. conducted extensive simulations based on actual results from modern microarray autosomal DNA tests, taking into account the effect of linkage disequilibrium. The authors examined the probability that any two people could be detected, via SNP testing, to have shared genetic ancestry versus the likelihood that they might be completely unrelated. For specificity, the models were run assuming a single most recent common ancestor (MRCA), not a couple; i.e., that the ancestral lines descended from half-sibling children of the MRCA.

The authors organized the runs using four different densities of SNPs tested, with 500K being the closest value to our common microarray tests. In the table below the probability averages have been converted to percentages. It is important to note that a value of 50% means that, at that level, shared genetic ancestry cannot be distinguished from unrelatedness under the simulation parameters:

No. of SNPs | Siblings | 1st Cousins | 2nd Cousins | 3rd Cousins | 4th Cousins | 5th Cousins
2,200 | 100% | 92.5% | 60.5% | 51.5% | 50% | 50%
22,000 | 100% | 100% | 94.7% | 68.5% | 55% | 54.7%
500K | 100% | 100% | 100% | 87.8% | 61.2% | 55.1%
1 million | 100% | 100% | 100% | 87.2% | 64.7% | 55.7%

One conclusion of the authors was, "as expected, identification of the true relationship from unrelated becomes more difficult as the relationship becomes more distant. We also confirmed...that it is much more challenging to distinguish between the true pedigree and a close alternative structure (rather than just 'unrelated') after about 6 separating meioses [2nd cousins] with 500K SNPs."[5] See also Table 3 in Skare, Sheehan, and Egeland (2009).[6]
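The link between cousin degree and "separating meioses" in the quotation can be sketched in a few lines of Python. This is a minimal illustration, not taken from the paper: the function names are hypothetical, and the expected sharing uses the standard textbook coefficient of relationship (one half per meiosis, summed over common ancestors) rather than anything from the authors' simulations.

```python
def separating_meioses(cousin_degree):
    """Meioses on the path between two nth cousins through one common ancestor."""
    # Each nth cousin is (n + 1) generations below the common ancestor.
    return 2 * (cousin_degree + 1)


def expected_shared_fraction(cousin_degree, common_ancestors=2):
    """Expected fraction of autosomal DNA shared identical by descent.

    common_ancestors=2 models descent from an ancestral couple (full cousins);
    common_ancestors=1 models a single MRCA (half cousins), the assumption
    used in the simulations described above.
    """
    return common_ancestors * 0.5 ** separating_meioses(cousin_degree)


for degree in range(1, 6):
    meioses = separating_meioses(degree)
    full = expected_shared_fraction(degree)
    half = expected_shared_fraction(degree, common_ancestors=1)
    print(f"Cousin degree {degree}: {meioses} meioses, "
          f"full {full:.4%}, single MRCA {half:.4%}")
```

Second cousins come out at 6 separating meioses, matching the bracketed gloss in the quotation. These are averages only; they say nothing about whether a particular pair shares a detectable segment.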

How many cousins do we have?

Although there is only a low chance of sharing enough DNA with a specific distant cousin for the relationship to be detected, we have a large number of distant cousins, so many of them will still appear in our match lists. The following table from the paper "Cryptic distant relatives are common in both isolated and cosmopolitan genetic samples" by Henn et al. (2012) shows the expected number of cousins at different degrees of relationship, the expected number of detectable cousins, and the expected amount of identical by descent (IBD) sharing if the relationship is detected.

How many cousins.jpg

Mathematician and genetic genealogist Paul Rakow has run his own computer simulations on family sizes and has published the results in an essay, "Counting cousins" (published online 31 March 2016).

A study by AncestryDNA, based on British birth rates, census data, parliamentary research briefings[7] and other sources for the last 200 years, produced the following statistics on the number of cousins that the average British person would be expected to have.[8][9]

Relationship | Number of cousins
First cousins | 5
Second cousins | 28
Third cousins | 175
Fourth cousins | 1,570
Fifth cousins | 17,300
Sixth cousins | 174,000

It is not clear if these statistics relate to the whole of the United Kingdom or just England and Wales.
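As a rough illustration of why many distant cousins still appear in match lists, the cousin counts above can be multiplied by a detection probability for each degree. The sketch below pairs the AncestryDNA cousin counts with the 23andMe detection probabilities from the company statistics table. That pairing is an assumption made purely for illustration, as is the premise that every cousin has tested with the same company; the function name is hypothetical.

```python
# Cousin counts from the AncestryDNA study of British family sizes.
COUSIN_COUNTS = {1: 5, 2: 28, 3: 175, 4: 1_570, 5: 17_300, 6: 174_000}

# Detection probabilities from the 23andMe column of the company statistics table.
# Assumption: combining these two sources is for illustration only.
DETECTION_PROBABILITY = {1: 1.00, 2: 1.00, 3: 0.897, 4: 0.459, 5: 0.149, 6: 0.041}


def expected_detectable(counts, probabilities):
    """Expected number of detectable cousins per degree (count x probability)."""
    return {degree: counts[degree] * probabilities[degree] for degree in counts}


total_cousins = sum(COUSIN_COUNTS.values())
detectable = expected_detectable(COUSIN_COUNTS, DETECTION_PROBABILITY)

print(f"Total cousins, first to sixth: {total_cousins:,}")
for degree, expected in detectable.items():
    print(f"Cousin degree {degree}: about {expected:,.0f} detectable if all had tested")
```

The total of 193,078 is consistent with the "193,000 living cousins" headline cited in the footnotes. Under these assumptions, sixth cousins contribute the largest expected number of detectable matches even though each one has only a 4.1% chance of being detected.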

Acknowledgements

Thanks to Jon Hamm for compiling the spreadsheets based on data from Donnelly (1983) that have been used in the section on theoretical probabilities.

Footnotes

  1. The 23andMe data were extracted from the paper by Henn et al. "Cryptic distant relatives are common in both isolated and cosmopolitan genetic samples" published online in PLOS One on 3 April 2012. 23andMe has a simplified version of these data in a customer FAQ "The probability of detecting different types of cousins".
  2. The AncestryDNA data were extracted from information provided in the first "AncestryDNA Matching White Paper," 2014, in section 5.3 "Relationship Estimation Evaluation." These calculations used small segment sizes but as of 31 August 2020 AncestryDNA has stopped reporting on matches sharing less than 8 cM total.
  3. The Family Finder percentages are from an article in the Family Tree DNA Learning Center "What is the probability that my relative and I share enough DNA to be detected by Family Finder?"
  4. The figure quoted by Family Tree DNA refers to "6th cousins and more distant".
  5. M. Sun, N.A. Sheehan, et al., "On the Use of Dense SNP Marker Data for the Identification of Distant Relative Pairs." Theoretical Population Biology 107 (February 2016): 14-25. DOI: https://doi.org/10.1016/j.tpb.2015.10.002
  6. Øivind Skare, Nuala Sheehan, Thore Egeland, "Identification of Distant Family Relationships." Bioinformatics 25:18 (September 2009) 2376–2382. DOI: https://doi.org/10.1093/bioinformatics/btp418
  7. "A Century of Change: Trends in UK statistics since 1900." House of Commons Research Paper 99/111. Published 21 December 1999.
  8. "Average British person has 193,000 living cousins says new research." Daily Mirror, 17 June 2015.
  9. Ancestry. "There's a one in 300 chance that a complete stranger is your cousin." Ancestry blog, 26 June 2015. Also available as a press release on the Ancestry Corporate website: "One in 300 chance that a complete stranger is your cousin."