Y-DNA tools
From ISOGG Wiki
There are a number of Y-DNA tools that can be used to analyse Y-chromosome DNA results, including prediction of haplogroups.
Note that this list is provided for information only. Inclusion on this list does not imply recommendation or endorsement by ISOGG.
- 1 Y-STR haplotype tools (modals, genetic distance, TMRCA)
- 2 Y-STR haplogroup prediction tools
- 3 Y-SNP haplogroup prediction tools
- 4 YBrowse - The ISOGG human Y-chromosome browser
- 5 Y-STR and Y-SNP databases
- 6 General Y-DNA tools
- 7 Next generation sequencing tools
- 8 Y-Sequence Trees (TMRCA, ancient Y)
- 9 Further reading
- 10 Scientific papers
- 11 See also
- 12 References
Y-STR haplotype tools (modals, genetic distance, TMRCA)
The following is a selection of tools that can be used for a set of Y-STR values (DYS etc.) including calculating the time to the most recent common ancestor.
- mitoYDNA.org A crowdsourced Y-DNA and mtDNA database, with tools and analysis
- MODIFIED Y-Utility by Colin Ferguson, based on Dean McGee's Y-DNA Utility. New features: TMRCA 5-95%, inverse mutation rate weighting, Marko Heinila's 2012 mutation rates, modified genetic distance color codes, changed hybrid GD calculations, choice of the maximum-likelihood estimate, choice of Iain McDonald's mutation rates, and an implementation of Nordtvedt's variance method [Updated: 2011-2018]
- STR Match Finder by PhyloGeographer.com (Hunter Provyn): finds STR matches within public results from various sources (FTDNA, YSEQ, YFull).
- Dean McGee's Y-DNA Utility Ysearch/FTDNA Mode, also for up to Y-DNA111 markers for Genetic Distance and TMRCA tables. It can output to PHYLIP and Fluxus for building phylogenetic trees, etc. [Last Modified: 2008]
- TiP - the Family Tree DNA Time Predictor tool (only for FTDNA Y-DNA12+ customers)
- The Clan McDonald TMRCA Probability Calculator Graph or List
- Moses Walker's Most Recent Common Ancestor Probability Calculator Probability of Relationship %, Generations to MRCA
- Phylofriend Open source program that calculates genetic distances from Family Tree DNA projects to create phylogenetic trees with PHYLIP.
- Y-DNA Family Grouping App A free app created by Chase Ashley that allows people to quickly and easily (1) analyze surname project STR results to determine appropriate family groupings and (2) compare a particular project kit against all other kits in the project
- SAPP (Still Another Phylogeny Program). Web-based interface. Takes a plain text data file (*.txt) as input, which contains the STR results, SNPs, and genealogies for SAPP to use in building the phylogenetic tree. See the blog post from Maurice Gleeson, L2 MHT - "difficult to place" people, for a description of its use.
- Tim Janzen developed a program in Excel in 2008 that does intraclade TMRCA estimates for large numbers of Y-STR haplotypes. It may be downloaded from http://www.timjanzen.com/dna.html.
- Ken Nordtvedt has a program in Excel that does intraclade and interclade TMRCA estimates for Y STR haplotypes (Generations111T.xlsx only available from web.archive)
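As a rough illustration of two quantities these tools report, the following sketch computes a modal haplotype and a simple stepwise genetic distance. It is a minimal example under stated assumptions, not the method of any tool listed above: the kit names and repeat values are made up, distance is taken as the sum of absolute repeat differences over shared markers, and real utilities may treat multi-copy markers and other special cases differently.

```python
from collections import Counter


def genetic_distance(a, b):
    """Stepwise distance: sum of absolute repeat differences over shared markers."""
    shared = a.keys() & b.keys()
    return sum(abs(a[m] - b[m]) for m in shared)


def modal_haplotype(haplotypes):
    """Most common repeat value per marker (ties go to the smallest value)."""
    markers = set().union(*(h.keys() for h in haplotypes))
    modal = {}
    for marker in sorted(markers):
        counts = Counter(h[marker] for h in haplotypes if marker in h)
        best = max(counts.values())
        modal[marker] = min(v for v, c in counts.items() if c == best)
    return modal


# Hypothetical kits with made-up repeat values, for illustration only.
kits = {
    "kit_A": {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11},
    "kit_B": {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10},
    "kit_C": {"DYS393": 13, "DYS390": 24, "DYS19": 15, "DYS391": 11},
}

modal = modal_haplotype(list(kits.values()))
for name, haplotype in kits.items():
    print(name, genetic_distance(haplotype, modal))
```

With these sample values the modal haplotype equals kit_A, so kit_A is at distance 0 from the modal, kit_B at 2 and kit_C at 1.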
Y-STR haplogroup prediction tools
The accuracy of haplogroup prediction based on Y-STR haplotypes depends mainly on the number of STR values. Without SNP confirmation, prediction for low-resolution haplotypes (Y-STR12 to Y-STR25) has low confidence, and convergence can be a problem. For many haplogroups Y-STR25 to Y-STR37 gives an acceptable confidence level, while for some young haplogroups with rapid diversification and expansion, such as those within R1b, even Y-STR111 is not enough to discriminate the correct sub-lineage with confidence. A few scientific studies are available.[1] [2] [3] However, with the growth of next generation sequencing tests and whole genome sequencing, which report on both STRs and SNPs in a single test, the use of STR-based tests and the need for haplogroup prediction tools are in decline.
- mitoYDNA.org A crowdsourced Y-DNA and mtDNA database, with tools and analysis
- Nevgen Y-DNA haplogroup predictor. Milos Cetkovic Gentula and Aco Nevski, 2015
- Whit Athey's haplogroup predictor (2004-2012), used by scientific papers
- Jim Cullen's haplogroup predictor, v1.2 2008, used by scientific papers
- Arizona University's Haplogroup classifier, 2008, used by scientific papers
- Stephen P. Morse's Predicting haplogroups in one-step, which can fetch STR values from Ysearch or FTDNA (kit login), 2008
- Y Haplogroup Predictor (YHP) The software is described in a preprint by Mengyuan Song, Feng Song, Chenxi Zhao, and Yiping Hou YHP: Y-chromosome Haplogroup Predictor for predicting male lineages based on Y-STRs, published online 12 January 2021
- Felix Immanuel's Y-Haplogroup Predictor, and Y-HaploGroup Population Browser, 2013
- Haplogroup specific:
- I: Terry Robb's haplogroup I1 and I2 subclade predictor (for 67 marker results only)
- R1b: Robert Casey's L21 Y67 subclade predictor, R-L21 Files Alex Williamson & Mike Walsh
Y-SNP haplogroup prediction tools
- YSEQ Clade finder A tool launched in 2020 developed by Hunter Provyn with input and support from Thomas Krahn. Provides a haplogroup assignment based on the YFull tree. Can accept manual entry of a list of SNPs, raw data uploads from 23andMe, AncestryDNA and MyHeritage, and can also use VCF (Variant Call File) data from NGS tests.
- Chris Morley's Y-SNP subclade predictor, optimized for Geno 2.0 data transferred to FTDNA, and also the Full Genomes demo data. See also ytree.morleydna.com.
- Felix Immanuel's 23andMe to Y-SNP Converter, 2014
- Steven Frank's AncestryDNA Reformat VBS script, 2016
- Y-Haplo A free tool provided by 23andMe for non-commercial use. The tool is designed for use with large datasets. The methodology is explained in a preprint by David Poznik, "Identifying Y-chromosome haplogroups in arbitrarily large samples of sequenced or genotyped men." BioRxiv, published online on 19 November 2016.
- Y-leaf - a tool for performing haplogroup predictions from next generation sequencing data (BAM or FASTQ files). For details see the paper by Ralf et al (2018) Yleaf: Software for human Y-chromosomal haplogroup inference from next-generation sequencing data.
- Y-LineageTracker. A free tool for analysing Y-chromosome data generated from next generation sequencing data. The tool is described in the 2021 paper by Chen, Lu, Lu and Xu YLineageTracker: a high-throughput analysis framework for Y-chromosomal next-generation sequencing data. BMC Bioinformatics volume 22, Article number: 114 (2021).
- Y-SNP Haplogroup Hierarchy Finder A tool for predicting haplogroups from Y-chromosome sequencing data based on the ISOGG Y-SNP tree. The tool is described in the paper by Tseng et al (2022) Y-SNP Haplogroup Hierarchy Finder: a web tool for Y-SNP haplogroup assignment.
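The general idea behind SNP-based haplogroup assignment can be sketched as a walk over a tree of clades, each defined by one or more SNPs. The example below is a hedged illustration only and is not the algorithm of any tool listed above: the tree is a heavily simplified fragment of haplogroup R with intermediate branches omitted, and the function names and the rule used (pick the deepest clade with a positive defining SNP and no negative result on its path to the root) are assumptions made for this sketch.

```python
# Simplified toy tree (intermediate branches omitted); each clade lists its
# parent and one defining SNP.
TOY_TREE = {
    "R": {"parent": None, "snps": {"M207"}},
    "R1": {"parent": "R", "snps": {"M173"}},
    "R1a": {"parent": "R1", "snps": {"M420"}},
    "R1b": {"parent": "R1", "snps": {"M343"}},
    "R-M269": {"parent": "R1b", "snps": {"M269"}},
}


def path_to_root(tree, clade):
    """Return the clade followed by all of its ancestors up to the root."""
    path = []
    while clade is not None:
        path.append(clade)
        clade = tree[clade]["parent"]
    return path


def assign_clade(tree, positive, negative=frozenset()):
    """Deepest clade with a positive defining SNP and no negative SNP on its path."""
    best, best_depth = None, 0
    for clade, node in tree.items():
        if not (node["snps"] & positive):
            continue  # no derived call for this clade's defining SNP
        path = path_to_root(tree, clade)
        if any(tree[c]["snps"] & negative for c in path):
            continue  # contradicted by an ancestral call on the path
        if len(path) > best_depth:
            best, best_depth = clade, len(path)
    return best


print(assign_clade(TOY_TREE, {"M207", "M173", "M343"}, {"M269"}))
print(assign_clade(TOY_TREE, {"M269"}))
```

In this sketch a sample positive for M343 but negative for M269 is placed at R1b, while a sample with only M269 reported positive is still placed at R-M269, because untested upstream SNPs are not treated as contradictions.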
YBrowse - The ISOGG human Y-chromosome browser
Maintained by Thomas Krahn of YSEQ, ISOGG YBrowse is a free resource for detailed data about Y chromosome STRs and SNPs. It currently uses reference assembly hg38. Users can enter the name of an individual SNP or STR, or a range of loci, and the interactive display will show elements such as a graphic of the relative position on the chromosome, the actual locus number(s), the range covered by an STR (and SNPs encompassed, if any), synonymous SNP names, and the SNP ancestral allele. Additionally, YBrowse makes available for download the datasets it is currently using. These are updated frequently and include files in FASTA, GFF (Generic Feature Format), VCF (Variant Call Format), and CSV versions.
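A downloaded GFF file can be queried offline in much the same way as the interactive range search. The sketch below assumes the standard nine tab-separated GFF columns (seqid, source, type, start, end, score, strand, phase, attributes). The sample records, feature names, coordinates and attribute keys are invented for illustration and are not taken from the actual YBrowse files, whose attribute layout may differ.

```python
# Made-up GFF records for illustration only (not real YBrowse data).
SAMPLE_GFF = (
    "##gff-version 3\n"
    "chrY\texample\tsnp\t1000\t1000\t.\t+\t.\tName=EXAMPLE_SNP1;allele_anc=C;allele_der=T\n"
    "chrY\texample\tSTR\t2000\t2050\t.\t+\t.\tName=EXAMPLE_STR1\n"
    "chrY\texample\tsnp\t2020\t2020\t.\t+\t.\tName=EXAMPLE_SNP2;allele_anc=G;allele_der=A\n"
)


def parse_gff(text):
    """Parse GFF text into a list of feature dictionaries."""
    features = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # skip malformed records
        seqid, _source, ftype, start, end, _score, _strand, _phase, attrs = cols
        attributes = dict(
            pair.split("=", 1) for pair in attrs.split(";") if "=" in pair
        )
        features.append(
            {
                "seqid": seqid,
                "type": ftype,
                "start": int(start),
                "end": int(end),
                "attributes": attributes,
            }
        )
    return features


def overlapping(features, start, end):
    """Features whose span overlaps the closed interval [start, end]."""
    return [f for f in features if f["start"] <= end and f["end"] >= start]


features = parse_gff(SAMPLE_GFF)
for feature in overlapping(features, 2000, 2050):
    print(feature["type"], feature["attributes"]["Name"])
```

Querying the range 2000 to 2050 returns the example STR together with the SNP that falls inside it, which mirrors the "SNPs encompassed" view described above.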
Y-STR and Y-SNP databases
- mitoYDNA.org A crowdsourced Y-DNA and mtDNA database, with tools and analysis
- YHRD.org, World Y-database with over 160,000 haplotypes, 1015 populations, 33 metapopulations, Sascha Willuweit & Lutz Roewer, 2000-2016
- Yfiler haplotype database
General Y-DNA tools
- FamilyTreeDNA Discover A free website from FamilyTreeDNA where you can learn more about your haplogroup.
- Ancestral DNA Marker Pedigree Display A tool from Brad Larkin which allows the user to display the SNP progression for a single ancestral line
- Genetic Homeland DNA Marker Index A free resource from Brad Larkin which provides an index of markers and their RSID positions on all 23 chromosomes.
- Rob Spencer's SNP Tracker Provides a SNP progression list, timelines and maps based on FTDNA BigY data
- FT2DNA Dave Hamm's utility to convert Family Tree DNA's Y-chromosome repeat data format into raw code (i.e.: ATGC) for analysis by other utilities such as PHYLIP.
- FT2PHY Dave Hamm's tool for the conversion of FTDNA repeat data format files into ATGC format files for use with PHYLIP compatible packages, such as DNAML. This program uses input that is compatible with Dean McGee's Y-DNA Utility, but is limited to 37 markers.
- Ann Turner's Mutation Calculator A tool to calculate Y DNA mutations.
- YGED Roger Arrick's program simplifies the creation of GEDCOM files for the Y-DNA line
- Y Heatmap A Y-chromosome mapping tool developed by Hunter Provyn in cooperation with YSEQ and YFull
- Time to most-recent common ancestor (TMRCA) estimates for targeted pairs A tool provided by Iain McDonald which will perform comparisons based on Y-chromosome sequencing tests and 111 Y-STR tests.
Next generation sequencing tools
- YFull A raw data (BAM) interpretation and analysis service (online) for full Y-chromosome data
- Full Genomes Corporation Also offers a BAM file analysis service (Big Y etc.)
- clarifY DNA A Big Y analysis service providing personalised reports and diagrams showing placement on the phylogenetic tree. See the review by Debbie Kennett clarifY DNA - a new Y-SNP analysis service
- Brad Larkin's NextGenMatch tool. The tool is available on Brad's Genetic Homeland website. The tool is free to use but registration is required. For details see Brad's post on the GenealogyDNA mailing list
- Felix Immanuel's Big Y BAM analysis tool
- Integrative Genomics Viewer (IGV) A tool provided by the Broad Institute which allows the user to analyse BAM files; registration with an email address is required
- Galaxy Open source, web-based platform for data-intensive NGS research. It includes SAMtools, BamTools, Picard, VCF Manipulation, etc. A user account is suggested
- SNPdata A free tool for comparing build 37/19 and build 38 Y-chromosome data. It contains a search for dual SNP positions
- WGS Extract A locally-installed desktop combined utility that allows extraction of information from common 30X coverage whole genome sequencing results. Three options are available for Y-DNA processing: 1) Generate a Y- and Mitochondrial-only BAM (for use at YFull); 2) Generate a Y-only BAM (for use at Y-DNA Warehouse or other sites that only need a Y BAM); 3) Generate an annotated Y VCF file for use at YFull in lieu of a BAM, or with Clade Finder or similar. It can be used on BAM/CRAM files from common testing providers like Dante Labs, Nebula Genomics, YSEQ, Full Genomes Corporation, and others.
- BAMsAway A Chrome extension for analysis of BigY BAM files. If you enter the Y-DNA position, the extension will show you the actual reads at that position.
Y-Sequence Trees (TMRCA, ancient Y)
- YFull YTree
- FamilyTreeDNA Discover: Y-DNA Time Tree, Classic Haplotree View
- TheYtree
- YDNA-Warehouse
Further reading
- Big Y-700 White Paper, by Caleb Davis, Michael Sager, Göran Runfeldt, Elliott Greenspan, Arjan Bormans, Bennett Greenspan, and Connie Bormans (2019)
- Testing and analysing Big Y - a primer, by Dennis Wright
Scientific papers
- McDonald, Iain. "Improved Models of Coalescence Ages of Y-DNA Haplogroups." Genes 12, no. 6 (June 2021): 862. https://doi.org/10.3390/genes12060862.
- Petrejcíková E, Carnogurská J, Hronská D et al (2014). Y-SNP analysis versus Y-haplogroup predictor in the Slovak population. Anthropol Anz 2014;71(3):275-85.
- Schlecht J, Kaplan ME, Barnard K, Karafet T, Hammer MF, Merchant NC (2008). Machine-learning approaches for classifying haplogroup from Y chromosome STR data. PLOS Computational Biology. Published online 13 June 2008.
See also
- Convergence
- Phylogeny programs
- Generation length
- Most recent common ancestor
- Mutation rates
- SNP testing
- Y-DNA SNP testing chart
- Y-DNA STR testing chart
- Autosomal DNA tools
- Raw DNA data tools
- mtDNA tools
- DNA databases
References
- [1] Wang et al 2013 (preprint): Convergence of Y chromosome STR haplotypes from different SNP haplogroups compromises accuracy of haplogroup prediction, http://arxiv.org/abs/1310.5413
- [2] Larmuseau et al 2014: Recent Radiation within Y-chromosomal Haplogroup R-M269 Resulted in High Y-STR Haplotype Resemblance, http://dx.doi.org/10.1111/ahg.12050
- [3] Woźniak et al 2006: Correlations between haplogroup membership and Y-STR haplotype as a potential measure of quality control in forensic examinations [in Polish], http://www.ncbi.nlm.nih.gov/pubmed/17131759