Y-DNA next generation sequencing
From ISOGG Wiki
A Y-sequence is a substantial part of the Y-chromosome DNA. According to the hg19 reference, the Y chromosome is 59,373,566 bases long. See the Y-DNA SNP testing chart for a technical comparison of Y-sequence testing and other forms of Y-SNP testing.
Comparison table (outdated!)
An updated comparison can be found at YDNA-Warehouse Testing Benchmarks.
This chart has been compiled as a subset of ChrisR/current_NextGenSeq_testing#NGS_comparison_table. Abbreviations: FGC = Full Genomes Corporation, FTDNA = Family Tree DNA. See also the Y-DNA SNP testing chart and the Autosomal DNA testing comparison chart.
| | FGC WGS 30× | FGC WGS GenomeGuide | FGC WGS 4× | FGC WGS 2× | FGC Y-Elite 2.1 | FTDNA BigY |
|---|---|---|---|---|---|---|
| Introduced | Summer 2014 | Early 2016 | July 2015 | July 2015 | May 2015 | November 2013 |
| Price | $1250 ($42/×) | $700 ($47/×) | $395 ($99/×) | $280 ($140/×) | $645 | $575 |
| Sequenced DNA focus | whole genome | whole genome | whole genome | whole genome | Y-DNA, mtDNA | Y-DNA [1] |
| Read depth, read length, method | 30× 150 bp (or 10 mb)[2] | ~15× 150 bp | 4× 150 bp | 2× 150 bp | 80× 150 bp | 60× 100 bp |
| Upgrade options | $55 per 1× + $100 data fee[3] | price difference + $100 data fee | price difference + $100 data fee[4] | price difference + $100 data fee[4] | 2nd order for 60× [5] | - |
| Y ≥1× coverage (FGC) | ~22.9 Mbp [6], 92% hg19 | ~22.3 Mbp? | ~17.7 Mbp [6], 72% hg19 | ? Mbp | ~22.0 Mbp [6], 89% hg19 | ~16 Mbp, 65% hg19 (14-23 Mbp)[7] |
| Y callable loci (GATK) (FGC qual.-read-length) | ~14.9 Mbp [8] | ~13.2 Mbp hg19[9] | ~1.1 Mbp hg19 | ~0.4 Mbp hg19 | ~14.8 Mbp hg19?[10]; ~13.5 Mbp hg38[11] | ~8.8 Mbp hg19[12]; ~8.6 Mbp hg38[11] |
| Y Method Analysis (YFull) | Mean/Av. 21×; Median 12×; ~22.8 Mbp; ~0.3 Gb BAM; ~2900? SNPs; ?/111 STRs | ? | [13] Mean/Av. 9×; Median 4×; ~87% Y-cov-hg19; ~0.1 Gb BAM; 2,764 known + 243 novel SNPs; ~81/111 STRs | ? | Mean/Av. ~47-72×; Median ~31-47×; 22 Mbp; ~1.2 Gb BAM [6]; ~2750 SNPs[14]; ~107/111 STRs | Mean/Av. -91×; Median 47-60×; ~13.9 Mbp; ~0.8 Gb BAM; ~2050 SNPs; ~96/111 STRs |
| mt Method Analysis | ~100% FMS; Mean/Av. >1000X [9] | 92-100% FMS [15] | ~69% FMS [1]; Mean/Av. ~13-41X [16] (0-100%)[7] | | | |
| at/X Method (~3.60 mill. SNPs expected) | ~3.60 mill. SNPs (~100%)[17]; ca. 22.5x? Coverage ca. 95%. | ~3.52 mill. SNPs (~98%)[9] | ~1.75 mill. SNPs (~49%)[17] | not included | not included | |
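The per-× prices in the Price row can be reproduced by dividing each price by the nominal read depth. The following is a minimal sketch of that arithmetic, using only the figures from the table above. The GenomeGuide depth is listed as approximately 15×, so its per-× figure is an approximation, and the function name `price_per_x` is purely illustrative.

```python
# Prices (USD) and nominal read depths as listed in the comparison table.
tests = {
    "FGC WGS 30x": (1250, 30),
    "FGC WGS GenomeGuide": (700, 15),  # the table gives the depth as ~15x
    "FGC WGS 4x": (395, 4),
    "FGC WGS 2x": (280, 2),
}


def price_per_x(price_usd, depth_x):
    """Return the price per 1x of nominal read depth, rounded to whole dollars."""
    return round(price_usd / depth_x)


per_x = {name: price_per_x(price, depth) for name, (price, depth) in tests.items()}

for name, value in per_x.items():
    print(f"{name}: ${value}/x")
```

The lower-depth tests are cheaper in absolute terms but cost considerably more per unit of read depth.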
FTDNA Big Y-700
Since 2013 Family Tree DNA has provided VCF (and BED) files for download at Big Y results page > Download Raw Data. BAM files for Big Y deliver ca. 13.9 Mbp in approx. 1 GB. The BAM file needs to be requested by filling out a Contact Us Form request and choosing the Big Y BAM Request category. After some business days (in May 2016 it took ca. 2-3 weeks) the link should be available at Big Y results page > Download Raw Data > Download BAM. Additionally, a "Share BAM" function is available, which creates a temporary download link for the BAM file (it expires after 1 hour). Uploading to YFull with the "Link to a sharing file" option normally works.
YSEQ WGS/NGS 15x, 30x, 50x
See NGS Tests pages.
FGC Y-Elite (not in EU)
Since 2013 Full Genomes Corp. provides the raw data in BAM file format along with interpretation results. Download is possible after Account login or with Account Sharing from Amazon Simple Storage Service (S3).
- Y-Elite 1.0: BAM files contain ca. 22.7 Mbp in approx. 9 GB.
FGC WGS 30x (not in EU)
Since 2014 Full Genomes Corp. has provided the raw data in BAM file format along with interpretation results. Depending on the read depth (data amount), BAM files are delivered online or by hard drive / USB stick.
Other WGS 30x (Dante Labs, Nebula)
These providers usually do not extract the Y-chromosome BAM file and do not perform advanced Y-DNA SNP/haplogroup or Y-STR analysis. As a result, these services are usually used for Y-DNA genetic genealogy only by experts who can work with tools like WGS Extract and upload the results to YFull or similar services.
Historical Mentions
From 2007 to 2013, FTDNA's Walk Through the Y delivered 300,000 to 600,000 base pairs. The technology used was not next generation sequencing, but the details are mentioned here.
Y-Chr Sequencing/Mapping Insights
- Currently, no company offers true T2T (telomere-to-telomere) sequencing, which requires extensive nanopore and short-read sequencing at high coverage (166x for hs1).
- Some service providers are trying to make long-read nanopore sequencing available, but true T2T is still out of reach for most.
- Mapping normal WGS reads (FASTQ) to the hs1 reference is NOT T2T. This is just normal BWA mapping.
- Filtering for the Y yields a CP086569.2 BAM file, which can be analyzed.
- This may deliver novel SNPs in unstable regions, but has limitations:
  - hs1 is from haplogroup J1, so it is a closer reference for J1 testers than for R1b testers.
  - Fewer R1b reads may map, giving worse results than with hg38.
  - There is no official hs1 SNP list yet to identify known and novel SNPs.
- If all DTC service providers agreed to retire hg38 and switch to hs1, confusion might be avoided.
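The mapping and filtering steps described above can be sketched as a short command builder. This is a minimal illustration and not any provider's actual pipeline. It assumes that `bwa` and `samtools` are installed and that the hs1 FASTA has already been indexed with `bwa index`. All file names, the thread count and the helper names `mapping_commands` and `as_shell` are hypothetical. The script only assembles the command lines and does not run them.

```python
import shlex

# Name of the Y-chromosome contig in hs1, as given in the text above.
HS1_CHRY = "CP086569.2"


def mapping_commands(reference_fasta, fastq_r1, fastq_r2, sorted_bam, y_bam, threads=4):
    """Build (but do not run) the commands for ordinary BWA mapping to hs1."""
    # Align the paired-end reads against the (already bwa-indexed) reference.
    align = ["bwa", "mem", "-t", str(threads), reference_fasta, fastq_r1, fastq_r2]
    # Coordinate-sort the alignments read from standard input.
    sort = ["samtools", "sort", "-@", str(threads), "-o", sorted_bam, "-"]
    # Index the sorted BAM so that a region can be extracted.
    index = ["samtools", "index", sorted_bam]
    # Keep only the reads mapped to the hs1 Y contig.
    extract_y = ["samtools", "view", "-b", "-o", y_bam, sorted_bam, HS1_CHRY]
    return align, sort, index, extract_y


def as_shell(align, sort, index, extract_y):
    """Render the commands as shell lines; alignment is piped into the sort."""
    pipeline = " | ".join(shlex.join(cmd) for cmd in (align, sort))
    return [pipeline, shlex.join(index), shlex.join(extract_y)]


lines = as_shell(*mapping_commands(
    "hs1.fa", "sample_R1.fastq.gz", "sample_R2.fastq.gz",
    "sample.hs1.bam", "sample.hs1.Y.bam",
))
for line in lines:
    print(line)
```

The resulting Y-only BAM is an ordinary short-read alignment against a T2T reference. It is not a T2T assembly of the tested sample.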
Regarding the 40+ T2T/HPRC models available since a study published in summer 2023:
- Hundreds of liftover chains would be needed between them.
- Direct conversion loses the regions that differ between haplogroups.
- Chaining multiple conversions is better, but still loses information.
- No liftover chains have been created except to hs1.
- Ultimately, de novo Y-chromosome assemblies (for example with Nanopore) are needed for the best analysis.
- Some intermediary solutions may work: choosing a master sequence (like HG002 v2.7), adding alternate contigs for other haplogroup branches, aligning and calling variants as normal, and then mapping the alt contigs to branches in a pangenome graph.
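The "hundreds of liftover chains" point follows from simple counting. The sketch below assumes exactly 40 assemblies as a round figure for "40+", and the function name `chain_counts` is illustrative. It compares the number of chains needed for every pair of assemblies with the number needed when every assembly is only lifted over to a single hub reference such as hs1.

```python
from math import comb


def chain_counts(n_assemblies):
    """Count liftover chains for all-pairs conversion versus a single hub."""
    # One chain per unordered pair of assemblies.
    undirected = comb(n_assemblies, 2)
    # One chain per direction (A to B and B to A are separate files).
    directed = n_assemblies * (n_assemblies - 1)
    # One chain from each other assembly to the hub reference.
    hub = n_assemblies - 1
    return undirected, directed, hub


undirected, directed, hub = chain_counts(40)
print(f"all pairs: {undirected} (or {directed} counting both directions), hub only: {hub}")
```

The hub approach keeps the number of chains small, but every conversion between two non-hub assemblies then passes through hs1, which is where the information loss mentioned above occurs.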
This is a summary of posts by Thomas Krahn and Randy Harr in the YSEQ group on Facebook.
See also
- Next generation sequencing
- Full Genomes Corporation
- Family Tree DNA
- Y-DNA tools
- YFull
- FGC Full Y Chromosome Sequencing at anthrogenica.com
- Next Generation Sequencing Statistics GRCh38 by James Kane (haplogroup-r.org)
References
- [1] Until April 2015 the BAM file included mtDNA data; after that it is omitted.
- [2] $2750 pilot project, long read whole genome, Chromium technology. Justin Loe, FGC, 2016-09-02, Forum http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=184229&viewfull=1#post184229
- [3] Justin Loe, FGC, 2016-09-20, Forum http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=187847&viewfull=1#post187847
- [4] Justin Loe, FGC, 2015-11-29, Forum http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=123473&viewfull=1#post123473
- [5] Justin Loe, FGC, 2015-12-15, http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=126856#post126856
- [6] Justin Loe, email message, 28 Dec 2015 and AG Forum 2016-01 http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=131470&viewfull=1#post131470
- [7] Jim Kane, Which Y-DNA NGS test to take? November 12, 2015, http://www.it2kane.org/2015/11/which-ngs-test-to-take/
- [8] Justin Loe, email message, 2016-06-10
- [9] Justin Loe, email message, 2016-02
- [10] Justin Loe, AG Forum 2016-03 http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=146390&viewfull=1#post146390
- [11] JamesKane, AG Forum 2016-03 http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=146667&viewfull=1#post146667
- [12] Vince Tilroe, analysis of FGC raw data from Greg Magoon, shared by Iain McDonald, 2 Dec 2015
- [13] Based on a single sample with initial QC problems, YF05650
- [14] Justin Loe, AG Forum 2016-01 http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=132860&viewfull=1#post132860
- [15] Justin Loe, Batch 9006, email message, 28 Dec 2015
- [16] Petr, Forum post: Full Y Chromosome Sequencing: Phase III Pilot, 2015-12-25, http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=128756&viewfull=1#post128756
- [17] Justin Loe, AG Forum 2016-01 http://www.anthrogenica.com/showthread.php?742-Full-Y-Chromosome-Sequencing-Phase-III-Pilot&p=134491&viewfull=1#post134491