Raw DNA data tools
From ISOGG Wiki
Most of the major consumer genetic testing companies allow customers to download their raw data files. These files contain the “letters” (nucleotides A, C, G, T) that comprise DNA. The raw data can be uploaded to a variety of different services for free and/or paid analyses. The list below provides information on services which are of particular interest for genetic genealogy or which have been used by genetic genealogists. It is not intended to be a comprehensive list and is provided for information only. Inclusion on this list does not imply recommendation or endorsement by ISOGG.
Web tools (not desktop tools): websites or pages that need no desktop software installation and can analyse a person's DNA against DNA from other relatives or ancestral populations.
Genealogical web tools
- Borland Genetics Web-Tools Kevin Borland's database focused on DNA of deceased individuals, linked to a set of DNA reconstruction tools for creating kits representing deceased ancestors
- David Pike's Tools Utilities for analysing raw DNA data.
- DNAGedcom A website which provides a range of online and offline tools for analysing DNA data. The basic service is free. There is a monthly subscription fee for the premium service.
- DNA.Land A not-for-profit community website run by academics affiliated with Columbia University and the New York Genome Center. The site offers a biogeographical analysis, imputation and a relative-matching feature.
- GEDmatch A free utility to compare autosomal DNA data files from different testing companies and to compare GEDCOM files. A number of other very useful tools are also provided, some for a fee.
- Gene Heritage A service using raw DNA data to report on genes, traits, and ancient origins. Traits covered include eye color, lactose intolerance, alcohol flush reaction, and taste and smell sensitivity. Generates inheritance trees to show which genes were passed down from grandparents and parents to a child; identifies whether these genes originated in Europe, Asia, Eurasia, or Africa. Its Grandchild Report calculates what percentage of DNA a child inherited from each grandparent.
- GENOtation A set of online tools from Stanford University for analysing your personal genomic data. For further explanation see this blog post by Daniel MacArthur.
- lineage A tool provided by Andrew Riha for analysing raw data files. There are options to merge raw data files from different DNA testing companies, compute centiMorgans of shared DNA between individuals using HapMap tables, plot shared DNA between individuals, determine genes shared between individuals, find discordant SNPs between child and parent(s), and remap SNPs between assemblies / builds (e.g., convert SNPs from build 36 to build 37, etc.).
- Wegene.com A Chinese company which provides an analysis of 23andMe results.
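As an illustration of one check mentioned in the lineage entry above (finding discordant SNPs between child and parent), the following is a minimal sketch and not lineage's actual implementation. It assumes each person's data has already been loaded into a dictionary mapping an rsID to a two-letter genotype, and that only autosomal SNPs are compared. The function name and the sample rsIDs are hypothetical.

```python
VALID_ALLELES = set("ACGT")


def is_called(genotype):
    """True if the genotype is two letters drawn from A, C, G, T."""
    return len(genotype) == 2 and set(genotype) <= VALID_ALLELES


def find_discordant_snps(child, parent):
    """Return rsIDs where the child shares no allele with the parent.

    A child inherits one allele from each parent, so at an autosomal SNP
    the two genotypes should have at least one letter in common.
    No-calls and SNPs missing from either person are skipped.
    """
    discordant = []
    for rsid, child_gt in child.items():
        parent_gt = parent.get(rsid)
        if parent_gt is None:
            continue  # SNP not tested for the parent
        if not (is_called(child_gt) and is_called(parent_gt)):
            continue  # skip no-calls such as "--"
        if not set(child_gt) & set(parent_gt):
            discordant.append(rsid)
    return discordant


# Made-up sample data for illustration only.
child = {"rs1": "AA", "rs2": "CT", "rs3": "GG", "rs4": "--", "rs5": "AC"}
parent = {"rs1": "AG", "rs2": "CC", "rs3": "AA", "rs4": "AA"}
print(find_discordant_snps(child, parent))
```

A small number of discordant SNPs is normally expected from genotyping error, so a result like this is best read as a count relative to the total number of SNPs compared.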
Non-genealogical web tools
Most of these tools are health related.
- Codegen A free comprehensive health report using 23andMe raw data.
- GeneKnot A site which allows the user to upload genome data and compare DNA with other people with similar disease risks.
- Genetic Homeland Marker Index A free resource from Brad Larkin which provides an index of markers and their RSID positions on all 23 chromosomes.
- HIrisPlex-S Eye, Hair and Skin Colour DNA Phenotyping Webtool
- Impute.me A not-for-profit service run by Danish geneticists. Provides imputation combined with extensive trait analysis based on polygenic risk scores for all common diseases. The site also provides an ethnicity calculator, height and hair colour predictors and a UK Biobank calculator.
- Infinome A citizen science project that features a comprehensive interpretation engine for 23andMe data.
- NCBI Genome remapping service
- OpenSNP OpenSNP allows customers of direct-to-consumer genetic tests to publish their test results, find others with similar genetic variations, learn more about their results, find the latest primary literature on their variations and help scientists to find new associations.
- Oxford Statistics Phasing Server A free utility to phase whole genomes based on VCF files.
- Promethease Accepts data from any genetic genealogy company and will generate health and trait reports based on current literature.
Desktop tools
- Borland Genetics Desktop Toolkit Kevin Borland's free toolkit for reconstructing GEDmatch-compatible synthetic DNA kits for deceased ancestors using raw DNA files of living relatives.
- DNA Kit Studio A tool to convert and analyze raw DNA data from several DTC DNA companies. For further details see the blog post Convert 23andme V5 RAW to Gedmatch classic and other companies valid format.
- dnamatch-tools. Python scripts "for working with various raw DNA files for genetic genealogy".
- Extracting a 23andMe V3 file from a whole genome BAM file Thomas Krahn has developed a script to generate a 23andMe file from a whole genome sequence to allow the user to upload the file to GEDmatch and other third-party tools.
- Golden Helix Genome Browser For details see Analyze your 23andMe genotype files with Golden Helix by Gabe Rudy, @gabeinformatics blog, 22 July 2015.
- Reich Lab software A range of tools is available from the Reich Lab. These programmes are likely to be of interest to advanced users.
Comparison by Raw DNA Data Sources
DNA testing services provide raw DNA data in different file formats.
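To show why the file format matters to a tool, here is a minimal sketch of reading two common layouts into one structure. It assumes a 23andMe-style file has four tab-separated columns (rsID, chromosome, position, genotype) and an AncestryDNA-style file has five (rsID, chromosome, position, allele1, allele2), with comment lines starting with `#`. These layouts are assumptions for illustration, as are the function name and sample data, and real files may differ between chip versions.

```python
def parse_raw_dna(text):
    """Parse raw data text into {rsid: (chromosome, position, genotype)}.

    Assumed layouts (tab-separated, '#' marks comment lines):
      4 columns: rsid, chromosome, position, genotype
      5 columns: rsid, chromosome, position, allele1, allele2
    """
    snps = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split("\t")
        if fields[0].lower() == "rsid":
            continue  # uncommented column header row
        if len(fields) == 4:
            rsid, chrom, pos, genotype = fields
        elif len(fields) == 5:
            rsid, chrom, pos, allele1, allele2 = fields
            genotype = allele1 + allele2  # join the two allele columns
        else:
            continue  # unrecognised row, ignored in this sketch
        snps[rsid] = (chrom, int(pos), genotype)
    return snps


# Made-up sample rows for illustration only.
four_column = "# rsid\tchromosome\tposition\tgenotype\nrs123\t1\t1000\tAG\n"
five_column = "rsid\tchromosome\tposition\tallele1\tallele2\nrs123\t1\t1000\tA\tG\n"
print(parse_raw_dna(four_column) == parse_raw_dna(five_column))
```

Normalising every source to the same structure is what lets a single tool accept uploads from several testing services.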
|DNA Testing Service|
|Raw DNA data tool (compatible with...)||23andMe||Family Tree DNA Family Finder||AncestryDNA||National Geographic Geno 2.0||MyHeritage||Living DNA||Genes for Good||ToTheLetter DNA|