Listing Criteria for SNP Inclusion
into the ISOGG Y-DNA Haplogroup Tree - 2014

The entire work is identified by the Version Number and date given on the Main Page.   Directions for citing the document are given at the bottom of the Main Page.
Last revision date for this specific page: 30 April 2014

Introduction

These recommendations are to assure that there is a uniform set of criteria for accepting new mutations for inclusion on the ISOGG Y-DNA haplogroup tree.

Because of the abundance of alternatives now available, only single nucleotide polymorphisms (SNPs) are being accepted as new additions; insertions and deletions (indels) are not. In exceptional cases other variants may be considered for inclusion on a case-by-case basis if they can be clearly demonstrated to have properties equivalent to SNPs, but the burden of proof required will be much higher and acceptance will be at the discretion of the committee.

Special Coding for Interpreting SNP status

This coding may need to be modified when the tree moves from .html coding to a database.

General Requirements for SNP Validation

The requirements listed in this General Requirements section apply to validating the SNPs discussed in the Requirements for Specific Type of Testing section below.

  1. Inserting a SNP That Creates a Non-Terminal Branch in the ISOGG Tree
    The supporting information provided by the proposer should demonstrate that the new SNP is downstream of an established tree mutation. It must also show that the SNP was tested in individuals from all parallel subgroups on the tree. Where a relevant existing subgroup comes from a rare population and rests solely on old research listing only one sample that proves the existence of its SNP, an exception to testing that subgroup may be granted. The mutations of the existing subgroup will then be listed temporarily as position undetermined.

       Example: Suppose that a new subgroup is being added with name of Q18.
          Fictional example:
             G-L140
                G-L13
                G-L1266
                G-Q18
                   G-L1268

    Then the evidence for Q18 must show that a man is derived for both Q18 and L140. Simultaneously one man each from L1266 and L13 must be ancestral for Q18. In addition, one man derived for Q18 must be derived for L1268, and a second Q18 man ancestral for L1268. Derived means the mutation is present; ancestral means it is absent.
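
The evidence rule in the fictional Q18 example can be sketched as a small check. This is only an illustration, not a committee tool: the function and field names are hypothetical, each tested man is assumed to be recorded as a mapping from SNP name to "derived" or "ancestral", and a man is assumed to belong to a parallel subgroup when he is derived for that subgroup's SNP.

```python
from typing import Dict, List

# Hypothetical record format: SNP name -> "derived" or "ancestral".
Calls = Dict[str, str]


def derived(man: Calls, snp: str) -> bool:
    return man.get(snp) == "derived"


def ancestral(man: Calls, snp: str) -> bool:
    return man.get(snp) == "ancestral"


def check_non_terminal(men: List[Calls], new_snp: str, upstream: str,
                       parallel: List[str], downstream: List[str]) -> List[str]:
    """Return a list of missing pieces of evidence (empty if none are missing)."""
    problems = []
    # One man must be derived for both the new SNP and the upstream SNP.
    if not any(derived(m, new_snp) and derived(m, upstream) for m in men):
        problems.append(f"no man derived for both {new_snp} and {upstream}")
    # One man from each parallel subgroup must be ancestral for the new SNP.
    for p in parallel:
        if not any(derived(m, p) and ancestral(m, new_snp) for m in men):
            problems.append(f"no {p} man ancestral for {new_snp}")
    # For each downstream subgroup, one carrier of the new SNP must be derived
    # for it and another carrier must be ancestral for it.
    for d in downstream:
        if not any(derived(m, new_snp) and derived(m, d) for m in men):
            problems.append(f"no {new_snp} man derived for {d}")
        if not any(derived(m, new_snp) and ancestral(m, d) for m in men):
            problems.append(f"no {new_snp} man ancestral for {d}")
    return problems


men = [
    {"L140": "derived", "Q18": "derived", "L1268": "derived"},
    {"Q18": "derived", "L1268": "ancestral"},
    {"L13": "derived", "Q18": "ancestral"},
    {"L1266": "derived", "Q18": "ancestral"},
]
print(check_non_terminal(men, "Q18", "L140", ["L13", "L1266"], ["L1268"]))  # prints []
```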
     
  2. Adding a SNP Representing a New Terminal Branch to the ISOGG Tree
    Where the new SNP defines a new terminal branch of an existing branch, the evidence must show that:
    1. at least one individual who has the new SNP also has a SNP defining the immediate upstream subgroup.
    2. at least one individual from any subgroup parallel to the new subgroup lacks the submitted SNP.

       Example: Suppose that a new subgroup is being added with name QQ12.
          Fictional example:
             G-L5432
                G-P343
                G-QQ12

    Then the evidence for QQ12 must show that two men are derived for QQ12. Simultaneously one man from P343 must be ancestral for QQ12. Also, one of the QQ12 men must be derived for L5432.
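
A comparable sketch for the terminal-branch case follows. The names are again hypothetical and the same record format (SNP name mapped to "derived" or "ancestral") is assumed. The two-carrier condition is taken from the QQ12 example, and "any parallel subgroup" is read here as "at least one of the parallel subgroups", which is an interpretation rather than something the criteria spell out.

```python
from typing import Dict, List

# Hypothetical record format: SNP name -> "derived" or "ancestral".
Calls = Dict[str, str]


def check_terminal(men: List[Calls], new_snp: str, upstream: str,
                   parallel: List[str]) -> Dict[str, bool]:
    """Report which terminal-branch conditions the submitted men satisfy."""
    carriers = [m for m in men if m.get(new_snp) == "derived"]
    return {
        # Two men must be derived for the new SNP.
        "two_derived_men": len(carriers) >= 2,
        # One of those men must be derived for the upstream SNP.
        "carrier_derived_for_upstream": any(
            m.get(upstream) == "derived" for m in carriers
        ),
        # One man from a parallel subgroup must be ancestral for the new SNP.
        "parallel_man_ancestral": any(
            m.get(new_snp) == "ancestral"
            and any(m.get(p) == "derived" for p in parallel)
            for m in men
        ),
    }


tested = [
    {"QQ12": "derived", "L5432": "derived"},
    {"QQ12": "derived"},
    {"P343": "derived", "QQ12": "ancestral"},
]
result = check_terminal(tested, "QQ12", "L5432", ["P343"])
print(all(result.values()))  # prints True
```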

Requirements for Specific Type of Testing

Reference giving details about Y-DNA SNP testing companies:
   Y-DNA SNP testing chart
   YSEQ
Reference giving details about Y-DNA STR testing companies:
   Y-DNA STR testing chart
  1. Sanger Sequencing
    Examples of Sanger sequencing are the tests at the company YSEQ and the Advanced Tests (SNP) at Family Tree DNA. STR testing is available, for instance, at Genebase and Family Tree DNA. Acceptable testing for this category consists of Sanger sequencing.

    The objective of the ISOGG Tree at this time is to include all SNPs that arose prior to about the year 1500 C.E. This guideline may be measured through STR diversity or alternative evidence.

    Where a new terminal subgroup is being added, STR marker results or other evidence described below for two men with the new SNP are needed.

    STR Diversity
    To be accepted the SNP must be observed in at least two individuals and must meet the STR diversity requirement. A SNP that does not meet this requirement will be classified as a Private SNP (see definition above).

    The STR diversity requirement is met if the following conditions are satisfied:
    1. If the SNP is a Non-Terminal Branch SNP, no further proof of diversity is required.
    2. Genetic distance is calculated using the Infinite Alleles Model (IAM), which most laboratories use. A marker for which there is a null value in one sample must be discarded from the calculations.
    3. All markers tested by both individuals must be compared.
    4. If 74 markers (or fewer) are compared, the minimum genetic distance to meet the diversity requirement is 5.
    5. If 75 (or more) markers are compared, the diversity requirement is a minimum of 7%, computed by dividing the genetic distance by the number of markers compared, expressing the result as a percentage, and rounding to the nearest integer value.
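
The diversity conditions for a terminal-branch SNP can be sketched in code as follows. This is a minimal illustration under stated assumptions: the function names and the haplotype format (marker name mapped to an allele value, with None for a null value) are hypothetical, each dictionary key is counted as one marker, and halves are rounded up when rounding the percentage, which the criteria do not specify. Under the IAM, any difference at a marker counts as a distance of 1 regardless of its size.

```python
from typing import Dict, Optional, Tuple

# Hypothetical format: marker name -> allele value, or None for a null value.
Haplotype = Dict[str, Optional[str]]


def iam_distance(a: Haplotype, b: Haplotype) -> Tuple[int, int]:
    """Return (genetic distance, markers compared) under the Infinite Alleles Model."""
    # Compare every marker tested in both men, discarding null values.
    shared = [m for m in a if m in b and a[m] is not None and b[m] is not None]
    # Each differing marker counts once, whatever the size of the difference.
    distance = sum(1 for m in shared if a[m] != b[m])
    return distance, len(shared)


def meets_str_diversity(a: Haplotype, b: Haplotype) -> bool:
    distance, compared = iam_distance(a, b)
    if compared == 0:
        return False
    if compared <= 74:
        return distance >= 5
    # Percentage rounded to the nearest integer (halves rounded up, an assumption),
    # done in integer arithmetic to avoid floating-point surprises.
    percent = (200 * distance + compared) // (2 * compared)
    return percent >= 7


# Two made-up 111-marker haplotypes that differ at 8 markers: 8 / 111 is about 7.2%.
man_a = {f"marker{i}": "12" for i in range(111)}
man_b = dict(man_a)
for i in range(8):
    man_b[f"marker{i}"] = "13"
print(meets_str_diversity(man_a, man_b))  # prints True
```

With 75 markers compared, a distance of 5 gives 6.67%, which rounds to 7%, so the percentage rule continues the fixed threshold of 5 used at 74 markers or fewer.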

    Alternative Evidence
    If the submitter can otherwise provide evidence that the common ancestor of the two samples can be reasonably expected to have lived more than 500 years ago, this will also be considered.
     
  2. Next Generation Sequencing
    Next generation sequencing is available for the genealogical community at Full Genomes Corp. or as the Big Y test at Family Tree DNA.

    1. The committee recognizes that sequencing information is available in a wide variety of ways. Because of this, no specific criteria for sequencing information are provided here. The goal of the reviewers of sequencing submissions will be, at one extreme, to readily accept quality SNPs from old, root branches found in many samples within all the downstream branches. At the opposite extreme, it is unlikely that reviewers will accept SNPs near or in terminal branches whose positions depend on the results from one sample.
    2. The submitter must provide the raw data report(s) pertaining to the sequencing. Two examples of raw data reports are (a) a VCF file showing the usual quality scores, DP scores for depth of reads, etc. for the sample involved and for pertinent additional samples, including ones from other haplogroups, or (b) the so-called “haplogroup compare report” from Full Genomes Corp. Results from Sanger sequencing or from microarray products, such as Geno 2.0 or Chromo 2.0, might be acceptable comparative information in certain cases. Having a large number of pertinent comparative samples in a VCF report can improve the scoring information.
    3. The reviewer will have to take into consideration the coverage of the next generation sequencing, varied quality scorings, position of the site on the chromosome, the percentage of samples with clean reads at the site in question, possible indel relationships to the SNP, geographical separation of the samples, non-next generation sequencing testing, results for the SNP site in other reports, and other factors in making a complex judgment as to whether the submitted SNP is almost certain to show the same results in next generation sequencing of new comparable samples.
    4. More precise criteria for next generation sequencing submissions may be provided as evidence accumulates.
    5. When a new SNP creating a new terminal branch is being added to the tree, at least two of the submitted samples must each have an average of 3 unique (singleton) SNPs per 10 million base pairs of sequencing coverage. Reviewers will determine uniqueness according to comparisons to all available sequencing results rather than samples tested at a particular laboratory.
    6. If the evidence for the SNP is based solely on next generation sequencing, the SNP will appear in italics on the tree.
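
The singleton requirement in item 5 amounts to simple arithmetic, sketched below. The function names are hypothetical, the singleton counts are assumed to have already been determined by the reviewers, and "an average of 3" is read here as "at least 3", which is an interpretation.

```python
from typing import List, Tuple


def meets_singleton_rate(singletons: int, covered_bp: int) -> bool:
    """True if the sample has at least 3 singleton SNPs per 10 million covered base pairs."""
    if covered_bp <= 0:
        raise ValueError("covered_bp must be positive")
    # Same as singletons / covered_bp * 10_000_000 >= 3, kept in integers.
    return singletons * 10_000_000 >= 3 * covered_bp


def enough_qualifying_samples(samples: List[Tuple[int, int]]) -> bool:
    """samples holds (singleton count, covered base pairs) for each submitted sample."""
    return sum(meets_singleton_rate(s, bp) for s, bp in samples) >= 2


# Made-up figures: two samples pass, one does not.
submitted = [(4, 11_000_000), (5, 14_000_000), (2, 8_000_000)]
print(enough_qualifying_samples(submitted))  # prints True
```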

  3. Microarray Chip-based Genotyping
    Examples of microarray chip-based genotyping are Geno 2.0, 23andMe, Chromo 2.0.

    1. Novel SNPs found in microarray products without a presence also in other qualifying sources - such as Sanger sequencing or next generation sequencing - cannot be submitted. However, chip-based genotyping results can be used in combination with Sanger sequencing and/or next generation sequencing results as validating evidence for one of the samples. If chip-based genotyping is part of the evidence, the approved SNP will be listed in regular type, rather than italics, even if the other evidence is from next generation sequencing.
    2. Samples from chip-based genotyping used to prove a new terminal branch must meet the criteria for STR diversity described in the Sanger sequencing section.
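
The typeface rules stated in the next generation sequencing and microarray sections can be summarized in a short sketch. The source labels and function name are hypothetical. Listing Sanger-validated SNPs in regular type is inferred from the text rather than stated in it.

```python
from typing import Set

SANGER, NGS, CHIP = "sanger", "ngs", "chip"


def tree_typeface(sources: Set[str]) -> str:
    """Return "italic" or "regular" for an approved SNP, given its evidence sources."""
    if not sources or not sources <= {SANGER, NGS, CHIP}:
        raise ValueError("expected a non-empty set of known evidence sources")
    if sources == {CHIP}:
        # Novel SNPs seen only in microarray products cannot be submitted.
        raise ValueError("chip-only evidence does not qualify")
    # Italics apply only when next generation sequencing is the sole evidence.
    return "italic" if sources == {NGS} else "regular"


print(tree_typeface({NGS}))        # prints italic
print(tree_typeface({NGS, CHIP}))  # prints regular
```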

Acceptance Process for Placing a SNP on the ISOGG Y-DNA Haplotree

The discoverer of the SNP (or a knowledgeable third party) can email the Contact Person listed on the appropriate haplogroup page and describe where the new SNP fits in the tree. The haplogroup experts will evaluate the evidence for inclusion on the tree. If the information on tree placement is insufficient, the SNP will be listed as investigational in the section under the tree. If the Contact Person is not available, contact Alice Fairhurst.

Corrections/Additions made since 1 January 2014:

Copyright 2014. International Society of Genetic Genealogy. All Rights Reserved.

   