Displaying Y-DNA results
From ISOGG Wiki
The display of Y-DNA results of all the project participants on a public web page is one of the most important responsibilities of a surname DNA project administrator. This may seem obvious, but the reasons are worth enunciating:
- It allows project participants to see the context of their own results
- It helps with the recruitment of new participants
- It helps educate all about the surname, its origins and its DNA characteristics
- It demonstrates the accountability of the project administrator.
1. Some projects, typically those founded pre-2005, do not display individual test results to the public. This may be because of undertakings given to participants in response to concerns over confidentiality, privacy etc. However, provided the ISOGG Project Administrator Guidelines are complied with, it is now generally recognised that public webpages are more likely to promote recruitment.
2. This page should be read in the context of the wider issues of the responsibilities of administrators of surname projects and of project web sites, and of presenting their project's mitochondrial DNA and autosomal DNA test results.
3. This page assumes all or most of the Y-DNA test results have been taken with Family Tree DNA. Some of the features discussed will be relevant to results of other testing companies.
Administrators have two main options:
(a) FTDNA public pages. Once a surname project is established under the aegis of FTDNA (or other testing company), all the Y-DNA STR test results of project participants are automatically displayed at http://www.familytreedna.com/public/[projectsurname]/
(b) Private website with project administrator's own spreadsheet table.
Note that with all these display options the project administrator remains responsible for choosing how they group their participants' test results – see here.
Pros and cons of above options
Option (a) has the advantage of always being "up to date". However, there is very limited flexibility in the display format, and non-FTDNA test results are excluded.
Option (b) has the advantage of almost limitless flexibility in format and presentation – examples are developed below. But option (b) requires special skills to set up and has other disadvantages:
- It is more demanding of the administrator’s time if the results are to be kept reasonably up-to-date
- The project "freezes" if the administrator becomes indisposed or succumbs to other pressures on his time; to guard against this an assistant administrator is recommended
- Transcribing the results by hand is liable to introduce copying errors
- The results have to be periodically uploaded to the relevant website after updating and amending.
While it is beneficial to have surname project pages on several websites (FTDNA, private website, ISOGG Wiki surname DNA Project pages, one-name study website, Clan Association website, social network websites etc.) that all help to publicise the project, it is neither practical nor desirable to maintain detailed test results in more than one such forum in addition to the FTDNA public pages.
FTDNA public pages
FTDNA’s public pages Y-DNA Results for each surname project, with guidance on their interpretation, appear at:
On these pages a project/study is termed a "group", and a "group" as discussed above is termed a "Subgroup". FTDNA do not enable further subdivisions to be made.
Some limited editing flexibility is available: At Project Administration/Member Subgrouping the project administrator can select an unlimited number of subgroups, edit subgroup names and descriptions and select a background colour, and assign participants to any subgroup as they wish. For each subgroup FTDNA automatically calculate minimum, maximum and mode values for each marker, and provide a colourised version to highlight mutations of individual participants that differ from the mode. However, within each subgroup individual participants will always appear sequenced in marker value order, as illustrated in Example 1, and the administrator cannot sequence/sort them into any other order:
At Project Administration/Public Results Display Settings the project administrator can select columns that display: (i) the member's last name; (ii) most distant ancestor; (iii) member's last name and most distant ancestor; (iv) None of these. The details shown in these columns are taken from the participants' personal pages and, with the exception of the most distant ancestor field, cannot be changed by the administrator. Additional columns cannot be introduced.
On the same page the administrator can switch on/hide their project statistics, maps, Y-DNA SNP results and mtDNA results for individual project members.
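The per-subgroup minimum, maximum and mode rows described above, and the highlighting of values that differ from the mode, can be reproduced in an administrator's own table. The following is a minimal sketch, not FTDNA's actual method: the kit numbers and marker values are invented, the function names are hypothetical, and ties for the mode are broken arbitrarily by taking the smallest value.

```python
from statistics import multimode

# Invented example data: four markers and four hypothetical kits in one subgroup.
MARKERS = ["DYS393", "DYS390", "DYS19", "DYS391"]
SUBGROUP = {
    "K001": [13, 24, 14, 11],
    "K002": [13, 24, 14, 10],
    "K003": [13, 25, 14, 11],
    "K004": [13, 24, 15, 11],
}


def marker_summary(haplotypes):
    """Return (minimum, maximum, mode) lists with one entry per marker."""
    columns = list(zip(*haplotypes.values()))
    minimum = [min(col) for col in columns]
    maximum = [max(col) for col in columns]
    # multimode returns every most-common value; taking the smallest
    # is an arbitrary tie-break assumed for this sketch.
    mode = [min(multimode(col)) for col in columns]
    return minimum, maximum, mode


def differences_from_mode(haplotypes, mode, markers=MARKERS):
    """Map each kit to {marker name: value - mode} for markers that differ."""
    return {
        kit: {
            markers[i]: value - modal
            for i, (value, modal) in enumerate(zip(values, mode))
            if value != modal
        }
        for kit, values in haplotypes.items()
    }


minimum, maximum, mode = marker_summary(SUBGROUP)
differences = differences_from_mode(SUBGROUP, mode)
for kit, diff in differences.items():
    print(kit, diff or "matches the mode")
```

The signed differences could then drive a colour scale in a spreadsheet, with one shade for values above the mode and another for values below it.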
Private project website
The benefits of a private project website can be considerable, but many project administrators will require assistance with the initial set-up and design.
Some of the things you cannot change on the FTDNA display may be changed after downloading the results as an XML spreadsheet, which can be saved as XLS. You can then insert columns of your own, containing whatever you wish, and sort on those columns.
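For administrators who prefer a script to manual editing, adding a column to downloaded results and sorting on it might look like the sketch below. It assumes the download has been re-saved as CSV text; the column names, kit numbers and joining dates are all invented, and a real export will be laid out differently.

```python
import csv
import io

# Invented stand-in for a downloaded results sheet re-saved as CSV.
EXPORT = """Kit Number,Name,DYS393,DYS390
K003,Surname C,13,25
K001,Surname A,13,24
K002,Surname B,13,24
"""

# The administrator's own extra data, keyed by kit number (invented values).
DATE_JOINED = {"K001": "2006-03-01", "K002": "2005-11-15", "K003": "2007-01-20"}


def add_column_and_sort(text, extra, column_name):
    """Add one column taken from `extra` and sort the rows on it."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row[column_name] = extra.get(row["Kit Number"], "")
    # ISO-style dates (YYYY-MM-DD) sort correctly as plain strings.
    rows.sort(key=lambda row: row[column_name])
    return rows


def to_csv(rows):
    """Write the rows back out as CSV text, keeping the column order."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


sorted_rows = add_column_and_sort(EXPORT, DATE_JOINED, "Date Joined")
print(to_csv(sorted_rows))
```

Working from the downloaded file rather than retyping values also reduces the risk of copying errors.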
The various pages in a private project website that its administrator may use to fulfil his responsibilities are discussed at ISOGG Project Administrator Guidelines and Promoting your DNA project. Here we are only concerned with the Y-DNA Results page.
The core of this page is an Excel spreadsheet. For a large project and/or ambitious spreadsheet Excel 2000 or newer is necessary. Imaginative use of fonts, colours etc. is desirable, but no knowledge is needed of Excel’s mathematical tools.
As many columns may be used as needed to record a wide range of data, labelled and sequenced as the administrator prefers. Columns for the following data should be considered:
- Participant data: Kit no.; Ysearch ID; project ID; code for country of residence
- Earliest confirmed patrilineal ancestor: surname, first names; birth place and date, death place and date; reputed ancestral origin
- Test analysis overview: No. of markers tested (a contrasting font may be used for tests being upgraded); genetic distances from project/genetic family modal haplotype; TiP Score from project/genetic family modal haplotype
- Test results: Haplogroup/SNP: predicted haplogroup in red; terminal SNP (if tested) in green. Haplotype: 1-12 marker panel; 13-37 marker panel; 38-67 marker panel; 68-111 marker panel. NB DYS names should be written vertically to keep table compact.
Keeping confidential data on the same spreadsheet is very convenient for the administrator, but great care must be taken to ensure this data is not published by mistake.
- Participant data: Full name; e-mail address; date of joining. Brief notes on special features. Dates of emails.
- Additional data, e.g. non-FTDNA markers, GD/TiP Score analyses.
- Table headers: Project title and date of latest amendment. Column titles.
- Grouping headers: Group title; contrasting background colours may usefully be used. Group minimum and maximum values. Group modal values.
- Individual lines for each participant: data should be transferred by "cut and paste" to minimise errors. Use scaled colours to show differences from mode/modal participant.
e.g. dates of joining project and of expected completion of test; participants removed from project; experimental groupings etc.
3. Ordering of participants within each grouping
This may be by:
- (a) Ascending sort order of marker values (the only ordering possible with FTDNA public pages), as illustrated in Example 1 above;
- (b) Date of joining project, or Kit Number
- (c) Resolution, i.e. 111-marker test results first, 12-marker test results last (if accepted at all)
- (d) Surname spelling
- (e) Proximity to modal values (see matching and grouping in surname DNA projects), as measured by genetic distance for the relevant test resolutions, and sequenced with the smallest genetic distances first and the largest last, as illustrated in Example 2.
- (f) Proximity to the modal participant (see matching and grouping in surname DNA projects), as measured and sequenced by TiP Scores, irrespective of test resolution, as illustrated in Example 3.
Note that with both (e) and (f) it will occasionally be necessary to recalculate the modal values or reassign the modal participant, as the number of members of the genetic family grows and test results are upgraded to higher resolutions.
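Ordering (e) can be sketched in a few lines. The example below is a simplification under stated assumptions: genetic distance is counted as the sum of absolute differences at each marker (simple stepwise counting), each kit is compared only over the markers it has tested, and multi-copy markers and null values are ignored. Testing companies may count some markers differently, so the figures will not necessarily match theirs. The kits and values are invented and the function names are hypothetical.

```python
from statistics import multimode

# Invented haplotypes; K004 has been tested at a lower resolution.
HAPLOTYPES = {
    "K001": [13, 24, 14, 11, 11, 14],
    "K002": [13, 24, 14, 10, 11, 14],
    "K003": [13, 25, 15, 11, 11, 14],
    "K004": [13, 24, 14, 11],
}


def modal_haplotype(haplotypes):
    """Most common value at each marker, among kits tested at that marker."""
    length = max(len(values) for values in haplotypes.values())
    modal = []
    for i in range(length):
        tested = [values[i] for values in haplotypes.values() if len(values) > i]
        # Ties are broken by taking the smallest value (arbitrary choice).
        modal.append(min(multimode(tested)))
    return modal


def genetic_distance(haplotype, modal):
    """Sum of absolute differences over the markers this kit has tested."""
    return sum(abs(value - m) for value, m in zip(haplotype, modal))


def order_by_proximity(haplotypes):
    """Kits sorted by distance from the modal haplotype, smallest first."""
    modal = modal_haplotype(haplotypes)
    return sorted(
        haplotypes,
        key=lambda kit: (genetic_distance(haplotypes[kit], modal), kit),
    )


modal = modal_haplotype(HAPLOTYPES)
ordering = order_by_proximity(HAPLOTYPES)
print(modal)
print(ordering)
```

Because the modal haplotype is recomputed from the current members on every run, re-running the script after new results arrive covers the recalculation mentioned in the note above.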
Examples of other possible features of Y-DNA result tables on private project websites may be found at www.dnastudy.clanirwin.org.
It should be noted that with private websites:
- Their creation requires skill sets different from those of genetic genealogy
- Their maintenance requires an additional time commitment from the administrator
- A major change to a website format involves further commitment
It follows that care is needed to "get it right first time", and that some administrators present their project's results in a form that they would like to improve if they had the necessary time/ability.
Other forms of presentation on private websites
In addition to the main Y-DNA Results table, however displayed, project administrators may use various supplementary data and analyses, including:
Diagrams that illustrate DNA data graphically are known as cladograms. Cladograms relevant to surname projects include:
- (a) A project-specific phylogenetic tree, adapted from the ISOGG Y tree, illustrating the pre-surname era connections between the different genetic families of the surname; this will develop as more SNPs are tested by project participants
- (b) A manually constructed mutation history tree, illustrating the perceived understanding of sub-groupings within specific genetic families
- (c) Cladograms using computer-generated network analysis programs such as Fluxus, PHYLIP, etc.
2. Genealogical data
One of the roles of a surname project administrator is the collection of genealogical data to complement the Y-STR test results. Where this is taken beyond the simple listing of details of the earliest known paternal ancestor and an indication of whether a GEDCOM has been tabled, a separate database of the male ancestors of each project participant should be maintained and displayed. To keep this manageable as the project grows in size, only details of the male ancestral line need be recorded, and for confidentiality reasons details of living representatives should be excluded. For a good example see http://www.phillipsdnaproject.com/ylineage/ylineage-main.
3. Statistical data
Further forms of presentation that may usefully be incorporated in periodic updates supporting the main project results include:
- Statistics/histograms of project growth (FTDNA "Project Joins");
- Statistics/histogram of spellings of surname
- Statistics on project penetration
- Statistics/mapping of participants' place of residence (FTDNA "Member Distribution Map")
- Statistics/histogram of earliest patrilineal ancestors' origin (FTDNA "Country of Origin Charts")
- Statistics/histogram of earliest patrilineal ancestors' date of birth
- Statistics/histogram of resolution of markers tested
- Statistics/summary of genetic family size, SNPs, etc.
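Several of the statistics listed above are simple counts that can be produced from the results table. As one illustration, the sketch below builds a text histogram of the resolution of markers tested; the kit numbers and resolutions are invented and the function names are hypothetical.

```python
from collections import Counter

# Invented data: number of markers tested for each hypothetical kit.
MARKERS_TESTED = {
    "K001": 111,
    "K002": 37,
    "K003": 37,
    "K004": 67,
    "K005": 12,
    "K006": 37,
}


def resolution_histogram(markers_tested):
    """Return (resolution, number of kits) pairs, highest resolution first."""
    counts = Counter(markers_tested.values())
    return [(resolution, counts[resolution]) for resolution in sorted(counts, reverse=True)]


def render(histogram):
    """Format the histogram as one text bar per resolution."""
    return "\n".join(
        f"{resolution:>3} markers | {'#' * count} {count}"
        for resolution, count in histogram
    )


histogram = resolution_histogram(MARKERS_TESTED)
print(render(histogram))
```

The same counting approach applies to the other histograms in the list, for example surname spellings or ancestors' countries of origin.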