Genetic genealogy Q&A for beginners
From ISOGG Wiki
History of DNA testing
When did DNA testing come into use for genealogical purposes?
In the late 1990s, there were several highly publicized cases (for example, the "Cheddar Man", Thomas Jefferson and Sally Hemings, and the family of the last Czar of Russia) in which DNA was used to prove or disprove relationships to people who had long been deceased. The media coverage of these and other cases helped to bring DNA testing for genealogical applications to the commercial market in the year 2000.
About genealogical DNA testing in general
What kinds of tests are there, and what can they tell me?
There are three types of genealogical DNA: the Y-chromosome (yDNA), mitochondrial DNA (mtDNA), and autosomal DNA (atDNA). Each is inherited in a different way, and each can help with a different aspect of your genealogical research. yDNA is only found in men; in fact, it's what makes a man male. It is inherited from father to son and can be used to track the male-to-male line. A woman cannot do this test, but she can have a paternal relative (like a father, brother, or uncle) do it. mtDNA is only inherited from the mother, but both sexes have it. It tracks the direct maternal line. Finally, atDNA is inherited from both parents and can be used to find connections on both sides. However, it is harder to interpret than yDNA and mtDNA.
How does it work?
Geneticists have identified spots in our DNA that change slowly, such that people who are more closely related are likely to have the same DNA signature, while others will probably have a different signature. The tested regions of the Y-chromosome and mitochondrial DNA are passed down intact from generation to generation along gender lines for tens of generations. In contrast, autosomal DNA recombines (mixes) each generation before being passed on. A child receives half of his or her DNA from each parent, but each parent's DNA is a mixture from the two grandparents on that side; any given marker comes from only one maternal grandparent, for example, and the other maternal grandparent's marker is lost. The same thing happens on the paternal side. As a result, although autosomal DNA can give you information about all sides of your family (not just the direct gender lines), it is most effective for only 4-5 generations back in time.
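The halving described above can be put into numbers. The following is a minimal sketch, assuming an idealized average in which each ancestor n generations back contributes 1/2^n of your autosomal DNA; the function name is made up for illustration, and real amounts vary from person to person because recombination is random.

```python
def expected_autosomal_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA inherited from ONE ancestor
    who lived the given number of generations ago (1 = a parent)."""
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    return 0.5 ** generations_back


# Assumption: a simple average; any one person's actual share may differ.
for g in range(1, 8):
    ancestors = 2 ** g  # number of ancestor slots in that generation
    share = expected_autosomal_share(g)
    print(f"{g} generation(s) back: {ancestors} ancestors, about {share:.2%} from each")
```

The share from any single ancestor shrinks quickly, which is one way to see why autosomal matching becomes less reliable beyond a handful of generations.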
What will a DNA test tell me?
If you share DNA signatures with other people (matches) in the database of the testing company, you will see a list of their names (or aliases). Depending upon the company you test with, other information may be made available to you, like your ethnic origin or whether you have a Cohanim gene or have Native American ancestry.
Is the DNA test a blood test?
No. Most commercial DNA testing companies use a kit that contains a swab or a spit kit. Just follow the directions enclosed with the kit to swab the inside of your mouth or spit into the container and mail the completed forms and kit back to the company. There are two companies with unique DNA collection methods. One uses a special mouthwash and the other uses chewing gum.
If I had a blood transfusion in the past, would that alter my DNA?
No. However, if the transfusion was recent, it is suggested that you wait at least a month before testing.
Will chemotherapy or radiation affect my DNA results?
Cancer treatments do not seem to affect DNA results. However, it's recommended that you wait a few months after treatment before collecting a sample for testing.
Can my DNA sample be used to clone me?
How can I preserve my DNA for future testing options?
Many genetic genealogy DNA companies offer long-term storage of your samples. Another option is "DNA banking/archiving" where donors donate samples accompanied with directives. (Source)
Does the Y-chromosome recombine (mix) with the X-chromosome?
The Y-chromosome does recombine at the tip, but the Y-DNA used for testing is from the non-recombining part.
What if I don't have any matches in the database?
Sometimes you just need to be patient. As more and more people discover the benefits of genealogical genetic testing, the databases will continue to grow.
What do the different markers mean?
The first 12 markers are considered "deep ancestry" markers and may show matches with people of different surnames. These matches generally indicate that you shared a common ancestor hundreds of years ago, before surnames came into use.
Do I have to exhume my ancestor to get his DNA?
No. Besides being expensive and full of legal and/or societal issues, it's not necessary. Because the Y-chromosome is passed down through the male line from generation to generation relatively unchanged, a close match with another male will lead to identifying a common ancestor. You just need to have a male in your family with the surname you want tested to take the test.
I have a genealogical "brick-wall", how do I identify a common ancestor?
The more people that test and match in a surname project, the more data there is available. Mutations excluded, the ancestral haplotype will begin to be revealed. Hopefully, there will be others in the project that may have the paper trail genealogy you need to get you past your "brick-wall".
Which test should I take?
That depends upon your goals and your finances. If your goal is to obtain the closest genealogical match possible, especially in the case of a "brick-wall" ancestral line, then begin with the test with the most markers available. If your finances won't allow for a more expensive test, then begin with the most markers that you're able to afford, and upgrade later.
About Alleles, Markers and Mutations
Understanding how the testers arrive at the number of an allele and understanding why some are low and others are high would be helpful.
The person who discovered a marker defines how numerical values are assigned to it. The numerical value represents the number of times a DNA code repeats itself at that marker. For example, if the repeating code is CAT, and we see CATCATCATCAT at that marker, then the result is 4. The number of repeats can increase or decrease whenever a mutation either makes an extra copy of the code (CATCATCATCATCAT) or drops a copy (CATCATCAT). As for why some markers have a lower number of repeats and others a higher number, much more is unknown than known.
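The counting rule in the CAT example can be sketched in a few lines of Python. This is only an illustration on text strings, using the sequences from the answer above; the function name is made up, and it is not how a laboratory actually measures a marker.

```python
def count_repeats(sequence: str, motif: str) -> int:
    """Count how many times `motif` repeats back-to-back,
    starting at the beginning of `sequence`."""
    if not motif:
        raise ValueError("motif must not be empty")
    count = 0
    position = 0
    while sequence.startswith(motif, position):
        count += 1
        position += len(motif)
    return count


original = "CATCATCATCAT"
gained_a_copy = "CATCATCATCATCAT"  # a mutation made an extra copy
lost_a_copy = "CATCATCAT"          # a mutation dropped a copy

print(count_repeats(original, "CAT"))       # 4
print(count_repeats(gained_a_copy, "CAT"))  # 5
print(count_repeats(lost_a_copy, "CAT"))    # 3
```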
What is the difference between markers and genes?
Markers are essentially junk DNA filling space between the actual genes.
What is the significance of the genetic difference between slow-moving marker mutations and fast-moving marker mutations?
The difference is how often we expect this type of mutation to take place. Since a fast-moving marker is more likely to mutate, we can expect it to have happened more recently. Slow-moving markers are much less likely to mutate, so if it does mutate we have to allow more generations for it to have happened. Of course, any mutation can take place in any generation, but some are more likely than others. If the difference between person A and person B is a mutation on a fast-moving marker, and the difference between person C and person D is a mutation on a slow-moving marker, then our predictions of when the common ancestor would have lived would indicate a more recent common ancestor for A and B than for C and D.
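A rough calculation shows why the marker's speed matters. The sketch below assumes a very simple model: each father-to-son transmission has the same small chance of a mutation, and two men separated from their common ancestor by g generations each are separated by 2g transmissions in total. The two rates are made-up round numbers chosen only to contrast "fast" and "slow"; they are not measured rates for any real marker, and the model ignores the chance of a mutation reverting.

```python
def chance_of_at_least_one_mutation(rate_per_generation: float,
                                    generations_to_ancestor: int) -> float:
    """Chance that a marker mutated at least once on the two lines
    leading down from a common ancestor to two living men."""
    transmissions = 2 * generations_to_ancestor  # one line per man
    return 1.0 - (1.0 - rate_per_generation) ** transmissions


# Hypothetical rates for illustration only (not real marker data).
FAST_RATE = 0.01
SLOW_RATE = 0.0005

for generations in (4, 8, 16, 32):
    fast = chance_of_at_least_one_mutation(FAST_RATE, generations)
    slow = chance_of_at_least_one_mutation(SLOW_RATE, generations)
    print(f"{generations:>2} generations: fast marker {fast:.1%}, slow marker {slow:.1%}")
```

Under these assumptions a difference on the fast marker is plausible even with a fairly recent common ancestor, while a difference on the slow marker stays unlikely until many more generations have passed.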
Is there a test now, or in the foreseeable future that is the only (or final) Y-DNA test that a male will ever need to take?
The Y chromosome has about 60 million bases, and only about 23 million have been sequenced, compared with mtDNA, which has all 16,569 bases sequenced. The Y Chromosome Consortium has a table listing all published markers and primer information. Which additional markers may benefit us for genealogical purposes remains to be seen.
Since less than half of the Y chromosome has been sequenced, does that mean we don't know much about how it compares to other parts of the DNA?
The remainder of the Y may never be sequenced. It's called heterochromatin, which is full of extremely repetitive junk DNA that seems to stay tightly coiled up all the time (so the DNA is never used to produce a protein). There's not much motivation to study it, because it's difficult and the payoff would be low.
I've read where a father and son can be one marker mismatched on the 12-marker test. Is that generally on one particular marker?
The mutations occur randomly and can be on any marker, although some markers are more likely to mutate than others.
Do mutations of certain markers have any significant meanings?
Some markers are more likely to mutate than others, and which marker mutates has an impact on calculations of how closely related two individuals may be.
How many mismatches could one expect between brothers, and also first cousins in the male line? 12 markers? 37 markers?
One expects brothers and first cousins to have exactly the same results out to 37 markers, although it is possible that a mutation will happen recently enough to cause two brothers or first cousins to have slightly different results.
How does a DNA lab tell where a 391 marker begins along the bases so they can count the repeats, and where it ends and the next marker begins? Is there some sort of a signpost that tells them that is DYS391 and not some other one?
Each marker is located in a specific place on the Y-DNA. When the marker is discovered and submitted, the researcher also provides a primer which is capable of locating the beginning of the marker. The technician can then count the number of repeats of the specific code beginning at that place.
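The "signpost" idea can be imitated with a text search. In the sketch below, the DNA string, the primer, and the repeat code are all invented for illustration and do not describe DYS391 or any real marker. Real laboratories do this step with chemistry, not with string matching.

```python
def read_marker(dna: str, primer: str, motif: str):
    """Find `primer` in `dna`, then count back-to-back copies of `motif`
    immediately after it.

    Returns (index where the repeats start, repeat count),
    or None if the primer is not found."""
    primer_at = dna.find(primer)
    if primer_at == -1:
        return None
    start = primer_at + len(primer)
    position = start
    count = 0
    while dna.startswith(motif, position):
        count += 1
        position += len(motif)
    return start, count


# Invented example: some leading bases, a primer, ten repeats, trailing bases.
dna = "GGTTACCA" + "ACGTTGCA" + "GATA" * 10 + "CCTGAAGT"
print(read_marker(dna, primer="ACGTTGCA", motif="GATA"))  # (16, 10)
```

Because the primer pins down where counting starts, the repeats of one marker are not confused with similar repeats elsewhere in the sequence.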
Is it possible to determine which X chromosome I got from my father?
Not without a lot of additional information. When your DNA is analyzed, you will have two results (alleles) for every marker, but it's not possible to tell which allele came from each parent. If you can analyze your father's DNA, that would answer the question immediately. If you don't have your father's DNA, but you have a lot of sisters, you might be able to figure out which allele came from your father. All of your sisters must have the same allele from your father, but your mother can pass on either of her two alleles, so there's a random effect: you might or might not match your sisters on the maternal version.
Mitochondrial DNA (mtDNA)
Regarding my mtDNA results, is the mutation from the Cambridge Reference Sequence (CRS) 16519C? And if so, am I correct in assuming that the T in front of the number is the CRS (sample)?
Yes, the Cambridge Reference Sequence (CRS) has a T in that position. A mutation at 16519 is quite common in different haplogroups, so it's a "hotspot" - a position which has mutated independently a number of times, called a "parallel mutation."
Do the CRS sequence numbers 10,343 and 12,564 (examples) mean they are more ancient markers than a 15,999?
No, they're no older or younger. The numbers refer to a position in the mtDNA sequence - in other words, they tell you exactly which nucleotide in the mtDNA sequence has mutated. There are 16,569 base pairs (or nucleotides, or more simplistically DNA code letters) in mtDNA, so they are numbered 1 through 16569.
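The notation discussed in the last two answers (a position number, an observed letter after it, and optionally the CRS letter in front) can be taken apart mechanically. The following is a minimal sketch that handles only simple one-letter substitutions such as 16519C or T16519C; the function name is made up, and other kinds of differences, such as insertions or deletions, are outside what it covers.

```python
import re

MTDNA_LENGTH = 16569  # positions are numbered 1 through 16569


def parse_mtdna_difference(label: str) -> dict:
    """Split a label like '16519C' or 'T16519C' into its parts."""
    match = re.fullmatch(r"([ACGT]?)(\d+)([ACGT])", label.strip().upper())
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    reference_base, position_text, observed_base = match.groups()
    position = int(position_text)
    if not 1 <= position <= MTDNA_LENGTH:
        raise ValueError(f"position {position} is outside 1-{MTDNA_LENGTH}")
    return {
        "position": position,                # where in the sequence
        "reference": reference_base or None, # CRS letter, if it was written
        "observed": observed_base,           # letter found in the tested person
    }


print(parse_mtdna_difference("T16519C"))
print(parse_mtdna_difference("16519C"))
```

The position number says only where the difference sits along the sequence, which is why a low number is no older or younger than a high one.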
Refining matches and paper trails
Why is it so important to get three people to test on each line?
Definitely understanding what to do next after you have the data... how can one use the data to help locate the paper trail? What other testing can be done to help with that? Would I need to test another branch? How can that help?
If you don't have a close match, then you need to be patient and wait for one. If you do have a match, then contact the person and, hopefully, they'll have a better paper trail. Test with the most markers available to refine the match as much as possible. Testing another branch might also be helpful for comparing the mutations.
My father's Y-DNA haplogroup is R1b. As his daughter, does that mean my haplogroup is also R1b?
No. Since you are female, your father did not pass on his Y-chromosome to you (otherwise you'd be male, since the Y-chromosome determines male gender). Your father also did not pass his mtDNA on to you; you get your mtDNA from your mother. Y-DNA haplogroups and mtDNA haplogroups are separate classifications.
My haplogroup is estimated R1b, but Q also shows up in my list of matches. Does that indicate my Native American heritage?
No. It is possible that two unrelated results may have experienced mutations or changes in the past which make them appear to be more closely related than they actually are. If your haplogroup is R1b, then your most recent common paternal-line ancestor with someone who is a Q lived tens of thousands of years ago. If there is Native American ancestry somewhere else in your family tree, the test cannot pick it up.
Does the last sentence of the previous answer mean that if there is a Native American relative that isn't on the direct lines that test Y-DNA and mtDNA, then the test can't pick it up?
Yes, the test will not pick it up unless it is the source of the father's father's father's line. The haplogroup of that line will not have changed from Q to R1b. If there is a Native American with haplogroup Q somewhere in the family tree but not on the direct paternal line, the Y-DNA was not passed down from that person to your paternal line and the test does not pick up the Native American ancestry at all.
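The point about direct lines can be shown with a tiny family tree. The people and names below are entirely hypothetical; the sketch only walks father-to-father (the Y-DNA line) and mother-to-mother (the mtDNA line) and shows that an ancestor elsewhere in the tree is on neither.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    name: str
    father: Optional["Person"] = None
    mother: Optional["Person"] = None


def direct_line(person: Person, parent: str) -> List[str]:
    """Follow only 'father' links or only 'mother' links upward."""
    line = []
    current = getattr(person, parent)
    while current is not None:
        line.append(current.name)
        current = getattr(current, parent)
    return line


# Hypothetical pedigree: the ancestor of interest is the mother's father.
mothers_father = Person("mother's father")
mothers_mother = Person("mother's mother")
fathers_father = Person("father's father")
mother = Person("mother", father=mothers_father, mother=mothers_mother)
father = Person("father", father=fathers_father)
tester = Person("tester", father=father, mother=mother)

y_line = direct_line(tester, "father")
mt_line = direct_line(tester, "mother")
print("Y-DNA line:", y_line)
print("mtDNA line:", mt_line)
print("mother's father on either line?",
      mothers_father.name in y_line or mothers_father.name in mt_line)
```

In this made-up tree the mother's father appears on neither direct line, so neither a Y-DNA nor an mtDNA test of the tester would reflect his ancestry.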
What does a "+" or a "-" mean after P40? I've seen P40- and P40+, for example.
The + means you tested positive for that single nucleotide polymorphism (SNP, pronounced "snip") mutation and the - means you were negative for that marker mutation. P40 is the marker's name.
Does a P40- mean that there was no P40 marker at all, or that it existed but there were no repeats for it?
The P40 marker is a single nucleotide on the Y chromosome which exists in two states: the original state (for example a C) and the mutated state (for example a T). It does not have a repeat pattern. It is a single nucleotide polymorphism, i.e., a SNP. If the male tested and was found to have the original C value at that SNP location he would be P40-. If he had the mutated value of T at that SNP location he would be P40+.
Repeat patterns are associated with Y-STR markers, not SNP markers. Y-STR markers contain multiple nucleotides in a row in a pattern such as: GATAGATAGATAGATA which is a 4 count repeat of the nucleotide pattern/motif GATA.
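The + and - labels for a SNP can be expressed as a simple lookup. The sketch below reuses the C and T letters that the answer above gives purely as an example; they are not claimed to be the real states of P40, and the function name is made up.

```python
def snp_call(name: str, observed: str, original: str, mutated: str) -> str:
    """Return 'NAME+' if the mutated letter was observed,
    'NAME-' if the original letter was observed."""
    if observed == mutated:
        return f"{name}+"
    if observed == original:
        return f"{name}-"
    raise ValueError(f"unexpected letter {observed!r} at {name}")


# Example letters taken from the answer above (illustrative only).
print(snp_call("P40", observed="C", original="C", mutated="T"))  # P40-
print(snp_call("P40", observed="T", original="C", mutated="T"))  # P40+
```

Unlike a Y-STR result, which is a count of repeats, a SNP result has only these two outcomes.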
Family Tree DNA Specific
On the personal page at FTDNA, where you update your contact information, it asks for your maternal and paternal origins. The problem I have with this is that we have two grandparents on each side and your page only provides for one answer. For example, my father was born of parents who were of German origin (his mother) and Scottish origin (his father). Similarly, my grandparents on my mother's side were French (her father) and German (her mother). So why is one half of our "most recent" ethnic makeup missing?
There is only one blank for each side because the Y-DNA test only traces the father's father's father's side and the mtDNA test traces only the mother's mother's mother's side. As a result, those two lines are the only ones for which the country of origin is relevant.
Do the one- and two-step mutations in the recent ancestral origins mean that we are related at some time in the past?
Possibly, but not as closely as an exact match. The closer their results are to yours, the more likely you are to share a common ancestor, and the more recently that ancestor is likely to have lived. There is also the possibility that your most recent common ancestor lived thousands of years ago and your results only happen to look similar today.
What can those with African ancestry learn about their roots?
Individuals with African ancestry can confirm the African ancestry and can sometimes learn more about their general geographic origins in Africa. However, because tribes and countries all have people with a variety of DNA results and people have intermarried or moved from one country to another over time, we cannot determine the specific tribe or country of origin.
What can those with Jewish ancestry learn about their roots?
Test results can be compared with Jewish databases to determine whether individuals may have Jewish ancestry. A certain haplotype for the Y-DNA test has been determined to indicate possible Cohanim or Levi origins, and both tests can look for Ashkenazi and Sephardic origins.
If two people share an identical HVR-1 panel, what's the approximate TMRCA (Time to Most Recent Common Ancestor)?
This is very hard to estimate, because of the variable mutation rates in different parts of the mtDNA molecule. FTDNA gives 52 generations as the median - that is, 50% of people who match in HVR1 would find their common ancestor in 52 generations or less, while the other 50% would have to keep on looking.
If the HVR-2 panel matches as well, what's the approximate TMRCA?
FTDNA gives the median number as 28 generations.
N.B. The Sorenson Molecular Genealogy Foundation database was withdrawn on 14 May 2015, and is unlikely to be made available again for public searches.
How do I extract my results from smgf.org?
You can determine the values by changing the input values for the non-matching markers, and then viewing the results until you see that they match.
Does "DOE[USA] x6" really mean that the database has samples from 6 different DOEs?
No, each row represents one person's pedigree, with birthplaces consolidated. The 6 is for six generations, all in the USA. A different pedigree for five generations might read DOE[USA]x3, DOE[ENGLAND]x2. You can get the total count of records for a certain surname on this page: http://smgf.org/paternal_surnames.html
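To make the row format concrete, here is a minimal sketch that splits such a row into surname, birthplace, and number of generations. The pattern is inferred only from the two examples quoted above (DOE[USA] x6 and DOE[USA]x3, DOE[ENGLAND]x2), so it is an assumption about the format and not a description of the database itself.

```python
import re

# Assumed shape of one entry: SURNAME[PLACE]xN, with optional spaces.
ENTRY = re.compile(r"([A-Za-z'\- ]+?)\s*\[([^\]]+)\]\s*x\s*(\d+)")


def parse_pedigree_row(row: str):
    """Return a list of (surname, birthplace, generations) tuples."""
    return [
        (surname.strip(), place.strip(), int(count))
        for surname, place, count in ENTRY.findall(row)
    ]


for row in ("DOE[USA] x6", "DOE[USA]x3, DOE[ENGLAND]x2"):
    entries = parse_pedigree_row(row)
    total = sum(generations for _, _, generations in entries)
    print(row, "->", entries, "| generations in this one pedigree:", total)
```

Adding up the numbers within a row gives the depth of that single pedigree, which is why x6 describes generations and not six separate people in the database.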
Thanks to ISOGG members and E.K., C.M & A.T. for Q&A contributions!