From ISOGG Wiki
This is a glossary of terms commonly used in the study of genetics and genetic genealogy. For more specific details, please see the article corresponding to each term where available.
- Adenine: One of the four nucleotide bases in DNA or RNA; pairs with thymine in DNA or uracil in RNA.
- Allele: An allele (pronounced UH-leel) is one of multiple alternative forms of a single gene, each of which is a viable DNA sequence occupying a given position, or locus, on a chromosome. For example, in humans, one allele of the eye-color gene produces blue eyes and another allele of the eye-color gene produces brown eyes.
- Administrator: Also known as a Project Administrator, Group Project Administrator, Project Manager, Coordinator and Co-Coordinator. A volunteer who establishes a DNA study with one or multiple commercial DNA testing companies.
- Affected relative pair: An affected relative pair consists of two organisms related genetically that are both affected by the same trait. For example, two cousins who both have blue eyes are an affected relative pair since they are both affected by the allele coding for blue eyes.
- Ancestral haplotype: The haplotype of an MRCA deduced by comparing descendants' haplotypes and eliminating mutations.
- Ancestral state: Refers to the state of a SNP that has not mutated, i.e., the original state before the mutation occurred. Example: A negative result on a SNP means it is ancestral, a positive result means it is derived.
- Ancestry-informative markers
- Ancient DNA
- Anthrogenealogy: A term coined by Family Tree DNA combining the words anthro and genealogy in reference to utilizing DNA to trace one's heritage far beyond recorded documentation.
- Atlantic Modal Haplotype (AMH): See Western Atlantic Modal Haplotype.
- Autosomal DNA: The DNA of non-sex-determining chromosomes that mix or recombine. Also known as admixture DNA.
- Base pair: A pair of nucleotide bases on complementary DNA or RNA strands organized in a double helix.
- Bikini haplotypes: Minimal haplotype data, e.g., six Y-STR markers.
- Biogeographical ancestry (BGA)
- Buccal cell: A type of cell found in cheek tissue inside the mouth.
- Build: The term used for the human genome reference standard (Build 38 is the most recent reference genome).
- Cambridge Reference Sequence (CRS): The first mitochondrial DNA to be fully sequenced at Cambridge University in 1981.
- CCR5: A gene located on chromosome 3.
- Chromatin: A substance consisting of both DNA and protein. The major proteins in chromatin are called histones; these help package the DNA in a compact form that fits inside the cell nucleus. The DNA is wrapped and re-wrapped into a tight coil, and the resultant arrangement is called chromatin. Changes in chromatin structure are associated with DNA replication and gene expression (see also epigenome).
- Chromosome: A molecular "package" for carrying DNA in cells, organized as two double-helical DNA molecules that encode many genes. Some simple organisms have only one chromosome made of circular DNA, while most eukaryotes have multiple chromosomes made of linear DNA.
- Chromosome browser
- Chromosome mapping
- Clademates: The other people who have tested down the branches into the same descendant subclade that you have.
- Coalescence age: The merging of genetic lineages backwards in time to the most recent common ancestor (MRCA).
- Coding region: A region of DNA which contains genes.
- CODIS: Acronym for Combined DNA Index System - the FBI's autosomal STR DNA database for profiles of criminal offenders.
- Cohanim Modal Haplotype: The Y-DNA haplotype most commonly found among males with an oral tradition of Cohen ancestry.
- Cohen: The Hebrew word for priest which refers to a direct male descendant of Aaron, the brother of Moses; plural: Cohanim.
- Complementary sequences: Opposing strands of DNA which bond together to form the double helix. The bases always complement one another, with adenine and thymine pairing together and cytosine and guanine pairing together.
- Convergence: The process of two unrelated or less related haplotypes changing over time to resemble one another.
- Crossover: See: Recombination.
- Crossover interference: A mechanism during meiosis that results in a non-random placement of chromosomal crossover locations in relational position to each other. The term was coined by Hermann Joseph Muller (1890-1967), who observed that a given crossover at a specific location "interferes with the coincident occurrence of another crossing over in the same pair of chromosomes." This also is believed to be the underlying function that explains why larger chromosomes undergo crossover with greater frequency during meiosis than do the smaller chromosomes.
- Cytosine: One of the four nucleotide bases in DNA or RNA; pairs with guanine.
- D9S919: An autosomal STR marker on chromosome 9.
- Deoxyribonucleic acid (DNA): A chemical consisting of a sequence of hundreds of millions of nucleotides found in the nuclei of cells containing the genetic information about an individual. It is shaped like a double-stranded helix, which consists of two paired DNA strands and resembles a ladder that has been twisted. The "rungs" of the ladder are made of base pairs, or nucleotides with complementary hydrogen bonding patterns.
- Derived state: Refers to the state of a Y-chromosome SNP that has mutated, usually in one man, from the ancestral state and created a new haplogroup or subclade of a haplogroup. A positive SNP result is derived; a negative SNP result is ancestral.
- DNA amplification: The production of many DNA copies from one or a few copies or fragments.
- DNA Newbie: Someone who is new to the field of genetic genealogy. It is also the name of a Yahoo mailing list forum sponsored by the International Society of Genetic Genealogy.
- DNA signature: See haplotype.
- DNA replication: The process by which the DNA double helix makes a copy of itself or of a fragment of itself. It uses the old DNA as a template for the synthesis of new DNA strands. In humans, replication occurs in the cell nucleus.
- DNA sequencing: The process of determining the exact order of the nucleotide bases in a segment of DNA.
- Double helix: The twisted shape DNA forms when its two strands bond together. A double helix looks like a twisting or rotating ladder.
- DYS: Acronym for DNA Y-chromosome Segment - the assigned number of a marker on a segment of the Y-chromosome. Example: DYS# 393. It is assigned based on a nomenclature system controlled by the HUGO Gene Nomenclature Committee, which assigns DYS numbers to newly discovered markers.
- DTC: Acronym for direct-to-consumer, as in direct-to-consumer DNA test kits.
- Endogamy: The practice of marrying within the same ethnic, cultural, social, religious or tribal group.
- Enzyme: A protein that facilitates a specific chemical reaction by working as a catalyst.
- Equifinality: The process whereby different evolutionary histories can give rise to the same patterns in genetic or archaeological data.
- Exact match: Two individuals with exactly the same results for all markers or regions compared.
- FASTA: A text-based file format used to present either nucleic acid or amino acid (protein) sequences using single-letter abbreviations. Common filename extensions are .fasta, .fna, .ffn, .faa, .frn, and .fa. Read more at the National Center for Biotechnology Information (NCBI).
- Fixation index (FST): A measure of genetic distances between populations. The closer to zero the less the distance.
- Forensic genealogy
- Founder effect
- FTDNATiP: Acronym for Family Tree DNA Time Predictor - a program created to calculate the time to the most recent common ancestor using mutation rates specific to each marker.
- Full genomic sequence (FGS): The former name used by Family Tree DNA for a full mitochondrial sequence test.
- Full mitochondrial sequence (FMS): The name given by Family Tree DNA to a mitochondrial DNA test which sequences the entire mtDNA genome comprising all 16,569 base pairs.
- Fully identical region
- GAP: Acronym for the Group Administrator Page. This is a webpage in which a DNA Project Administrator utilizes functions such as creating a public website, generating a FTDNATiP report, etc. to assist project participants in coordinating results.
- GEDCOM: Acronym for Genealogical Data Communication - A plain-text file format created for exchanging genealogical data between different genealogical programs.
- Gene: A segment of DNA which contains the genetic code to make a certain protein or part of a protein.
- Gene expression: The process in which the information encoded in a gene is converted into a form useful for the cell. The first step is transcription, which produces a messenger RNA molecule complementary to the DNA molecule on which a gene is encoded. For protein-coding genes, the second step is translation, in which the messenger RNA is read by the ribosome to produce a protein.
- Gene pool: The sum of all the alleles shared by members of a single population.
- Genealogical timeframe
- Generation: The number of years between the birth of the parents and the birth of their children. Different studies use different numbers of years per generation. See: Generation length.
- Generation length
- Genetic ancestry
- Genetic anthropology
- Genetic cousins: Individuals whose Y-DNA, mtDNA or autosomal DNA test results match one another.
- Genetic determinism: The belief that genes alone determine the physical and mental characteristics of a child.
- Genetic distance: The number of differences, or mutations, between two sets of results. A genetic distance of zero means there are no differences in the results being compared against one another (exact match).
- Genetic drift
- Genetic exceptionalism: The belief that genetic information is special and needs to be treated in a different way from other medical or genealogical information.
- Genetic family (also known as a genetic group)
- Genetic genealogy: The use of DNA testing in combination with traditional genealogical and historical records to infer relationships between individuals.
- Genetic genealogist: A genealogist who is involved in genetic genealogy.
- Genetic linkage: The tendency for small chunks or segments of DNA that are physically near each other on a chromosome to be inherited together and not be separated by crossover during meiosis. An obvious benefit to this mechanism is that protein coding genes are not split apart during recombination. A representation of these correlative linkages is called a linkage map.
- Genetic paternalism: The belief that genetic tests for health purposes should be ordered through a medical professional for the protection of the individual.
- Genetic signature: See haplotype.
- Genetics: The field of biology that studies genes and their inheritance; the study of DNA.
- Genome: The entire complement of genetic material in the chromosome set of an organism, virus or organelle. The human genome is organized into 23 pairs of chromosomes (46 in total), with about 3 billion base pairs in each set of 23.
- Genotype: The complement of alleles present in a particular individual's genome that give rise to the individual's phenotype.
- GFF: A standard file format for storing genomic features. GFF stands for Generic Feature Format and the files are plain text, nine column, and tab-delimited. Read more about GFF at Gmod.org.
- Guanine: One of the four nucleotide bases in DNA or RNA; pairs with cytosine.
- Half-identical region: A half-IBD segment where your genotype matches (at least) half of another person's genotype.
- Haplogroup: A genetic population group of people who share a common ancestor on the patriline or the matriline. Top-level haplogroups are assigned letters of the alphabet, and deeper refinements consist of additional number and letter combinations. For Y-DNA, a haplogroup may be shown in the long-form nomenclature established by the Y Chromosome Consortium, or it may be expressed in a short-form using a deepest-known single-nucleotide polymorphism (SNP). Example: R1b1a1 or R-P297. Y-chromosome and mitochondrial DNA haplogroups have different haplogroup designations. Haplogroups can reveal deep ancestral origins dating back thousands of years, or with recent full-sequence Y-DNA testing may be relevant to the last few generations.
- Haplotree: A haplogroup tree. A diagram or chart showing the different lineages within a haplogroup.
- Haplotype: A set of markers (polymorphisms) on a single chromosome that tend to be inherited together. A haplotype can refer to a combination of alleles, to a set of short tandem repeats (STRs), or to a set of single nucleotide polymorphisms (SNPs). Haplotype is a contraction of the term haploid genotype, and is also known as a DNA signature or a genetic signature.
- Haplotype Switching: See: Pseudosegment.
- Holliday junction: A mobile junction between four strands of DNA.
- Hypervariable region (HVR): The sections of non-coding mitochondrial DNA that are used for low-resolution genealogical DNA tests.
- Identical ancestors point
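- Identical by descent (IBD): A segment of DNA inherited by two people through a common ancestor without recombination.
- Identical by state (IBS): A matching segment that is not identical by descent, i.e., one that matches by chance rather than through inheritance from a common ancestor.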
- Imputation: The statistical inference of genotypes that have not been observed. It involves the use of one or a cohort of reference genomes to deduce the missing alleles. Many algorithms for this purpose are based on a Hidden Markov Model or a Markov Chain Monte Carlo approach, but all depend on the quality of the genomic reference panel being used.
- Indel: An insertion or deletion of DNA at a particular location on a chromosome.
- International Society of Genetic Genealogy: A free society founded in 2005 for the promotion and education of genetic genealogy.
- Investigative genetic genealogy
- JoGG: The Journal of Genetic Genealogy - An online journal published quarterly with articles and features pertaining to genetic genealogy and anthrogenealogy.
- Junk DNA: Slang term usually used in referring to the non-coding region of DNA on the Y-chromosome. For more about junk DNA see: www.psrast.org/junkdna.htm
- Male-specific Y: Another name for the non-recombining region of the Y-chromosome.
- Marker: A specific place on a chromosome with two or more forms, called alleles, the inheritance of which can be followed from one generation to the next. In Y-chromosome DNA testing, this refers to non-coding Y-chromosome DNA. Numbers designate the individual DNA segments. Example: 393=13. This means at marker #393, your allele value is 13.
- Massively parallel sequencing: See: Next generation sequencing.
- Meiosis: The stage in which sperm and egg cells are formed. It is during this process that the autosomal chromosomes recombine and mutations occur.
- Microsatellite: Another name for a short tandem repeat.
- Mitochondria: A specific organelle in the cell that helps it to produce energy.
- Mitochondrial DNA: The DNA contained in mitochondria, the energy-releasing organelles located in the cytoplasm of cells. Mitochondrial DNA is passed from mother to child, but only females continue to pass on their maternal mitochondria to their children.
- Mitochondrial Eve: The common matrilineal ancestor of all living humans.
- Mitosearch: Formerly a free public database sponsored by Family Tree DNA where mitochondrial DNA results from any testing facility could be uploaded and compared. A new free public database is available at www.mitoYDNA.org.
- mitoYDNA: A free public database where Y-chromosome STR (short tandem repeat) and mitochondrial DNA results from any testing facility may be uploaded and compared.
- Modal haplotype
- Most recent common ancestor (MRCA): The most recent ancestor from whom a group of individuals share descent.
- Mutation: A change in the DNA that occurs spontaneously. Mutation is a scientific term that often carries a negative connotation as a result of 1950s 'B' movies, but in genetic genealogy, mutations are utilized for distinguishing different ancestral lines. Mutations can also occur due to environmental factors, such as exposure to radiation.
- Mutation rate: The frequency with which random mutations occur.
- Next generation sequencing (NGS)
- Non-coding DNA: Also referred to as 'junk DNA', non-coding DNA is not part of an active gene that contains a code for making a protein. Recent evidence shows that at least some non-coding DNA is involved in biological processes such as regulation of gene expression and chemical signalling among cells.
- No call
- Non-paternity event (NPE): An event which has caused a break in the link between the surname and the Y-chromosome, resulting in a son using a different surname from that of his biological father (e.g., illegitimacy, adoption, maternal infidelity).
- Non-recombining Y (NRY): The section of the Y-chromosome that is passed from father to son down the patriline. While it does not recombine, it does have mutations over time.
- Nuclear DNA: The DNA of chromosomes found in the nucleus of the cell.
- Nucleotide: One of the four monomers that make up a DNA molecule.
- Null value: A null is a value of zero on a marker. Nulls can occur due to missing genetic material on a marker, or a SNP can sometimes cause a null result. Several Y-STR markers have been identified in certain families to have null results (for example, DYS439 and DYS448).
- One-name study: The study of a single surname.
- Organelle: A cell structure with specialized functions.
- Palindrome: A double-stranded DNA segment in which the sequence of one strand is in the reverse order to the other strand. Example: DYS464X where a family line in haplogroup R1b1a2 has cccc or ccgg instead of the usual cccg pattern.
- Pangenome: A model attempting to describe all genes and genetic variations found within a given species or subspecies. Also known as pan-genome or supragenome.
- Parallel mutation: The same mutation occurring coincidentally in another line of descent from the MRCA (most recent common ancestor).
- Pedigree collapse
- Personal genomics
- Phasing: The task or process of determining the parental source of a SNP's alleles (i.e., determining which parent contributed each specific allele).
- Phenotype: The observable physical or behavioral traits of an organism, largely determined by the organism's genotype.
- Pherogram: For STRs, a plot which shows the length of a fragment of DNA. This allows its allele value to be measured.
- Phylotree: A shortened term for phylogenetic tree. It is most often used in reference to the available online diagrams showing the Y-chromosome and mitochondrial DNA haplotrees. This term is also applied to DNA project diagrams created by Project Administrators utilizing specialized software. Phylotree is also a website which hosts the mtDNA evolutionary tree and a minimal reference version of the Y-SNP haplotree.
- Polygenic risk score
- Polymorphism: See: Mutation.
- Poor man's phasing: The process of phasing by using siblings, aunts and uncles or other close relatives as a proxy for a parent.
- Principal component analysis
- Protein: A linear polymeric molecule made of amino acids linked by peptide bonds. Proteins carry out the majority of chemical reactions that occur inside the cell.
- Proxy: Usually used in reference to the contact person for a DNA test. Example: A female who has tested a male relative.
- Pseudogene: A gene which has lost its function over time.
- Pseudophasing: The process of phasing by inference from population reference samples. Also called statistical phasing.
- Pseudosegment: A false positive matching segment which occurs as a result of matching alleles zig-zagging backwards and forwards between the maternal chromosome and the paternal chromosome. Also known as a false segment, a spurious segment, an erroneous segment or a phantom segment. The data error leading to the problem is sometimes referred to as haplotype switching.
- recLOH: Acronym for Recombinant Loss of Heterozygosity. When a section of DNA on a marker is missing, that marker is sometimes repaired by another marker filling in the missing DNA with its own material. This is referred to as a "recLOH event" and is usually observed with multicopy markers like 385a and 385b, and is also common in the 464 set. The recLOH event causes the allele values to match 11-11 instead of the more common, 11-14 that you see in R1b.
- Recombination: An event occurring during meiosis - the formation of sperm and egg cells. One chromosome from the mother and the other from the father break and trade segments with one another.
- Reconstructed Sapiens Reference Sequence: The inferred mitochondrial DNA ancestral sequence.
- Replication: See: DNA replication.
- Restriction enzyme: A protein that recognizes a certain sequence of DNA and cuts the DNA at that site.
- Restriction Fragment Length Polymorphism (RFLP): See single nucleotide polymorphism.
- Revised Cambridge Reference Sequence: A revised version of the Cambridge Reference Sequence. Mitochondrial DNA results are compared against the rCRS.
- Search angel: A genealogist who volunteers his or her time to help others. In genetic genealogy the term is most often used in the context of DNA adoption research.
- Second generation sequencing: See: next generation sequencing.
- Sequencing: See: DNA sequencing and next generation sequencing.
- Sex chromosome: The X-chromosome or Y-chromosome. Normally males have one X and one Y and females have two Xs.
- Single-nucleotide polymorphism (SNP): (pronounced snip) A change at a single nucleotide position in the DNA sequence. A SNP test confirms your haplogroup by determining whether a SNP is in its ancestral state or has mutated to its derived state. A SNP is usually found on a different area of the Y-chromosome than where the Y-STR markers are. Sometimes, a SNP may cause a null result on a marker.
- Short tandem repeat: Patterns in the DNA sequence which repeat over and over again in tandem, i.e., right after each other. Typically the repeat motif is less than six base pairs long. By counting the repeats, one gets an allele value which is given in an individual's haplotype. STRs are also known as microsatellites and simple sequence repeats (SSRs).
- Singleton: A genetic mutation (normally a single nucleotide variant; SNV) in the Y chromosome that is currently unique to the tester and his immediate haplogroup. Singletons occur on the descendant line of descent, below the most descendant shared subclade. There is no way to know the descendant order of these variants until somebody more closely related to the tester tests positive for one or more of them, thereby creating a new further descendant subclade, and reducing the number of remaining singletons of the respective testers.
- SNP testing
- SNP tsunami: The flood of SNPs discovered through next generation sequencing Y chromosome DNA tests.
- Subclade: Referring to a "branch" farther down the phylogenetic tree. Example: H3 -> '3' is a sub-clade of mitochondrial DNA haplogroup 'H'. R1b -> '1b' is a sub-clade of Y-chromosome haplogroup 'R'. Subclade testing is also referred to as SNP testing or deep clade testing.
- Surname: A last name or family name traditionally passed down from father to son.
- Surname era
- Surname mapping: The process of plotting the distribution of a surname on a map.
- Three-quarter sibling
- Thymine: One of the four nucleotide bases in DNA; pairs with adenine. In RNA, thymine is replaced with uracil.
- TiP: Acronym for the Family Tree DNA Time Predictor tool - a proprietary program created by Family Tree DNA to calculate the time to the most recent common ancestor using mutation rates specific to each marker.
- Time to the Most Recent Common Ancestor (TMRCA): The amount of time or number of generations since individuals have shared a common ancestor. Since mutations occur at random, the estimate of the TMRCA is not an exact number (e.g., 7 generations), but rather a probability distribution. As more information is compared, the TMRCA estimate becomes more refined.
- Transcription: The first step in gene expression, in which a messenger RNA molecule complementary to a particular gene encoded in DNA is synthesized by enzymes called RNA polymerases. To produce a functional protein, transcription is followed by translation.
- Translation: The second step in gene expression, in which a messenger RNA molecule is read by the ribosome to produce a functional protein. Translation is always preceded by transcription.
- Transmission event: The passage of genetic material from one generation to the next.
- Triangulation: A method of determining the ancestral haplotype of an ancestor using the DNA results of direct line descendants.
- Unique-event polymorphism
- Uracil: One of the four nucleotide bases in RNA; pairs with adenine. In DNA, uracil is replaced with thymine.
- Variant Call Format (VCF): A plain-text file format (as distinguished from its binary version, file extension BCF) used to store genetic data that represent variations from a specified reference genome. This allows only the variants to be stored rather than all the allele calls across a genome. Read more about the format and its versions at the Samtools GitHub website.
- Visual phasing
- Walk Through the Y: A Y-SNP discovery programme initiated by Thomas Krahn.
- Western Atlantic Modal Haplotype (WAMH): The most common Y-DNA haplotype found in Europe’s most common Y-DNA haplogroup, R1b.
- Whit's predictor: The commonly applied nickname to the Y-DNA Haplogroup Predictor created by Whit Athey. Enter Y-chromosome markers into the predictor and it will display percentages for matches to various haplogroups. There are now other haplogroup predictors available. See the list on the page for Y-DNA tools.
- X-chromosome: A sex chromosome. A female child receives one X-chromosome from her father and one X-chromosome from her mother. A male child receives an X-chromosome from his mother and a Y-chromosome from his father.
- Y Alu Polymorphism (YAP): An alu is a sequence of approximately 300 letters (base pairs) which has inserted itself into a particular region of the DNA. A Y alu polymorphism is an alu which has occurred on the Y-chromosome. There are about a million Alu insertions scattered throughout the human genome.
- Y chromosome: The male sex chromosome. Only males have a Y-chromosome, which they receive from their father, who received it from his father, and so on. This transmission of the Y-chromosome down the male line is why it is useful for surname testing to determine if two males share a common ancestor.
- Y-chromosomal Adam: Also known as Y-Adam. The common patrilineal ancestor of all living males.
- Y Chromosome Consortium (YCC)
- Ysearch: Formerly a free public database sponsored by Family Tree DNA where Y-chromosome DNA results from any testing facility could be uploaded and compared. Offline since 2017. A new free public database is available at www.mitoYDNA.org.
- Y-STR: Acronym for Y-chromosome Short Tandem Repeat. The number of times the bases repeat determines the value of the marker. Example: Thirteen repeats of the same bases equals a value of '13'.
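As a rough illustration of the Genetic distance, Exact match and Marker entries above, the following Python sketch compares two Y-STR haplotypes stored as marker-to-allele-value mappings. The function names, the sample allele values and the optional stepwise counting (adding up the size of each repeat difference rather than counting each differing marker once) are assumptions made for this example; testing companies apply their own counting rules, which may differ.

```python
def genetic_distance(haplotype_a, haplotype_b, stepwise=False):
    """Return the genetic distance between two haplotypes.

    Each haplotype is a dict mapping a marker name (e.g. "DYS393")
    to its allele value (the repeat count). Only markers present in
    both haplotypes are compared.
    """
    shared = haplotype_a.keys() & haplotype_b.keys()
    if not shared:
        raise ValueError("the haplotypes have no markers in common")
    if stepwise:
        # Assumption: each repeat gained or lost counts as one step.
        return sum(abs(haplotype_a[m] - haplotype_b[m]) for m in shared)
    # Default: each differing marker counts as one difference.
    return sum(1 for m in shared if haplotype_a[m] != haplotype_b[m])


def is_exact_match(haplotype_a, haplotype_b):
    """An exact match is a genetic distance of zero."""
    return genetic_distance(haplotype_a, haplotype_b) == 0


# Hypothetical results for two testers (not real data).
person_a = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
person_b = {"DYS393": 13, "DYS390": 22, "DYS19": 14, "DYS391": 10}

differing_markers = genetic_distance(person_a, person_b)
total_steps = genetic_distance(person_a, person_b, stepwise=True)
```

With these sample values the two testers differ on two markers (DYS390 and DYS391), and the stepwise count is three because DYS390 differs by two repeats.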
This glossary includes the content from the ISOGG DNA Newbie Glossary which was previously published on the main ISOGG website (for reference see the version preserved in the Internet Archive in August 2014). The ISOGG DNA Newbie Glossary was compiled on 17 May 2006, revised on 13 January 2008, and last updated on 7 December 2009. The content from the Newbie Glossary was incorporated into the ISOGG Wiki Glossary on 25 January 2015. The contributors to the original ISOGG DNA Newbie Glossary were: K. Borges, N. Custer, P. Goff, J. Hailman, E. Krause and C. Mello.
Links to other glossaries
- ISOGG Glossary for the Y-DNA Haplogroup Tree
- Family Tree DNA Glossary
- Charles Kerchner's Genetic Genealogy Glossary
- Genographic Project Glossary
- SMGF Glossary (Internet Archive)
- Talking Glossary of Genetic Terms from the US National Human Genome Research Institute
- Evolution glossary
- Glossary from the Finnish Centre of Scientific Computing (in Finnish)