DNA Testing for Genealogy: Types, Uses, and What to Expect

DNA testing has transformed genealogy from an exercise in paperwork into something more like forensic science — one where a saliva sample can surface a cousin whose existence was previously unknown, or confirm that the family story about distant Cherokee ancestry is exactly that: a story. This page covers the three major test types used in genetic genealogy, how each one works mechanically, what results actually mean, and where the technology reliably delivers versus where it falls short.


Definition and scope

Genetic genealogy is the application of DNA analysis to questions of biological descent, family relationship, and population origin. It operates alongside — not instead of — documentary research. A DNA match cannot name a great-great-grandmother; it can establish that a probable biological relationship exists and narrow the search to a specific family cluster, which documentary records then identify.

Three distinct test categories serve genealogical purposes: autosomal DNA (atDNA), Y-chromosome DNA (Y-DNA), and mitochondrial DNA (mtDNA). Each interrogates a different portion of the genome, follows a different inheritance path, and answers a different category of question. Conflating them is one of the most reliable ways to reach a wrong conclusion.

The genealogical DNA testing market is anchored by a handful of major consumer databases. AncestryDNA reported more than 22 million customers in its database as of 2022, making it the largest consumer genetic genealogy repository in the world. 23andMe, MyHeritage DNA, FamilyTreeDNA, and the free upload database GEDmatch round out the primary platforms. Database size matters because the odds of finding a close, informative match rise with the number of people who have tested, a point elaborated in the tradeoffs section below.

For a broader orientation to the discipline, the genealogyauthority.com homepage situates DNA testing within the full spectrum of genealogical research methods.


Core mechanics or structure

Autosomal DNA comprises chromosomes 1 through 22 (the autosomes). Consumer autosomal tests also read the X chromosome, but the X is a sex chromosome with its own inheritance pattern and is analyzed separately. Autosomal DNA recombines with each generation, so a child inherits 50% of their autosomal DNA from each parent, roughly 25% from each grandparent, and so on. The practical ceiling for reliable autosomal matching is approximately 5 to 7 generations, beyond which shared segments become too small for confident attribution. The unit of measurement is the centimorgan (cM): a full sibling typically shares 2,300–3,900 cM (International Society of Genetic Genealogy Wiki, cM relationship table), while a second cousin shares roughly 230 cM on average, with meaningful variance on either side.
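
The halving described above can be written out as simple arithmetic. The following is a minimal sketch, assuming an idealized pedigree with no pedigree collapse (every ancestor slot filled by a distinct person); the function names are illustrative only, and the figures are expected averages rather than what any individual actually inherits.

```python
def expected_share(generations_back: int) -> float:
    """Expected fraction of a tester's autosomal DNA that came
    through one ancestor `generations_back` generations up the tree."""
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    return 0.5 ** generations_back


def ancestor_slots(generations_back: int) -> int:
    """Number of ancestor positions in the pedigree at that generation."""
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    return 2 ** generations_back


for g in range(1, 8):
    print(f"{g} generations back: {ancestor_slots(g):>3} ancestor slots, "
          f"~{expected_share(g):.2%} expected from each")
```

By seven generations back, each of the 128 ancestor slots is expected to contribute under 1% of the tester's autosomal DNA, and because recombination is random, the real contribution from any one of them may be zero. That is consistent with the 5 to 7 generation ceiling described above.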

Y-chromosome DNA is passed from father to son with minimal recombination, making it an extraordinarily stable marker of the direct paternal line. Two men who share a Y-DNA haplotype share a common patrilineal ancestor — potentially within a genealogically useful timeframe or potentially thousands of years ago. The test format matters: a 37-marker Short Tandem Repeat (STR) test is suitable for identifying probable surname matches; a Big Y-700 or equivalent next-generation sequencing test resolves Single Nucleotide Polymorphisms (SNPs) and places a tester precisely on the phylogenetic Y-chromosome tree maintained by YFull and ISOGG.
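
STR results are usually compared by counting how far two haplotypes differ across their shared markers. The sketch below assumes a simple stepwise model in which each one-repeat difference counts as one step; commercial platforms apply additional rules for some markers, so this is an approximation and not any vendor's method. The marker names are real STR loci, but the repeat values are invented for illustration.

```python
from typing import Mapping


def str_genetic_distance(a: Mapping[str, int], b: Mapping[str, int]) -> int:
    """Sum of absolute repeat-count differences over markers both men tested.

    Assumption: a pure stepwise mutation model (one repeat = one step).
    """
    shared = a.keys() & b.keys()
    if not shared:
        raise ValueError("the two haplotypes have no markers in common")
    return sum(abs(a[m] - b[m]) for m in shared)


# Hypothetical repeat counts, not real test results.
tester_a = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
tester_b = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10}

print(str_genetic_distance(tester_a, tester_b))  # 2
```

A small distance over many markers suggests a more recent common patrilineal ancestor, but it cannot by itself date that ancestor, which is why SNP testing is used for finer placement.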

Mitochondrial DNA follows the direct maternal line — mother to all children, but only daughters pass it on. Because mtDNA mutates slowly, an exact mtDNA match between two people means they share a common maternal-line ancestor, but that ancestor could have lived 500 years ago or 5,000 years ago. Full sequence mtDNA (the full mitochondrial genome, ~16,569 base pairs) is the only format with meaningful genealogical resolution.

The autosomal DNA for genealogy, Y-DNA, and mitochondrial DNA pages on this site each provide detailed technical treatment of the respective test types.


Causal relationships or drivers

The matching algorithms used by consumer platforms compare a tester's genome against every other person in the database and flag segments that appear identical by descent (IBD) — meaning inherited from a shared ancestor rather than merely coincidentally similar (identical by state, or IBS). Longer segments are more likely to be IBD; very short segments, typically below 7 cM, have a meaningfully elevated false-positive rate.
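
The effect of a minimum segment length can be illustrated with a small filter. This is a sketch under stated assumptions: the `Segment` record and its field names are hypothetical and do not correspond to any platform's export format, the segment values are invented, and the 7 cM cutoff is the figure quoted above, not a universal standard.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    chromosome: int
    start_bp: int
    end_bp: int
    length_cm: float


MIN_CM = 7.0  # threshold taken from the text above


def likely_ibd(segments: List[Segment], min_cm: float = MIN_CM) -> List[Segment]:
    """Keep only segments long enough to be plausibly identical by descent."""
    return [s for s in segments if s.length_cm >= min_cm]


def total_cm(segments: List[Segment]) -> float:
    """Total shared centimorgans across the given segments."""
    return sum(s.length_cm for s in segments)


# Hypothetical shared segments with one match.
shared = [
    Segment(1, 12_000_000, 45_000_000, 24.5),
    Segment(4, 88_000_000, 93_000_000, 6.2),
    Segment(9, 5_000_000, 21_000_000, 11.0),
    Segment(15, 60_000_000, 62_500_000, 3.1),
]

kept = likely_ibd(shared)
print(len(kept), total_cm(kept))  # 2 35.5
```

Dropping the two short segments lowers the shared total from 44.8 cM to 35.5 cM, which shows how threshold choices alone can shift a relationship estimate.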

Endogamy — the practice of marrying within a defined community — inflates match counts and cM totals in ways that can make relationships appear closer than they are. Ashkenazi Jewish, Amish, Polynesian, and many colonial American lines are classic examples. Researchers working within endogamous populations need to adjust their relationship estimates downward and apply tools like the DNA Painter shared cM tool, which integrates data from Blaine Bettinger's Shared cM Project.

Population reference panels determine ethnicity estimates. Each platform maintains its own reference panel — a curated set of individuals with well-documented ancestry from specific regions — against which a customer's DNA is compared using statistical algorithms. Panel composition varies significantly across companies, which is the primary reason two platforms can produce different ethnicity percentages from identical DNA.


Classification boundaries

The three test types are not interchangeable, and no single test covers all genealogical needs. The reference table at the end of this page compares them side by side.

X-DNA inheritance follows a unique pattern — neither strictly autosomal nor Y-linked — and can be a useful supplementary tool for ruling out certain ancestral lines, since fathers pass X-DNA only to daughters, not sons. The ISOGG X-DNA inheritance chart is the standard reference for mapping which ancestors could have contributed X-DNA to a given tester.
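
The X-DNA rule can be expressed as a short recursion: a man receives his X chromosome only from his mother, while a woman receives one from each parent. The sketch below enumerates the ancestral paths that could have contributed X-DNA. The function name and the path notation ("F" for father, "M" for mother, read upward from the tester) are illustrative assumptions, not a standard notation.

```python
from typing import List


def x_contributor_paths(tester_sex: str, generations: int) -> List[str]:
    """List ancestral paths that can carry X-DNA to the tester.

    Each path is a string read from the tester upward, e.g. "MF" is the
    mother's father. tester_sex must be "male" or "female".
    """
    if tester_sex not in ("male", "female"):
        raise ValueError("tester_sex must be 'male' or 'female'")
    if generations == 0:
        return [""]
    # A male's X comes from his mother only; a female's from both parents.
    parents = ["M"] if tester_sex == "male" else ["F", "M"]
    paths = []
    for parent in parents:
        parent_sex = "male" if parent == "F" else "female"
        for rest in x_contributor_paths(parent_sex, generations - 1):
            paths.append(parent + rest)
    return paths


print(x_contributor_paths("male", 2))    # ['MF', 'MM']
print(x_contributor_paths("female", 2))  # ['FM', 'MF', 'MM']
```

No valid path contains two fathers in a row, so any line that runs father-to-son at some point can be ruled out as a source of a shared X segment.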

Raw DNA data files (typically in .txt or .csv formats) can be downloaded from most platforms and uploaded to secondary databases, particularly GEDmatch and FamilyTreeDNA's Family Finder, to expand match pools at no additional testing cost.


Tradeoffs and tensions

The central tension in genetic genealogy is precision versus population coverage. FamilyTreeDNA is the only major consumer platform that offers dedicated Y-DNA and mtDNA products, and its autosomal database is smaller than AncestryDNA's by an order of magnitude. Choosing a platform for Y-DNA depth means accepting a smaller autosomal match pool; chasing the largest autosomal database means using AncestryDNA, which does not offer chromosome browser tools — a significant handicap for advanced segment analysis.

Privacy is the other structural tension. Consumer DNA databases are subject to law enforcement access in the United States under specific legal circumstances. GEDmatch drew public attention in 2018 when it was used to identify the Golden State Killer suspect through investigative genetic genealogy — a technique that sparked both federal guidance and ongoing legislative debate. GEDmatch subsequently introduced opt-in and opt-out settings for law enforcement matching. The U.S. Department of Justice issued interim policy guidance on forensic genealogical DNA in 2019.

Health data and genealogy data increasingly overlap. Platforms that offer both ancestry and health reports, such as 23andMe, create compound privacy considerations that are distinct from platforms focused exclusively on genealogy.


Common misconceptions

Ethnicity estimates are not ancestry proof. They are statistical probabilities derived from reference panels that are imperfect, geographically clustered, and periodically revised. A 12% "Irish" estimate does not confirm an Irish great-grandparent; it means the algorithm found statistical similarity to the platform's Irish reference population at that percentage. The methodology behind DNA ethnicity estimates warrants its own dedicated examination.

A DNA match is not a proven relationship. It establishes a probable biological connection. The genealogical work of determining which specific shared ancestor produced the match requires documentary research — census records, vital records, and the full toolkit described in genealogy research methods.

Y-DNA tests the surname line, but surnames change. Adoption, name changes at immigration, and non-paternity events mean a man's Y-DNA haplogroup may not correspond to the surname he carries. Non-paternity events — where a child's biological father differs from the recorded father — are estimated to occur in roughly 1–3% of births per generation (American Journal of Human Genetics, cited via ISOGG wiki), which compounds across multiple generations.
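
How a small per-generation rate compounds can be shown with a short calculation. The sketch below assumes the rate is constant and independent across generations, which is a simplification; the 2% figure is simply the midpoint of the 1–3% range quoted above, and the function name is illustrative.

```python
def prob_line_broken(rate_per_generation: float, generations: int) -> float:
    """Probability that at least one non-paternity event occurred
    somewhere along a paternal line of the given length.

    Assumes a constant, independent rate in every generation.
    """
    if not 0.0 <= rate_per_generation <= 1.0:
        raise ValueError("rate_per_generation must be between 0 and 1")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return 1.0 - (1.0 - rate_per_generation) ** generations


for n in (1, 5, 10):
    print(f"{n:>2} generations at 2%: {prob_line_broken(0.02, n):.1%}")
```

At a 2% rate, the chance of at least one break over ten generations is roughly 18%, so a surname mismatch in a deep Y-DNA comparison is not unusual.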

mtDNA is not a tool for recent genealogy in most cases. Its mutation rate is slow enough that an exact full-sequence match might represent a shared ancestor within 5 generations or within 50. Without a paper trail connecting two matrilineal lines, an mtDNA match alone rarely resolves a genealogical question.


Checklist or steps (non-advisory)

The following sequence describes how a genetic genealogy investigation typically proceeds when the goal is identifying an unknown or unconfirmed relationship:

  1. Test selection — Determine which lineage question is being asked, then match the test type to the question (autosomal for broad cousin matching, Y-DNA for paternal line, mtDNA for maternal line).
  2. Platform selection — Choose based on database size for the relevant test type; consider uploading raw data to GEDmatch for additional matching.
  3. Match list review — Sort matches by shared cM (highest first); note any matches above 200 cM as priority contacts.
  4. Cluster analysis — Group matches who also share DNA with each other into family clusters using tools such as the AutoClustering tool at Genetic Affairs or the Leeds Method.
  5. Tree building for matches — Build or review the family trees of top matches to identify surnames and geographic areas in common.
  6. Triangulation — Identify 3 or more matches who share an overlapping DNA segment on the same chromosome, pointing to a single ancestral source.
  7. Documentary confirmation — Use records — census, vital, probate, immigration — to confirm or refute the hypothesized connection. The genealogical proof standard governs what constitutes a proven conclusion.
  8. Documentation — Record all findings, match names (with consent considerations in mind), cM values, and segment data for future reference.
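
Steps 3 and 6 above lend themselves to a small sketch. The record layout, field names, and sample data below are hypothetical; real platform exports differ. The overlap check only finds matches whose segments with the tester cover a common point on a chromosome. Full triangulation also requires confirming that those matches share the segment with each other, which this sketch does not do.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MatchSegment:
    match_name: str
    total_cm: float   # total cM shared with the tester
    chromosome: int
    start_bp: int
    end_bp: int


PRIORITY_CM = 200.0  # priority threshold from step 3


def priority_matches(rows: List[MatchSegment]) -> List[str]:
    """Step 3: names of matches above the threshold, highest cM first."""
    totals = {r.match_name: r.total_cm for r in rows}
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, cm in ranked if cm > PRIORITY_CM]


def candidate_triangulations(rows: List[MatchSegment],
                             min_matches: int = 3) -> Dict[int, List[str]]:
    """Step 6 (partial): per chromosome, the largest set of matches whose
    segments all cover one common position."""
    results: Dict[int, List[str]] = {}
    for chrom in sorted({r.chromosome for r in rows}):
        on_chrom = [r for r in rows if r.chromosome == chrom]
        best: set = set()
        # The deepest overlap always occurs at some segment's start point.
        for anchor in on_chrom:
            covering = {r.match_name for r in on_chrom
                        if r.start_bp <= anchor.start_bp < r.end_bp}
            if len(covering) > len(best):
                best = covering
        if len(best) >= min_matches:
            results[chrom] = sorted(best)
    return results


# Hypothetical match data.
rows = [
    MatchSegment("Match A", 250.0, 5, 10_000_000, 40_000_000),
    MatchSegment("Match B", 95.0, 5, 20_000_000, 50_000_000),
    MatchSegment("Match C", 60.0, 5, 25_000_000, 35_000_000),
    MatchSegment("Match D", 310.0, 12, 5_000_000, 30_000_000),
    MatchSegment("Match E", 40.0, 12, 40_000_000, 60_000_000),
]

print(priority_matches(rows))          # ['Match D', 'Match A']
print(candidate_triangulations(rows))  # {5: ['Match A', 'Match B', 'Match C']}
```

Any group this produces is a lead for the tree-building and documentary steps, not a conclusion.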

Reference table or matrix

| Test Type | Chromosome(s) | Inheritance Path | Useful Generational Depth | Primary Genealogical Use | Tester Requirements |
| --- | --- | --- | --- | --- | --- |
| Autosomal (atDNA) | 1–22, X | All lines, recombines | ~5–7 generations | Cousin matching, relationship confirmation | Any biological sex |
| Y-DNA STR (37–111 markers) | Y | Strict paternal | ~5–8 generations (surname era) | Surname group placement | Male only (or male proxy) |
| Y-DNA SNP (Big Y-700) | Y | Strict paternal | Centuries to millennia | Haplogroup branch placement | Male only (or male proxy) |
| Mitochondrial DNA (full sequence) | Mitochondrial genome | Strict maternal | Genealogically uncertain; centuries to millennia | Maternal haplogroup; deep maternal ancestry | Any biological sex |
| X-DNA | X chromosome | Sex-dependent inheritance pattern | ~5–7 generations | Supplementary: ruling in/out specific ancestral lines | Any biological sex |

For researchers pursuing unknown parentage research or adoptee genealogy, genetic genealogy tools — particularly autosomal DNA combined with systematic clustering — have become the primary investigative method when documentary records are absent or sealed.


References