How Genealogy Works (Conceptual Overview)
Genealogical research is the systematic process of establishing, documenting, and verifying family relationships through a structured analysis of historical records, biological evidence, and corroborating sources. This page covers the operational mechanics of that process — how evidence is collected and evaluated, which actors operate within the professional landscape, and what structural factors determine whether a research conclusion meets the standards required for legal, medical, or institutional use. The subject matters beyond personal interest: genealogical conclusions carry weight in probate proceedings, citizenship applications, and clinical genetics contexts that affect living individuals.
- The Mechanism
- How the Process Operates
- Inputs and Outputs
- Decision Points
- Key Actors and Roles
- What Controls the Outcome
- Typical Sequence
- Points of Variation
The Mechanism
Genealogy functions as an evidence-correlation discipline. Its central mechanism is the construction of a logical argument that links a living or historical individual to a specific family network, using independently gathered documentary and biological sources that, taken together, satisfy a defined proof threshold.
The operative quality threshold in professional practice is the Genealogical Proof Standard (GPS), as codified by the Board for Certification of Genealogists (BCG). The GPS requires five elements: a reasonably exhaustive search of relevant sources, complete and accurate citations, analysis of each source and the information it contains, resolution of any conflicting evidence, and a soundly reasoned written conclusion. Findings that do not satisfy all five elements are not considered proven under professional standards and carry risk of rejection when submitted to hereditary societies, courts, or official registries.
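The all-or-nothing character of the five elements can be shown with a minimal sketch. The class and field names below are illustrative assumptions, not BCG terminology, and a boolean flag is only a stand-in for the professional judgment each element actually requires.

```python
from dataclasses import dataclass, fields


@dataclass
class GpsChecklist:
    """One flag per GPS element (hypothetical field names)."""
    exhaustive_search: bool = False
    complete_citations: bool = False
    sources_analyzed: bool = False
    conflicts_resolved: bool = False
    written_conclusion: bool = False

    def unmet(self) -> list:
        # Names of the elements that are still unsatisfied.
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def is_proven(self) -> bool:
        # A finding counts as proven only when all five elements are met.
        return not self.unmet()


draft = GpsChecklist(
    exhaustive_search=True,
    complete_citations=True,
    sources_analyzed=True,
    written_conclusion=True,
)
print(draft.is_proven())  # False
print(draft.unmet())      # ['conflicts_resolved']
```

Four satisfied elements out of five still yield an unproven finding, which mirrors the rule stated above.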
The mechanism is adversarial by design. Because historical records contain errors, omissions, and deliberate falsifications, genealogical conclusions are held to a standard that requires researchers to actively seek contradicting evidence rather than simply accumulate supporting data. A single unresolved conflict can invalidate an otherwise well-documented conclusion.
How the Process Operates
Genealogical research begins with known facts and moves backward through time, a principle often summarized as working from the known to the unknown. A researcher starts with a living or recently documented individual and identifies each biological or legal relationship that connects that person to earlier generations.
Each relationship requires independent documentary corroboration. A parent-child relationship, for example, may be established through a birth certificate, a census record listing household members, a baptismal register, and DNA match data. No single document is treated as conclusive; the strength of a conclusion derives from the convergence of independent sources.
Source materials are classified along a three-axis framework before analysis. The source axis distinguishes original sources (the physical or digital artifact) from derivative sources (transcriptions, abstracts, indexes) and authored narratives (compiled accounts). The information axis distinguishes primary information (provided by a witness with firsthand knowledge) from secondary information (provided by someone without it). The evidence axis classifies extracted findings as direct (explicitly answers the research question), indirect (relevant, but answers the question only when combined with other evidence), or negative (the absence of an expected record). The source classification applies to the document as a whole, while the information classification applies to each statement within it, so a single document can contain primary information on one point and secondary information on another.
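One way to picture the three axes is as a small data model in which the source type attaches to the record and the information and evidence types attach to each statement drawn from it. This is a sketch under assumptions: the class names are hypothetical, and the death certificate is an invented illustration, not a real record.

```python
from dataclasses import dataclass
from enum import Enum


class SourceType(Enum):
    ORIGINAL = "original"
    DERIVATIVE = "derivative"
    AUTHORED = "authored narrative"


class InformationType(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


class EvidenceType(Enum):
    DIRECT = "direct"
    INDIRECT = "indirect"
    NEGATIVE = "negative"


@dataclass(frozen=True)
class Assertion:
    """A single statement extracted from a record."""
    claim: str
    information: InformationType
    # Evidence type depends on the research question being asked.
    evidence: EvidenceType


@dataclass(frozen=True)
class Record:
    title: str
    source: SourceType
    assertions: tuple


# Invented example: the informant plausibly witnessed the death
# but could only report the birth date secondhand.
death_cert = Record(
    title="Death certificate (hypothetical)",
    source=SourceType.ORIGINAL,
    assertions=(
        Assertion("date of death", InformationType.PRIMARY, EvidenceType.DIRECT),
        Assertion("date of birth", InformationType.SECONDARY, EvidenceType.DIRECT),
    ),
)

kinds = {a.information for a in death_cert.assertions}
print(len(kinds))  # 2
```

Keeping the information type on the statement instead of the record is what lets one original source carry both primary and secondary information.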
Inputs and Outputs
Inputs into the genealogical process fall into three broad categories:
- Documentary records — civil registration documents (birth, marriage, death certificates), census schedules, probate filings, military service and pension records, immigration manifests, naturalization files, land patents, church registers, and institutional records such as hospital admission logs or school enrollment rolls. The National Archives and Records Administration (NARA) maintains genealogy-specific research portals covering hundreds of millions of records across these categories.
- Derivative and compiled sources — published genealogies, indexed databases (Ancestry, FamilySearch, Findmypast), digitized newspaper archives, and local history volumes. These accelerate access but require verification against originals because transcription errors accumulate.
- Biological evidence — autosomal DNA, Y-chromosome DNA (Y-DNA), and mitochondrial DNA (mtDNA). Autosomal DNA identifies relatives across all family lines within approximately 5 to 7 generations. Y-DNA traces the direct patrilineal line, relevant to surname studies and migration research. mtDNA traces the direct matrilineal line. These three methodologies answer different research questions and cannot substitute for one another.
Outputs range in formality from informal family tree diagrams to legally tendered proof summaries. Formal outputs include:
- A genealogical proof summary or proof argument — a written document that presents all evidence, addresses conflicts, and states a conclusion in terms of the GPS
- A lineage report formatted to the requirements of a hereditary society such as the Daughters of the American Revolution or the Sons of the American Revolution
- A kinship determination submitted to a probate court or immigration authority
- A genetic genealogy report integrating DNA match analysis with documentary findings
Decision Points
The research process involves structured decision points at which the direction of inquiry shifts based on the quality of available evidence.
| Decision Point | Triggering Condition | Possible Paths |
|---|---|---|
| Source adequacy assessment | Record located; content evaluated | Accept as evidence, flag as derivative requiring verification, or reject as insufficient |
| Identity confirmation | Candidate record found for target individual | Corroborate with two or more independent sources, or suspend pending further search |
| Conflicting evidence resolution | Two sources contradict each other | Weigh source and information quality, seek third source, or document the conflict explicitly |
| DNA evidence integration | Biological matches identified | Map matches against documentary tree; identify whether match is consistent or contradictory |
| Proof conclusion | Sufficient evidence assembled | Draft written conclusion per GPS, or identify remaining gaps and extend search |
| Research halt | Record loss, destruction, or inaccessibility | Document the gap, assess alternative record classes, or note the limit of current knowledge |
The resolution of conflicting evidence is the most consequential decision point. Researchers who suppress contradictory evidence rather than resolve it reach conclusions that fail the GPS and can lead to legally or medically consequential errors.
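The conflict-resolution row of the table can be sketched as a simple comparison. The numeric ranks and the choice to weigh information quality ahead of source quality are assumptions made for illustration. In practice the weighing is a reasoned judgment, not a score, and the conflict is documented whichever way it falls.

```python
from dataclasses import dataclass

# Hypothetical ranks; higher means more weight.
SOURCE_RANK = {"original": 2, "derivative": 1, "authored": 0}
INFO_RANK = {"primary": 1, "secondary": 0}


@dataclass(frozen=True)
class Claim:
    value: str
    source: str       # "original", "derivative", or "authored"
    information: str  # "primary" or "secondary"

    def weight(self) -> tuple:
        # Assumption: information quality is compared before source quality.
        return (INFO_RANK[self.information], SOURCE_RANK[self.source])


def next_step(a: Claim, b: Claim) -> str:
    """Suggest a path for two claims about the same fact."""
    if a.value == b.value:
        return "no conflict"
    if a.weight() == b.weight():
        return "seek a third source"
    stronger = a if a.weight() > b.weight() else b
    return f"provisionally prefer {stronger.value!r}; document the conflict"


baptism = Claim("1842", source="original", information="primary")
index_entry = Claim("1844", source="derivative", information="secondary")
print(next_step(baptism, index_entry))
# provisionally prefer '1842'; document the conflict
```

When two claims carry equal weight the sketch returns the third path from the table, seeking another source, instead of picking a winner.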
Key Actors and Roles
The genealogical service sector involves distinct professional categories with different credentialing pathways and operational scopes.
Certified Genealogists (CG) hold credentials awarded by the Board for Certification of Genealogists. BCG credentialing emphasizes proof standards, written analysis quality, and the application of the GPS across a submitted portfolio of work. The CG designation does not carry geographic or record-type restrictions.
Accredited Genealogists (AG) hold credentials awarded by the International Commission for the Accreditation of Professional Genealogists (ICAPGen). ICAPGen accreditation is region-specific and record-type-specific; an AG credential is awarded for a defined geographic region or record corpus, not for general competency. The two credentials are not interchangeable and address different professional functions.
Genetic genealogists are practitioners specializing in DNA evidence analysis. This subspecialty may or may not overlap with documentary genealogy. Practitioners with formal credentials in both areas are equipped to integrate the two evidence streams; those without documentary training may produce genetic-only analyses that lack the corroborating paper trail required for legal or institutional acceptance.
Institutional custodians — NARA, state vital records offices, county clerks, probate courts, and church archives — are not genealogists but control access to primary sources. Their access policies, digitization timelines, and indexing completeness directly determine what evidence is available to researchers.
DNA testing laboratories — including 23andMe, AncestryDNA, and FamilyTreeDNA — provide raw DNA data and relative-matching databases. These entities operate as evidence generators, not genealogical analysts. Interpretation of match data requires separate professional expertise.
What Controls the Outcome
Four structural factors govern the quality and completeness of genealogical conclusions.
Record survival and accessibility is the most significant limiting factor. Fires, floods, institutional neglect, and deliberate destruction have eliminated original records for large portions of the U.S. population. The 1890 U.S. federal census was almost entirely destroyed in a 1921 fire, leaving a twenty-year gap between the 1880 and 1900 enumerations. Researchers working with African American lineages prior to emancipation face systematic record gaps because enslaved individuals were documented as property rather than persons in most antebellum records.
Researcher methodology determines whether the GPS threshold is met. Shortcuts in citation practice, failure to search conflicting sources, or over-reliance on derivative databases rather than originals are the primary sources of professional error.
Naming conventions and orthographic variation introduce identity ambiguity. Surnames were frequently anglicized at immigration, spelled phonetically by clerks, or altered voluntarily. A research conclusion identifying two records as referring to the same individual requires explicit argumentation, not assumption based on name similarity.
DNA database size and family tree quality constrain what biological evidence can establish. A DNA match is meaningful only if the matching individual's documented tree is accurate; errors in the matching party's tree propagate into the researcher's analysis.
Typical Sequence
The following sequence describes the structural stages of a professional genealogical research project:
1. Define the research question: state the specific relationship or identity to be established in precise terms
2. Inventory existing documentation: catalog all records already in hand and assess their source, information, and evidence classifications
3. Identify record classes likely to contain relevant evidence: map the time period, geography, and record-keeping institutions relevant to the target individual
4. Execute the search: access original or derivative records in order of evidential priority, prioritizing original sources over compiled indexes
5. Evaluate each record: apply the three-axis source-information-evidence framework before drawing any conclusions
6. Identify and resolve conflicts: document contradictions and pursue third or fourth sources sufficient to resolve them
7. Integrate DNA evidence: where relevant, map biological matches against documentary findings and assess consistency
8. Draft the written conclusion: produce a proof summary, proof argument, or research report that meets GPS requirements
9. Document gaps: record what was searched, what was not found, and why the research boundary was set where it was
The Genealogy Frequently Asked Questions reference addresses specific procedural questions that arise at stages 3 through 7.
Points of Variation
Genealogical research varies substantially across population groups, time periods, and geographic regions.
Pre-civil registration periods (before approximately 1880 in most U.S. states) rely on church registers, tax lists, and land records because no standardized vital records system existed. Research in this period requires familiarity with county formation histories and the specific record-keeping practices of dominant religious denominations in the target region.
Immigrant research introduces the complication of dual-jurisdiction documentation. A person immigrating to the United States before 1900 may appear in both U.S. federal records and the civil or church registers of the origin country. Research may need to extend into foreign-language archives or records held by consular offices.
African American genealogy prior to 1870 requires specialized methodology. Freedmen's Bureau records, slaveholder estate inventories, plantation records, and DNA evidence collectively form the primary evidential toolkit for this research context, because standard civil registration records do not document enslaved individuals by name in most pre-emancipation sources.
DNA-only research — cases in which documentary records are entirely absent — relies on triangulating matches across tested relatives to reconstruct probable family structures. This methodology produces probabilistic conclusions rather than documented proof and carries inherent uncertainty that must be disclosed in any formal output.
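The core check behind triangulation can be sketched as interval overlap: three tested people triangulate on a region when each pair shares a matching segment that covers it. The `Segment` type and base-pair coordinates below are illustrative assumptions. Real analyses also apply minimum segment-length thresholds (commonly expressed in centimorgans) and other quality filters that this sketch omits.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    """A shared DNA segment between two people (hypothetical representation)."""
    chromosome: int
    start: int  # position in base pairs
    end: int


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region common to both segments, or None if there is none."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Segment(a.chromosome, start, end) if start < end else None


def triangulated(ab: Segment, ac: Segment, bc: Segment) -> Optional[Segment]:
    """Region shared by all three pairwise matches among persons A, B, and C."""
    first = overlap(ab, ac)
    return overlap(first, bc) if first else None


# Invented coordinates for three pairwise matches on chromosome 7.
ab = Segment(7, 10_000_000, 40_000_000)
ac = Segment(7, 25_000_000, 60_000_000)
bc = Segment(7, 20_000_000, 35_000_000)
print(triangulated(ab, ac, bc))
# Segment(chromosome=7, start=25000000, end=35000000)
```

A non-empty result suggests, but does not prove, that the three people inherited the region from a common ancestor, which is consistent with the probabilistic nature of the conclusions described above.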
The breadth of the field and its intersection with law, medicine, immigration, and identity documentation is reflected in the scope of reference material available through Genealogy Authority. Researchers and professionals navigating specific record systems, credentialing pathways, or evidentiary questions will find the distinctions between these variation categories operationally significant rather than merely academic.