Primary vs. Secondary Sources in Genealogy Research
Knowing whether a record is a primary or secondary source is one of the foundational judgment calls in genealogy, the kind of thing that separates a well-reasoned conclusion from a family tree that quietly spreads someone else's mistake. This page explains how genealogists define these source types, how they interact with the evidence found inside documents, and where the distinction becomes genuinely difficult to apply. It draws on the framework developed by the Board for Certification of Genealogists and articulated in Genealogy Standards (2nd ed., 2019).
Definition and scope
A primary source is a record created at or near the time of an event, by someone with firsthand knowledge of that event. A secondary source is a record created after the fact, or by someone who learned about the event from another person rather than by direct observation.
The distinction hinges on two variables simultaneously: time and informant knowledge. A record created decades after a birth, by a family member recalling what they were told, is secondary on both counts. A record created within days of a birth, by an attending physician, is primary on both counts. That clean, two-axis structure is what makes the framework so useful — and so easy to misapply when a single document straddles both categories.
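The two-axis test above can be sketched as a small function. This is only an illustration: the `Report` type, the `classify` function, and especially the 30-day cutoff for "at or near the time" are assumptions made for the example, not values taken from any genealogical standard.

```python
from dataclasses import dataclass

# Hypothetical cutoff for "at or near the time of an event". The framework
# gives no fixed number, so this value is purely an illustrative assumption.
NEAR_EVENT_DAYS = 30


@dataclass(frozen=True)
class Report:
    days_after_event: int      # gap between the event and its recording
    informant_firsthand: bool  # did the informant witness the event?


def classify(report: Report, near_days: int = NEAR_EVENT_DAYS) -> str:
    """Return 'primary' only when both axes agree; otherwise 'secondary'."""
    contemporaneous = report.days_after_event <= near_days
    if contemporaneous and report.informant_firsthand:
        return "primary"
    return "secondary"


# Attending physician recording a birth within days of the event.
physician = Report(days_after_event=3, informant_firsthand=True)
# Family member recalling, decades later, what they were told.
relative = Report(days_after_event=40 * 365, informant_firsthand=False)

print(classify(physician))
print(classify(relative))
```

A report that passes one axis but fails the other (for example, recorded promptly but by someone who was not present) falls through to "secondary" in this sketch, which mirrors the "after the fact, or secondhand" wording of the definition.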
One clarification worth making early: strictly speaking, the primary/secondary label attaches to each piece of information inside a source, not to the source as a whole. The source itself is classified separately, as original or derivative. The Genealogical Proof Standard treats these as related but distinct concerns. A single death certificate can simultaneously carry primary information about the date of death (reported by the attending physician, present at the event) and secondary information about the deceased's birthplace (reported by a grieving spouse who was told, years earlier, that her husband came from "somewhere in County Cork").
How it works
Genealogists trained in the Board for Certification of Genealogists framework apply a three-layer analysis to every document they encounter:
- Source type — Original vs. derivative (was this the first recording, or a later copy/transcription?)
- Information type — Primary vs. secondary (did the informant witness the event firsthand?)
- Evidence type — Direct, indirect, or negative (does the information explicitly answer the research question, or only imply an answer?)
The primary/secondary distinction lives entirely in layer two. It does not determine whether a record is useful — it determines how much independent corroboration that piece of information requires before it can anchor a conclusion.
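One way to picture the three layers is as three independent labels attached to each claim drawn from a document. The sketch below is a hypothetical data model (the class names, field names, and the `needs_extra_corroboration` helper are all assumptions for illustration), using the death certificate example from earlier on this page.

```python
from dataclasses import dataclass
from enum import Enum


class SourceType(Enum):        # layer one
    ORIGINAL = "original"
    DERIVATIVE = "derivative"


class InformationType(Enum):   # layer two
    PRIMARY = "primary"
    SECONDARY = "secondary"


class EvidenceType(Enum):      # layer three
    DIRECT = "direct"
    INDIRECT = "indirect"
    NEGATIVE = "negative"


@dataclass(frozen=True)
class Claim:
    statement: str
    informant: str
    source_type: SourceType
    information_type: InformationType
    evidence_type: EvidenceType


def needs_extra_corroboration(claim: Claim) -> bool:
    """Layer two alone decides how much corroboration a claim needs."""
    return claim.information_type is InformationType.SECONDARY


# Two claims taken from the same original death certificate.
death_date = Claim(
    statement="date of death",
    informant="attending physician",
    source_type=SourceType.ORIGINAL,
    information_type=InformationType.PRIMARY,
    evidence_type=EvidenceType.DIRECT,
)
birthplace = Claim(
    statement="birthplace: somewhere in County Cork",
    informant="surviving spouse",
    source_type=SourceType.ORIGINAL,
    information_type=InformationType.SECONDARY,
    evidence_type=EvidenceType.DIRECT,
)

for claim in (death_date, birthplace):
    print(claim.statement, "->", needs_extra_corroboration(claim))
```

Keeping the three labels as separate fields, rather than one combined "reliability" score, reflects the point that each layer answers a different question about the same claim.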
Primary information carries more inherent weight because the informant had direct observational access to the event. Secondary information is not unreliable by definition; it simply requires more scrutiny. A published county history from 1890 may accurately record a settler's 1847 arrival date because the author interviewed the settler directly, but a later researcher cannot know, from the book alone, whether that interview ever happened.
The discipline of citing genealogical sources exists precisely because source type must be recorded and transmitted, not assumed. A finding with no documented source type is essentially unverifiable.
Common scenarios
The primary/secondary distinction plays out differently across record types. Here are four representative cases where the line requires active judgment:
Death certificates. The cause of death, certified by a physician, is primary information. The decedent's parents' names, provided by a surviving child who may have been guessing, are secondary information. These two pieces of information sit side by side on the same form.
US census records. These are almost always secondary for birth years and birthplaces. The enumerator recorded what one household member stated, often the head of household speaking for children, elderly relatives, and boarders. The birth years implied by the ages reported for the same individual in consecutive censuses frequently differ by 3 to 5 years, which is exactly what secondary reporting predicts.
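The census discrepancy is easy to check with arithmetic: subtract each reported age from its census year and compare the implied birth years. The sketch below uses invented figures for a hypothetical individual; the function names are likewise assumptions made for the example.

```python
def implied_birth_years(reports: dict[int, int]) -> dict[int, int]:
    """Map each census year to the birth year its reported age implies.

    Census year minus age is only approximate: it can be off by one
    depending on whether the birthday fell before the enumeration date.
    """
    return {year: year - age for year, age in reports.items()}


def birth_year_spread(reports: dict[int, int]) -> int:
    """Return the gap between the earliest and latest implied birth year."""
    years = implied_birth_years(reports).values()
    return max(years) - min(years)


# Hypothetical data: census year -> age reported for the same person.
reports = {1880: 32, 1900: 49, 1910: 62}

print(implied_birth_years(reports))
print(birth_year_spread(reports))
```

Here the three enumerations imply birth years of 1848, 1851, and 1848, a spread of 3 years. A spread like this does not say which year is right; it only flags that the ages came from informants who were estimating.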
Vital records. These vary enormously by jurisdiction and era. A birth certificate filed within 10 days of delivery, signed by an attending midwife, contains primary information about the birth date. A delayed birth registration filed 40 years later, a common scenario for African Americans in the rural South before mandatory registration, is secondary information, however official its appearance. The document is real; the informant's knowledge is secondhand.
Published family histories. These are almost always secondary sources for all events they narrate. Writing a family history involves synthesizing sources, and each synthesis step introduces the possibility of transcription error, misinterpretation, or uncritical acceptance of a prior researcher's mistake. A published genealogy that cites original sources for each claim lets a reader trace that claim back to its original informant; one without citations offers secondary information with no traceable informant.
Decision boundaries
Two situations reliably create confusion.
The same document can contain both primary and secondary information. This is not a paradox — it is the normal condition of most genealogical documents. Evaluating a record requires evaluating each piece of information within it on its own terms. Treating a document as uniformly primary because it was created near the time of an event overlooks the informant problem entirely.
Derivative sources are not automatically secondary. An accurately transcribed copy of an original register is derivative (it is not the first recording) but may still contain primary information if the original informant had firsthand knowledge. Genealogy collections at national archives include millions of derivative records (microfilmed originals, typed transcriptions, digital scans) that preserve primary-quality information. Conversely, an original handwritten letter written 30 years after the event contains secondary information even though the physical object is an original.
The practical rule: always ask who told the recorder this, and whether they were present when it happened. That question, applied consistently, cuts through most edge cases faster than any decision tree. For researchers building out a full methodology, the pages on genealogy research methods and on resolving genealogical conflicts both extend this framework into evidence evaluation and conflict resolution. The broader architecture of the discipline is outlined on the genealogy authority home page.