Epidemiology


1. Epidemiology in Sum

🧭 Overview

🧠 One-sentence thesis

Epidemiology is the foundational science of public health that studies how many people get sick, how they got sick, and why they got sick—and applies this knowledge to control disease and improve population health.

📌 Key points (3–5)

  • Core definition: Epidemiology studies the distribution (who/where/when) and determinants (why/how) of health-related states and applies findings to control health problems.
  • Prevention levels: Three types exist—primary (prevent disease), secondary (screen for early disease), and tertiary (treat to minimize long-term effects).
  • Multicausality: No single cause explains any disease; multiple factors (agent, host, environment) interact to produce health outcomes.
  • Common confusion: Distribution vs. determinants—distribution describes patterns (who got sick, where, when), while determinants explain causes (why and how they got sick).
  • Social determinants matter: Health outcomes are shaped by conditions where people live, work, and play—not just biology or behavior.

🔬 What epidemiology studies

📊 Distribution of disease

Distribution: the frequency of disease occurrence, which may vary from one population group to another.

  • Answers: who, what, where, when
  • Focuses on patterns and counts
  • Example: Top 10 causes of death vary by age group—unintentional injury ranks first for ages 1-44 but eighth for ages 65+
  • This is called descriptive epidemiology

🔍 Determinants of disease

Determinants: factors capable of bringing about change in health.

  • Answers: why and how
  • Includes chemical, biological, radiological, explosive factors, environment, stress, and social determinants
  • Examples: infectious agents, environmental hazards, behaviors, access to healthcare
  • This is called analytic epidemiology

🎯 Application and control

  • Application: uses epidemiologic measures and biostatistics to identify and solve health problems
  • Control: the field has four aims: describe health status, explain disease etiology, predict disease occurrence, and control disease occurrence
  • Prevention is the ultimate goal

🛡️ Prevention framework

🥇 Primary prevention

Primary prevention: Prevent disease before it occurs.

  • Target: susceptible individuals (those who can get the disease)
  • Methods: vaccination, behavior change, risk reduction messaging
  • Example: encouraging mask-wearing and vaccination for COVID-19
  • Example: promoting physical activity to prevent diabetes in healthy patients
  • Don't confuse with: primordial prevention (creating healthy environments like green spaces before risk factors develop)

🔬 Secondary prevention

Secondary prevention: Screen for disease in the subclinical stage.

  • Target: exposed individuals before symptoms appear
  • Goal: find disease early, during lead time (before clinical stage)
  • Methods: annual physical exams, blood work, screening tests (e.g., visual field tests)
  • Example: screening for prediabetes to prevent Type II diabetes
  • Key concept: many diseases are clinically inapparent initially—the "iceberg" of disease shows most cases hidden below the surface

🏥 Tertiary prevention

Tertiary prevention: Treat disease to minimize long-term effects.

  • Target: patients with clinically apparent disease
  • Goal: prevent disability and death
  • Methods: surgery, rehabilitation, ongoing disease management
  • Example: cataract surgery to improve vision and daily functioning

⏱️ Natural history timeline

The disease progression follows: susceptibility → exposure → subclinical disease (pathologic changes) → onset of symptoms → clinical disease (diagnosis) → recovery/disability/death.

🧩 Causality and disease mechanisms

🔺 The epidemiologic triangle

Three components interact to produce disease:

| Component | Description | Examples |
| --- | --- | --- |
| Agent | Thing that causes disease/injury | Viruses, chemicals, radiation, environmental factors |
| Host | Person/creature that can get disease | Influenced by immunity, genetics, anatomy, behavior, medications |
| Environment | Extrinsic factors affecting host and agent | Air quality, sanitation, water, drainage, access to healthcare |

  • Different diseases require different balances of these three
  • Just having an agent present is not sufficient—must also consider pathogenicity (ability to cause disease) and dose

🥧 Rothman's Pie Model (multicausality)

  • Each completed pie = one case of disease (sufficient cause)
  • Each pie slice = one contributing factor (component cause)
  • A slice in every pie = necessary cause (must be present for disease to occur)
  • Example: COVID-19 requires virus contact (necessary cause), but also sufficient exposure time, susceptibility, health status, vaccination status, age, occupation

Don't confuse: The same disease in two people may have different combinations of component causes—there's no single path to any health outcome.
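Rothman's model can be sketched in code: each sufficient cause ("completed pie") is a set of component causes, disease occurs when any set is fully assembled, and a necessary cause is whatever appears in every pie. The component-cause names below are hypothetical, loosely following the COVID-19 example.

```python
# A minimal sketch of Rothman's sufficient-component cause model.
# Component causes and pie compositions are hypothetical illustrations.

# Each sufficient cause ("completed pie") is a set of component causes.
SUFFICIENT_CAUSES = [
    {"virus_contact", "sufficient_exposure_time", "susceptible_host"},
    {"virus_contact", "unvaccinated", "advanced_age"},
]

def disease_occurs(factors_present: set) -> bool:
    """Disease occurs if any sufficient cause is fully assembled."""
    return any(pie <= factors_present for pie in SUFFICIENT_CAUSES)

def necessary_causes() -> set:
    """A necessary cause appears in every sufficient cause (every pie)."""
    return set.intersection(*SUFFICIENT_CAUSES)

print(disease_occurs({"virus_contact", "susceptible_host"}))              # False
print(disease_occurs({"virus_contact", "unvaccinated", "advanced_age"}))  # True
print(necessary_causes())  # {'virus_contact'}
```

Note how two different factor sets can complete two different pies: there is no single path to the outcome.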

📏 Bradford Hill Criteria

Nine considerations for evaluating causality (not requirements, but strong suggestions):

  1. Strength: How strong is the association?
  2. Consistency: Repeatable across researchers, populations, times?
  3. Specificity: Stronger in one group vs. another?
  4. Temporality: Does cause precede effect?
  5. Biological gradient: Dose-response relationship?
  6. Plausibility: Probable based on current knowledge?
  7. Coherence: Does it make sense?
  8. Experiment: Can experiments show cause leads to effect?
  9. Analogy: Similar situations with established relationships?

Important: One study never proves causality—requires a "mountain of evidence" across study types and populations.

🎯 Risk factors

To be a risk factor:

  • Exposure must precede disease onset
  • Disease frequency must vary by exposure level
  • Association must not be due to error

🌍 Population health perspectives

👥 Person, place, and time

Descriptive epidemiology examines three overlapping factors:

Person: characteristics of affected individuals

  • Age, sex, gender, race, ethnicity, education, behaviors, occupation, housing status
  • Helps identify who is affected and what they have in common

Place: geographic and environmental characteristics

  • Where people live, got sick, sought care
  • Climate, facilities, rural vs. urban, zip code
  • Example: heat map showing MLB players' states of residence when specializing in baseball

Time: temporal patterns

  • When outcomes occurred
  • Hour, day, week, season, before/after events, simultaneous occurrences
  • Example: concussion rates before vs. after Ohio's 2013 concussion law showed relative increase in sports-related cases and decrease in non-sports cases

🏘️ Social determinants of health (SDOH)

Social determinants of health: conditions in environments where people are born, live, learn, work, play, worship, and age that affect health outcomes.

Six domains:

| Domain | Examples |
| --- | --- |
| Economic stability | Employment, income, expenses, debt, medical bills |
| Neighborhood/physical environment | Housing, transportation, safety, parks, walkability, zip code |
| Education | Literacy, language, early childhood education, vocational training |
| Food | Hunger, access to healthy options |
| Community/social context | Social integration, support systems, discrimination, stress |
| Healthcare system | Coverage, provider availability, cultural competency, quality |

⚖️ Health equity vs. equality

  • Equality: giving everyone the same resources (e.g., same bike for everyone)
  • Equity: giving everyone resources that work for them (e.g., adapted bikes for different needs)
  • Health disparities: differences in outcomes tied to race, ethnicity, sex, gender, age, disability, socioeconomic status, geography—not simple differences, but systematic inequities

🧊 The injury iceberg

Most disease factors are hidden below the surface:

Visible (clinically apparent):

  • Biological, psychological, behavioral factors
  • Individual level (intrapersonal)
  • Easily diagnosed

Hidden (clinically inapparent):

  • Interpersonal: family, peer relationships
  • Organizational: work, school, clubs
  • Community: utilities, roads, social capital
  • Society: infrastructure, economics, policy
  • These are latent failures vs. active failures

🔧 Prevention tools and frameworks

🔄 Van Mechelen's four-step sequence

A cyclical process for injury prevention:

  1. Establish extent: measure incidence and severity
  2. Establish etiology: identify causes and mechanisms
  3. Introduce prevention: implement interventions
  4. Assess effectiveness: evaluate and repeat Step 1

Example for volleyball: Step 1 measured injury rates per 1000 player hours; Step 2 found matches have 2.3× higher risk than training; Step 3 introduced supervised resistance training; Step 4 showed intervention group dropped from 5.3 to 0 injuries per 1000 hours.
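Step 1's rate arithmetic can be sketched as follows; the injury and player-hour counts are hypothetical, chosen only to land near the 5.3 per 1,000 player-hours figure above.

```python
# Step 1 of Van Mechelen's sequence: expressing injury counts as a rate
# per 1,000 player-hours. The counts below are hypothetical.

def rate_per_1000_hours(injuries: int, player_hours: float) -> float:
    """Injuries per 1,000 player-hours of training and match exposure."""
    return injuries / player_hours * 1_000

# e.g., 12 injuries over 2,250 player-hours
print(round(rate_per_1000_hours(12, 2_250), 2))  # 5.33
```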

📊 Haddon's Matrix

Organizes prevention by phase and factor:

| Phase | Focus |
| --- | --- |
| Pre-injury (primary) | Prevent event from occurring |
| Injury (secondary) | Reduce severity during event |
| Post-injury (tertiary) | Minimize consequences after event |

Applied across: host (athlete), agent (equipment), physical environment (field conditions), social/economic environment (rules, costs, enforcement).

Example for baseball TBI: Pre-injury includes helmet design and athlete education; injury phase includes protective equipment and supervision; post-injury includes return-to-play compliance and access to trauma centers.

🔍 Proximal, medial, and distal causes

Component causes exist at different levels:

  • Proximal (downstream): immediate cause—e.g., lack of physical activity
  • Medial (midstream): cause of the cause—e.g., working three jobs
  • Distal (upstream): root cause—e.g., economic inequality, lack of living wage

Effective prevention targets distal causes through primordial and primary prevention.

📚 Uses and subfields

🎯 Two major uses

  1. Describe population health status and services

    • Health services research, policy, health promotion, history
    • Identify at-risk populations, note trends, diagnose community health
  2. Determine disease etiology

    • Biology, ecology, genetics, laboratory sciences
    • Understand causes, conditions, syndromes

🔬 Subspecialty areas

The excerpt lists 30+ subfields including:

  • Infectious disease, chronic disease, cancer, cardiovascular
  • Injury, sports/recreation, occupational
  • Pharmacoepidemiology, genetic, molecular
  • Social, environmental, global health
  • Clinical, field, veterinary epidemiology

Each has specific methods for helping populations; subject-matter experts available for nearly any health problem.


2. Measuring Things in Epidemiology

🧭 Overview

🧠 One-sentence thesis

Epidemiologists measure health issues through two main categories—counts and ratios—and must carefully choose between prevalence (all cases) and incidence (new cases) to understand disease burden, speed of spread, and the impact of interventions.

📌 Key points (3–5)

  • Two fundamental measurement types: counts (simple tallies) and ratios (one quantity divided by another, including rates, proportions, and percentages).
  • Prevalence vs. incidence: prevalence captures all existing cases at a point or period in time; incidence captures only new cases and requires tracking time at risk.
  • Common confusion: prevalence can increase even when incidence falls (e.g., better treatment extends survival), and incidence can rise due to better detection rather than true disease spread.
  • Why denominators matter: proportions require the numerator to be part of the denominator; rates must include time; choosing the right "at-risk" population is essential for accuracy.
  • Dynamic vs. fixed populations: fixed populations (closed cohorts) use cumulative incidence; dynamic populations (open, changing) use incidence rates with person-time or average population denominators.

📊 Core measurement categories

📊 Counts

Counts: the simplest quantitative measures in epidemiology, referring to the number of cases of a disease or health phenomenon being studied.

  • Counts are raw numbers without context.
  • Example: 2,920,260 men died of injury in the U.S. between 1981 and 2007.
  • Useful for absolute magnitude but hard to interpret without comparison.

📊 Ratios

Ratios: values obtained by dividing one quantity (count) by another (numerator over denominator).

  • Most epidemiologic numbers are ratios: rates, proportions, percentages.
  • Example: The sex ratio of injury deaths was 2.6:1 men to women, meaning "for every 2.6 injury deaths among men, there was one among women."
  • Ratios allow comparison across groups and time periods.

🔢 Proportions and magnitude

🔢 What proportions measure

Proportions: a measure that states a count relative to the size of the group; a ratio in which the denominator contains the numerator.

  • Can be expressed as a percentage.
  • Used to demonstrate the magnitude of a health problem.
  • Example: If 10 students in a 20-student dorm have strep throat, 50% are ill—a major problem requiring immediate action. If 10 out of 500 students are ill, only 2% are affected—less urgent.

🏥 Prevalence as a proportion

Prevalence: the number of existing cases of a disease or health condition in a population at some designated time.

  • Point prevalence: prevalence at a specific point in time.
  • Period prevalence: prevalence over a specified period of time.
  • Numerator for period prevalence = cases at the start + new cases during the period.
  • Denominator = average population over the time period.
  • Example: 69% of elite Swedish ice hockey goalkeepers reported a hip or groin injury at any point in the season—a large burden indicating need for prevention.

🧮 Calculating period prevalence

  • If 25 field hockey players start the year with shoulder injuries and 248 new injuries occur, with population changing from 1000 to 1200 players:
    • Numerator = 25 + 248 = 273
    • Denominator = (1000 + 1200) / 2 = 1100
    • Period prevalence = 273 / 1100 = 0.2482 or 24.82%
  • Don't confuse: the count of 273 cases is an absolute number; dividing by the average population makes it relative, and expressing it as a percentage gives context.
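The period-prevalence arithmetic above can be sketched as:

```python
# Period prevalence from the field-hockey example in the text:
# numerator = cases at start + new cases; denominator = average population.

def period_prevalence(start_cases, new_cases, pop_start, pop_end):
    numerator = start_cases + new_cases
    denominator = (pop_start + pop_end) / 2
    return numerator / denominator

pp = period_prevalence(25, 248, 1000, 1200)
print(round(pp, 4))  # 0.2482
print(f"{pp:.2%}")   # 24.82%
```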

⏱️ Rates and the role of time

⏱️ What rates require

Rate: a ratio that consists of a numerator and denominator and in which time forms part of the denominator.

  • Must contain: disease frequency, unit size of population, and time period during which an event occurs.
  • Often reported with multipliers (e.g., per 100,000 population or per 1,000 live births) to standardize comparisons.

🔍 Three types of rates

| Rate type | Definition | When to use |
| --- | --- | --- |
| Crude rate | Rawest version; simple numerator and denominator with no adjustment | Quick snapshot; use with caution when comparing populations |
| Adjusted rate | Statistical procedures remove effects of population composition differences (e.g., age, sex) | Comparing "apples to apples" across populations |
| Specific rate | Rate for a particular subgroup (e.g., age-specific, race-specific, cause-specific) | Examining a defined subset of the population |

  • Example: NCAA student-athlete deaths from all causes = 514 deaths / 450,000 athletes = 1.14 deaths per 1,000 student-athletes.
  • Don't confuse crude rates with true variation: observed differences may result from systematic factors (e.g., age distribution) rather than real disease differences.
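The crude-rate calculation with a multiplier can be sketched with the NCAA numbers above:

```python
# Crude rate with a standardizing multiplier (per 1,000 here),
# using the NCAA student-athlete example from the text.

def crude_rate(events, population, per=1_000):
    """Events per `per` persons, with no adjustment for composition."""
    return events / population * per

print(round(crude_rate(514, 450_000), 2))  # 1.14 deaths per 1,000 student-athletes
```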

🆕 Incidence: measuring new cases

🆕 What incidence captures

Incidence: the number of new cases of a disease that occur in a group during a certain time period.

  • Used to research disease etiology (causes) and estimate risk of developing disease.
  • Requires: numerator (number of new cases), denominator (population at risk), and time period.
  • Example: 20 new shoulder injuries in January among 975 at-risk field hockey players = 20.51 injuries per 1,000 players in Metro A in January 2020.

🧪 Population at risk

  • Only include people who are capable of having the outcome.
  • Example: If 25 players already have shoulder injuries at the start of January, they cannot get a new injury that month, so the at-risk population = 1000 - 25 = 975.
  • This denominator refinement improves accuracy.
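A minimal sketch of the at-risk refinement, using the numbers above:

```python
# Incidence with the at-risk denominator refined, as in the text's
# January shoulder-injury example: the 25 players already injured at the
# start of the month cannot become new cases and are excluded.

def incidence_per_1000(new_cases, population, existing_cases):
    at_risk = population - existing_cases  # only those who can become cases
    return new_cases / at_risk * 1_000

print(round(incidence_per_1000(20, 1000, 25), 2))  # 20.51 per 1,000 at-risk players
```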

⚖️ Incidence vs. prevalence

⚖️ The bathtub analogy

  • Incidence = water flowing into the tub (new cases).
  • Prevalence = water in the tub at any moment (all cases).
  • If the drain is closed (no recovery or death), prevalence builds up.
  • If the drain is open (cure or death), some cases exit; recurrences can re-enter.

⚖️ Key relationships

| Scenario | Effect on incidence | Effect on prevalence |
| --- | --- | --- |
| Physical therapy shortens acute hip pain duration | No direct effect | Down (shorter duration = fewer old cases) |
| Untreated rabies kills within 10 days | No direct effect | Down (short duration = low numerator) |
| Antiretroviral drugs extend HIV survival | No direct effect | Up (long duration = numerator grows) |
| Universal polio vaccination | Down (prevents new cases) | Down (fewer total cases over time) |
| Improved COVID-19 test accuracy | Up (detects more cases) | Up (more cases identified) |

  • Prevalence ≈ incidence × duration of illness.
  • Short duration + high incidence → prevalence similar to incidence (e.g., common cold).
  • Long duration + low incidence → prevalence grows faster than incidence (e.g., chronic diseases, HIV with treatment).
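The prevalence ≈ incidence × duration approximation can be illustrated numerically; the rates and durations below are hypothetical, not from the text.

```python
# The steady-state relationship prevalence ≈ incidence × duration,
# with hypothetical incidence rates (per person-year) and durations (years).

def steady_state_prevalence(incidence_rate, avg_duration):
    """Approximate prevalence when the population is in steady state."""
    return incidence_rate * avg_duration

# Common cold: high incidence, very short duration (~1 week)
print(round(steady_state_prevalence(2.0, 0.02), 2))   # 0.04
# Chronic disease: low incidence, 20-year duration
print(round(steady_state_prevalence(0.005, 20), 2))   # 0.1
```

Despite a 400-fold lower incidence, the chronic condition ends up with the higher prevalence because duration dominates.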

⚖️ Common confusion

  • Prevalence can increase even when incidence decreases if people live longer with the disease.
  • Incidence can increase due to better detection (more accurate tests) rather than true disease spread.
  • Always ask: "Is the change real, or is it due to measurement, treatment, or population changes?"

🧮 Calculating incidence in detail

🧮 Two general types

  1. Cumulative incidence: for fixed (closed) populations where everyone is tracked.
  2. Incidence rate (incidence density): for dynamic (open) populations where people come and go.

🧮 Cumulative incidence

Cumulative incidence: a proportion (ranges 0 to 1) used when you have a well-defined, closed population with complete or carefully tracked follow-up.

  • Formula (complete follow-up): new cases / initial study population
  • Example: 121 out of 1,000 ER patients return within a week = 0.121 or 12.1% cumulative incidence.
  • Also interpreted as the risk of having an outcome; its complement is cumulative survival (1 - cumulative incidence).

🧮 Handling incomplete follow-up

Classic Life Table (CLT)

  • Assumes censoring happens uniformly throughout the period.
  • Denominator adjustment: subtract half the censored cases.
  • Formula: new cases / (at risk - [censored/2])
  • Example: By month 6, 3 patients released, 4 censored → cumulative incidence = 3 / (10 - 2) = 3 / 8 = 0.375 or 37.5%.
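A sketch of the life-table adjustment, reproducing the month-6 example (half of the censored subjects are subtracted from the denominator):

```python
# Classic Life Table adjustment: censored subjects are assumed to be at
# risk for half the interval, so half their count is removed from the
# denominator. Numbers reproduce the month-6 example from the text.

def life_table_incidence(events, at_risk, censored):
    effective_at_risk = at_risk - censored / 2
    return events / effective_at_risk

print(life_table_incidence(3, 10, 4))  # 0.375, i.e., 37.5%
```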

Kaplan-Meier (K-M) Method

  • Calculates incidence every time an event occurs, giving full credit for time contributed.
  • Produces a "stair-step" survival curve.
  • Conditional probability at each event time; cumulative survival = product of all conditional survival probabilities.
  • Example: At time point 7, cumulative survival = 0.181 or 18.1%; cumulative incidence = 1 - 0.181 = 0.819 or 81.9%.
  • Don't confuse: K-M assumes censoring is independent of survival; if censored patients differ (sicker or healthier), results will be biased.
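A minimal Kaplan-Meier sketch on hypothetical follow-up data; the ten subjects below are invented for illustration (the 0.181 example in the text comes from a different dataset).

```python
# Kaplan-Meier: at each event time, multiply cumulative survival by the
# conditional survival (at risk - events) / at risk. Data are hypothetical.
# Each subject is (follow-up time, event flag: 1 = event, 0 = censored).

subjects = [(1, 1), (2, 0), (3, 1), (3, 1), (4, 0),
            (5, 1), (6, 0), (7, 1), (8, 0), (9, 0)]

def kaplan_meier(data):
    """Cumulative survival = product of conditional survival at each event time."""
    survival = 1.0
    for t in sorted({t for t, e in data if e == 1}):
        at_risk = sum(1 for ti, _ in data if ti >= t)      # still being followed
        events = sum(1 for ti, e in data if ti == t and e == 1)
        survival *= (at_risk - events) / at_risk           # conditional survival
    return survival

s = kaplan_meier(subjects)
print(round(s, 2))       # cumulative survival: 0.36
print(round(1 - s, 2))   # cumulative incidence: 0.64
```

Each censored subject contributes time up to the moment of censoring and then simply drops out of the at-risk count, which is exactly the "full credit for time contributed" idea above.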

🧮 Incidence rate for dynamic populations

Incidence rate (incidence density): a ratio (can range from zero to infinity) used for open or dynamic populations where people enter and exit.

  • Formula: new cases / person-time at risk (or average population)
  • Person-time: sum of time each person contributed to the study.
    • 5 people for 5 years = 25 person-years.
    • 25 people for 1 year = 25 person-years.
  • Assumption: participants with the event contributed half the relevant time period before their outcome.

🧮 Person-time example

  • 8 volleyball players observed over 4 months; 5 injuries occur.
  • Month 1: 7 uninjured × 1 month + 1 injured × 0.5 month = 7.5 person-months.
  • Month 2: 7 uninjured × 1 month = 7 person-months.
  • Month 3: 3 uninjured × 1 month + 4 injured × 0.5 month = 5 person-months.
  • Month 4: 3 uninjured × 1 month = 3 person-months.
  • Total person-time = 22.5 person-months.
  • Incidence rate = 5 injuries / 22.5 person-months = 0.22 injuries per person-month or 2.67 per person-year.
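The person-time bookkeeping above can be sketched as:

```python
# Person-time incidence rate for the 8-player volleyball example.
# Players injured in a month are assumed to contribute half that month,
# per the convention stated in the text.

# (uninjured player count, newly injured count) for each of the 4 months
months = [(7, 1), (7, 0), (3, 4), (3, 0)]

person_months = sum(uninjured * 1 + injured * 0.5
                    for uninjured, injured in months)
injuries = sum(injured for _, injured in months)

rate = injuries / person_months
print(person_months)        # 22.5
print(round(rate, 2))       # 0.22 injuries per person-month
print(round(rate * 12, 2))  # 2.67 injuries per person-year
```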

🧮 Athlete-exposure

  • In sports research, person-time is often replaced by athlete-exposures: one athlete present and participating in one practice or competition counts as one exposure.
  • Example: Division III preseason had 115,725 athlete-exposures and 562 game injuries → (562 / 115,725) × 1,000 = 4.86 game injuries per 1,000 athlete-exposures.

🌍 Population and disease dynamics

🌍 Demographic transition

Demographic transition: the change in population makeup due to births, deaths, and migration as populations move from agrarian to postindustrial.

  • Pattern: high births and deaths → low births and deaths.
  • Driven by improved economic conditions, sanitation, and health care.
  • Result: population grows initially, then stabilizes or declines; people live longer.
  • Five stages depicted in population pyramids: from young, triangular populations to older, rectangular populations.

🌍 Epidemiologic transition

Epidemiologic transition: an extension of the demographic transition; refers to how causes of death change as countries industrialize.

  • Agrarian societies: deaths from infectious diseases and reproductive complications.
  • Industrial/postindustrial societies: better sanitation and health care reduce infectious disease burden; people live longer and face noncommunicable diseases (heart disease, falls, chronic conditions).
  • Triple health burden in transitional periods: unfinished old set (communicable diseases), rising new set (chronic diseases, accidents), and lagging health care systems.

🌍 Epidemic curves

Epidemic curve (epi curve): a visual display of the onset of illness among cases associated with an outbreak.

  • X-axis: time frame (minutes, hours, days, weeks, months, years).
  • Y-axis: incidence of cases.
  • What you can learn: time trend, outliers, magnitude, pattern of spread, likely exposure period.

| Pattern | Description | Example |
| --- | --- | --- |
| Sporadic | Random, isolated cases over time | Creutzfeldt-Jakob Disease |
| Endemic | Steady, low-level presence | Influenza in the U.S. |
| Epidemic (point source) | Bell curve; single exposure event | Foodborne illness from a potluck |
| Epidemic (propagating) | Waves; initial cases spread to others | Measles outbreak |

  • Don't confuse point source (one-time exposure) with propagating source (person-to-person spread over multiple generations).

🔑 Key takeaways for measurement

🔑 Choosing the right measure

  • Use counts for raw magnitude; use ratios for context and comparison.
  • Use prevalence to assess burden, allocate resources, or estimate frequency of exposure.
  • Use incidence to study disease etiology, estimate risk, and evaluate prevention programs.
  • Use cumulative incidence for closed cohorts with complete follow-up.
  • Use incidence rate for dynamic populations or when follow-up varies.

🔑 Avoiding common pitfalls

  • Always define the at-risk population carefully (exclude those who cannot have the outcome).
  • Adjust rates when comparing populations with different compositions (age, sex, etc.).
  • Recognize that changes in prevalence or incidence may reflect measurement improvements, treatment advances, or population shifts—not just true disease changes.
  • Ensure censoring is independent of the outcome when using K-M or CLT methods; otherwise, results will be biased.

3. Study Designs

🧭 Overview

🧠 One-sentence thesis

Study designs in epidemiology differ primarily in whether researchers control exposures and randomization, and this choice determines the strength of evidence for causality and the measures of association that can be calculated.

📌 Key points (3–5)

  • Two main categories: observational studies (no manipulation, no randomization) vs. experimental studies (controlled factors, often randomized).
  • Temporality matters: only some designs (cohort, RCT) can establish whether exposure came before outcome—critical for proving causation.
  • Common confusion: retrospective cohort vs. case-control—cohort starts with exposure status and follows forward through records; case-control starts with known disease and looks backward for exposures.
  • Measures of association: odds ratio (OR) for case-control; relative risk (RR) for cohort; prevalence rate ratio for cross-sectional.
  • Strength hierarchy: evidence strength increases from case series → ecological → cross-sectional → case-control → cohort → RCT → meta-analysis, though all types are important and build on each other.

🔬 Observational vs experimental studies

🔬 The fundamental split: control

Observational studies: researchers do not manipulate study factors and do not randomize; they observe what happens naturally in a group.

Experimental studies: researchers do control factors and often use randomization to create conditions that reveal causal effects between exposure and outcome.

  • "Manipulate" does not mean fabricating data; it means setting study parameters (e.g., who gets medication vs. placebo).
  • Example: observing factory workers' health (observational) vs. enrolling cancer patients in a drug trial with random assignment to treatment groups (experimental).

🎲 Randomization explained

  • Randomization = using objective criteria to assign participants to study groups.
  • Methods include order of clinic arrival, random number generators, etc.
  • Purpose: reduce bias by preventing researchers or participants from choosing groups based on preferences or characteristics.
  • Example: three groups—placebo, standard care, new drug—assigned by random number.

📊 Key design details to track

When evaluating any study design, pay attention to:

  • Number of observations made
  • Directionality of exposure (forward in time, backward, or snapshot)
  • Data collection methods and timing
  • Unit of observation (individual vs. group)
  • Availability of subjects

🗂️ Types of observational studies

🗂️ Ecological studies

Ecological study: uses group summary measures for exposure and outcome, not individual-level data.

  • Compares populations (e.g., disease rate in France vs. US, or US in 1950 vs. 2000).
  • Ecologic fallacy: incorrectly assuming group-level findings apply to individuals, or vice versa.
  • Example: if southern US states have high heat-illness rates at track meets, that does not mean every individual from the south has higher risk; group data ≠ individual risk.
  • Useful for testing or developing hypotheses at the population level.
  • Cannot determine temporality.

📸 Cross-sectional studies

Cross-sectional study (prevalence study): measures prevalence of disease and exposures at one point in time—a "snapshot."

  • Does not show when exposure or disease started, only that they are present right now.
  • Cannot determine temporality (did exposure cause disease, or vice versa?).
  • Good for: describing disease burden, generating hypotheses, planning health services.
  • Measure: prevalence rate ratio (comparing prevalence between groups or time points).
  • Example: during COVID-19, researchers measured prevalence of inflammatory heart disease among professional athletes who tested positive between May–October 2020.

🔍 Case-control studies

Case-control study: starts with people who already have the disease (cases), finds similar people without disease (controls), then looks backward to compare past exposures.

  • Goal: identify whether a particular exposure could have caused the disease.
  • Great for rare diseases and urgent situations (e.g., outbreaks).
  • Cannot determine temporality definitively (we know disease is present, but not the exact sequence).
  • Measure: odds ratio (OR) of exposure.
  • Example: Hospital A has many post-surgery infections; researchers identify infected patients (cases) and uninfected surgical patients (controls), then investigate which surgery type (ACL reconstruction vs. repair) is more common among cases.

🚶 Cohort studies

Cohort study: starts with a group based on exposure status, then follows to see if disease develops after exposure.

  • Goal: find out whether disease comes after a particular exposure.
  • Great for rare exposures.
  • Can determine temporality (at least two data collection points that do not overlap).
  • Three types: prospective (set up today, follow forward), retrospective (set up today, look at past records), historical (combination).
  • Measure: relative risk (RR), sometimes OR.
  • Example: identify all patients at Hospital A eligible for ACL surgery, determine surgery type (exposure), check if they had infection before surgery (exclude if yes), then check if infection occurred after surgery.

🆚 Don't confuse: retrospective cohort vs. case-control

| Feature | Retrospective cohort | Case-control |
| --- | --- | --- |
| Starts with | Exposure status (e.g., surgery type) | Disease status (e.g., infection present/absent) |
| Follows | Forward through records to see if outcome occurred | Backward to identify possible exposures |
| Question | Does exposure lead to more disease? | What exposures are associated with known disease? |
| Temporality | Yes (two measurements: before and after) | No (one measurement: exposure + disease status at study start) |
| Example | All ACL surgery patients → check infection after surgery | All infected patients → check which surgery type they had |

🧪 Experimental studies

🧪 Community trials

Community trial (community intervention study): evaluates community-level interventions, policies, or behavior changes; randomization occurs at the community level.

  • Useful for policy evaluation, health behavior interventions.
  • Challenges: hard to control everything (people move in/out), impossible to ensure 100% participation.
  • Can establish causality.
  • Example: communities with fluoridated water have better oral health than those without.

💊 Clinical trials (RCTs)

Randomized controlled trial (RCT): tests efficacy of new medications, therapies, treatments, or preventatives; randomization occurs at the individual level.

  • Gold standard for establishing causality.
  • Researchers manipulate exposure and control all other variables.
  • Can use crossover design (same participants serve as both cases and controls at different times).
  • Multiphase structure (Phase 0: initial efficacy; Phase I: safety; Phase II: does it work?; Phase III: improvement in condition?; Phase IV: post-market surveillance).
  • Disadvantage: highly controlled environment may not reflect real-world effectiveness.
  • Example: Drug B reduces atrial fibrillation more than standard care in a Phase III trial.

⏱️ Temporality and directionality

⏱️ Why temporality matters

  • To establish causality, we must show exposure came before disease.
  • Not all studies measure temporality; not all are intended to.
  • Cohort studies and RCTs are best for answering "which came first?"

⏱️ Directionality by design

| Design | Data collection starts | Direction | Time points | Temporality? |
| --- | --- | --- | --- | --- |
| Cross-sectional | Present | Snapshot (all at once) | 1 | No |
| Case-control | Present | Backward (recall past exposures) | 1 | No |
| Retrospective cohort | Present | Backward through records, then forward | ≥2 | Yes |
| Prospective cohort | Present | Forward into future | ≥2 | Yes |
| RCT | Present | Forward (with random assignment) | ≥2 | Yes |

  • Cross-sectional: all questions asked at once; cannot tell if exposure preceded disease.
  • Retrospective cohort: use existing records to see exposure status before diagnosis, then check outcome after.
  • Case-control: start with known disease, look back for exposures; cannot definitively prove exposure came first.
  • Prospective cohort: identify exposure today, follow forward to see who develops disease.
  • RCT: randomly assign exposure today, follow forward; strongest evidence for causality.

📐 Measures of association

📐 The 2×2 table

All measures of association use a 2×2 table:

|  | Outcome (+) | Outcome (−) | Total |
| --- | --- | --- | --- |
| Exposed (+) | A | B | A + B |
| Exposed (−) | C | D | C + D |
| Total | A + C | B + D | A + B + C + D |

  • A: has outcome and exposure
  • B: exposure but no outcome
  • C: outcome but no exposure
  • D: neither outcome nor exposure

Beware: not everyone sets up tables the same way (exposure in rows vs. columns); always check before calculating.

📐 Odds ratio (OR)

Odds ratio: the odds of exposure among cases compared to the odds of exposure among controls.

  • Formula (case-control): OR = (A/C) ÷ (B/D) = (A×D) / (B×C) [cross-product ratio]
  • Interpretation:
    • OR ≈ 1 (0.9–1.1): no difference in exposure between outcome groups
    • OR > 1.1: group with outcome more likely to have exposure
    • OR < 0.9: group with outcome less likely to have exposure
  • Always specify the comparison: "The odds of exposure among cases are 3.2 times the odds among controls."
  • Can also calculate OR in cohort/cross-sectional/RCT, but interpretation differs (OR of disease vs. OR of exposure).
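As a quick sketch, the cross-product calculation can be written in a few lines of Python (the 2×2 counts below are hypothetical, purely for illustration):

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table: (A/C) / (B/D) = (A*D) / (B*C)."""
    return (a * d) / (b * c)

# Hypothetical case-control counts:
# 30 exposed cases, 70 exposed controls, 10 unexposed cases, 90 unexposed controls
print(round(odds_ratio(30, 70, 10, 90), 2))  # 3.86 -> cases have ~3.9x the odds of exposure
```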

📐 Relative risk (RR)

Relative risk: the risk (incidence) of outcome in the exposed compared to the risk in the unexposed.

  • Formula (cohort): RR = [A/(A+B)] ÷ [C/(C+D)]
  • Interpretation:
    • RR ≈ 1 (0.9–1.1): no difference in risk between exposure groups
    • RR > 1.1: exposed group more likely to develop disease
    • RR < 0.9: exposed group less likely to develop disease
  • Example: "Exposed group has 2.5 times the risk of disease compared to unexposed group."
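A minimal Python sketch of the cohort calculation, with hypothetical counts chosen to reproduce the 2.5 example:

```python
def relative_risk(a, b, c, d):
    """RR = risk in exposed / risk in unexposed = [A/(A+B)] / [C/(C+D)]."""
    return (a / (a + b)) / (c / (c + d))

# Hypothetical cohort: 25 of 100 exposed and 10 of 100 unexposed develop disease
rr = relative_risk(25, 75, 10, 90)  # ~2.5: exposed group has 2.5 times the risk
```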

📐 Prevalence rate ratio (PRR)

Prevalence rate ratio: compares prevalence between two groups or the same group at different times (used in cross-sectional studies).

  • Formula: PRR = (prevalence in group 1) ÷ (prevalence in group 2)
  • Name is a misnomer (prevalence is a proportion, not a rate), but formula is familiar.
  • Example: injury prevalence in Oklahoma vs. Texas, or Virginia in 2015 vs. 2020.
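The calculation itself is a simple division; a tiny Python sketch with hypothetical prevalences:

```python
def prevalence_ratio(prev_1, prev_2):
    """PRR = prevalence in group 1 / prevalence in group 2."""
    return prev_1 / prev_2

# Hypothetical: injury prevalence of 12% in one state vs. 8% in another
print(round(prevalence_ratio(0.12, 0.08), 2))  # 1.5
```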

📊 Measures of effect

📊 Attributable risk (AR)

Attributable risk (risk difference): of everyone with the exposure, how much of the disease occurrence is due to the exposure itself?

  • Formula: AR = [risk in exposed] − [risk in unexposed] = [A/(A+B)] − [C/(C+D)]
  • This is an absolute measure (a difference, not a ratio).
  • AR% = (AR / risk in exposed) × 100 = percentage of risk in exposed group attributable to exposure.
  • Example: AR = 0.625 − 0.500 = 0.125; AR% = (0.125/0.625)×100 = 20%. "20% of ankle sprains in racquet-sport players are due to playing racquet sports."
  • If intervention works (RR < 1), AR will be negative.
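The worked example above can be reproduced in Python (the 2×2 counts are hypothetical, chosen to give risks of 0.625 and 0.500):

```python
def attributable_risk(a, b, c, d):
    """AR = risk in exposed - risk in unexposed (an absolute difference)."""
    return a / (a + b) - c / (c + d)

risk_exposed = 25 / 40                    # 0.625
ar = attributable_risk(25, 15, 20, 20)    # 0.625 - 0.500 = 0.125
ar_pct = ar / risk_exposed * 100          # AR% ~ 20: 20% of cases in exposed attributable to exposure
```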

📊 Clinical measures: RRR, ARR, NNT, NNH

| Measure | Formula | Rounding | Meaning |
| --- | --- | --- | --- |
| Relative risk reduction (RRR) | 1 − RR | — | How much does the intervention reduce risk, relative to control? |
| Absolute risk reduction (ARR) | \|risk in control − risk in intervention\| | — | Absolute difference in risk between groups |
| Number needed to treat (NNT) | 1 / ARR | Up | How many must be treated to benefit one patient? |
| Number needed to harm (NNH) | 1 / AR (when AR is positive) | Down | How many exposed to harm one patient? |

  • ARR is broader than AR: compares intervention vs. control (not just exposed vs. unexposed).
  • Vertical bars | | mean absolute value (ignore negative sign for calculation, but remember direction).
  • Example: Intervention reduces ACL injury risk from 0.21 to 0.09. RR = 0.09/0.21 = 0.43. RRR = 1−0.43 = 0.57 (57% reduction). ARR = 0.21−0.09 = 0.12. NNT = 1/0.12 ≈ 8.3 → round up to 9. "Treat 9 athletes to prevent 1 ACL injury."
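The ACL example above can be checked with a short Python sketch (note `math.ceil` to round NNT up):

```python
import math

def clinical_measures(risk_control, risk_intervention):
    """Return (RRR, ARR, NNT) for an intervention compared with control."""
    rr = risk_intervention / risk_control
    rrr = 1 - rr                                  # relative risk reduction
    arr = abs(risk_control - risk_intervention)   # absolute risk reduction
    nnt = math.ceil(1 / arr)                      # number needed to treat, rounded up
    return rrr, arr, nnt

rrr, arr, nnt = clinical_measures(0.21, 0.09)
# rrr ~ 0.57 (57% reduction), arr ~ 0.12, nnt = 9
```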

📊 Population attributable risk (PAR)

Population attributable risk: the absolute level of risk in the whole population due to the exposure, including those without the exposure.

  • Formula: PAR = (risk in exposed − risk in unexposed) × (number exposed / total population)
  • Answers: "If we eliminate exposure from the entire population, how much disease burden disappears?"
  • Example: implementing ACL injury prevention in all NCAA women's basketball players reduces total noncontact ACL injury burden by <1% in that population—works well individually but not as a population-level intervention.
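A sketch of the formula with hypothetical numbers (the risks and exposure counts are invented for illustration):

```python
def population_attributable_risk(risk_exposed, risk_unexposed, n_exposed, n_total):
    """PAR = (risk difference) x (proportion of the population exposed)."""
    return (risk_exposed - risk_unexposed) * (n_exposed / n_total)

# Hypothetical: risk 0.02 in exposed vs. 0.01 in unexposed; 500 of 10,000 people exposed
par = population_attributable_risk(0.02, 0.01, 500, 10_000)  # ~0.0005
```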

🦠 Outbreak investigations

🦠 Definitions

Outbreak: disease occurrence in an area exceeding the normally expected number of cases (limited geographic area).

Epidemic: disease occurrence spread over an area broader than an outbreak; typically declared by national health bodies (e.g., CDC).

Endemic: disease occurring at expected level; normally present in that place.

Pandemic: epidemic spread over multiple countries/continents; declared by WHO.

🦠 The 11 steps

  1. Establish existence of outbreak: calculate prevalence/incidence; rule out false positives, lab errors, changes in surveillance, population shifts.
  2. Verify diagnosis: confirm all cases have the correct diagnosis (e.g., meningitis A, not B).
  3. Construct working case definition: standard criteria to classify someone as a case (clinical + lab + restrictions by time/place/person); emphasize sensitivity over specificity (better to include too many than miss true cases).
  4. Find cases systematically: use a line listing (table with one row per case, columns for demographics, exposure, outcome, dates).
  5. Perform descriptive epidemiology: look for patterns; calculate attack rate (percentage of exposed who are ill) for foodborne outbreaks.
  6. Develop hypotheses: about cause, risk factors, interventions.
  7. Evaluate hypotheses: test using measures of association.
  8. Reconsider, refine, reevaluate: iterate as new data arrive.
  9. Compare with lab/environmental studies: reconcile findings.
  10. Implement control/prevention: vaccine distribution, recall products, stop access to dangerous substances.
  11. Initiate/maintain surveillance: ongoing systematic data collection (e.g., CDC's MMWR, WONDER).

🦠 Attack rate (foodborne outbreaks)

Attack rate: percentage of those at risk who are actually ill.

  • Formula: (number ill with exposure) / (total exposed)
  • Example: 48 ate salad and got sick, 20 ate salad and stayed well → attack rate = 48/(48+20) = 48/68 = 70.6%. "70.6% of salad eaters are sick."
  • Compare attack rates across exposures to identify likely cause.
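Comparing attack rates across exposures is a small loop in Python; the food counts below are hypothetical apart from the salad figures in the example:

```python
def attack_rate(ill, well):
    """Attack rate = ill exposed / total exposed."""
    return ill / (ill + well)

# (ill, well) counts per food from a hypothetical line listing
exposures = {"salad": (48, 20), "chicken": (30, 38), "rolls": (25, 43)}
rates = {food: round(attack_rate(ill, well) * 100, 1)
         for food, (ill, well) in exposures.items()}
# rates["salad"] is 70.6, the highest, pointing to salad as the likely vehicle
```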

🦠 Case definitions: confirmed vs. possible

  • Confirmed case: meets all diagnostic criteria (clinical + lab).
  • Possible/probable case: meets several criteria but missing some tests or features.
  • Example: hemophagocytic lymphohistiocytosis (HLH) requires 5 of 8 criteria; if patient has only 3 criteria and missing tests, classify as "possible."

📋 Reporting standards

📋 Why standards matter

  • Help others understand study methods, potential biases, and how to replicate.
  • Different standards for different study types.

📋 Common standards

| Acronym | Full name | Use |
| --- | --- | --- |
| CONSORT | Consolidated Standards of Reporting Trials | RCTs |
| STROBE | Strengthening the Reporting of Observational Studies in Epidemiology | Observational studies |
| STARD | Standards for Reporting Studies of Diagnostic Accuracy | Diagnostic tests |
| PRISMA | Preferred Reporting Items for Systematic Reviews and Meta-Analyses | Systematic reviews/meta-analyses |
| CARE | Consensus-based Clinical Case Reporting Guideline Development | Case reports |
| SQUIRE | Standards for Quality Improvement Reporting Excellence | Quality improvement |

  • More standards available at EQUATOR network website.
  • Example: CARE Consortium published methods article in 2017 describing how they built a national concussion study with the Department of Defense.

🔢 Summary tables

🔢 Study design comparison (observational)

| Design | Temporality? | Unit | Good for | Measure | Advantage | Disadvantage |
| --- | --- | --- | --- | --- | --- | --- |
| Case series | No | Individual | Describe interesting cases | None | Share info, develop hypotheses | Not enough detail for treatment decisions |
| Ecological | No | Group | Test/develop hypotheses | Correlation, χ² | Quick, cheap | Ecologic fallacy, imprecise |
| Cross-sectional | No | Individual | Burden of disease, generate hypotheses | Prevalence rate ratio | Quick, cheap | Not good for rare diseases or etiology |
| Case-control | No | Individual | Rare diseases, outbreaks, test hypotheses | Odds ratio | Great for rare outcomes, cheap, fast | Not good for rare exposures, recall bias, no direct risk measure |
| Cohort | Yes | Individual | Etiology, rare exposures, temporal relationships | Relative risk, OR | Direct risk measure, shows temporality | Expensive, time-consuming, not good for rare diseases |

🔢 Study design comparison (experimental)

| Design | Temporality? | Unit | Good for | Advantage | Disadvantage |
| --- | --- | --- | --- | --- | --- |
| Community trial | Yes | Community | Community interventions, policy evaluation | Randomization, can establish causality | Hard to control all variables, not everyone participates |
| Clinical trial (RCT) | Yes | Individual | Test new medications/therapies/vaccines | Randomization, full control, can establish causality | Highly controlled → uncertain real-world applicability |

🔢 Measures by design

| Design | Measures of disease | Measures of risk | Temporality |
| --- | --- | --- | --- |
| Ecological | Prevalence (rough) | Prevalence ratio | Retrospective |
| Cross-sectional | Point/period prevalence | OR, prevalence OR, prevalence ratio, prevalence difference | Retrospective |
| Case-control | None | Odds ratio | Retrospective |
| Cohort | Point/period prevalence, incidence | OR, prevalence OR, prevalence ratio, prevalence difference, attributable risk, incidence rate ratio, RR, risk ratio, hazard ratio | Retrospective, prospective, or both |

4. Diagnostics and Screening

🧭 Overview

🧠 One-sentence thesis

Diagnostic and screening tests are prevention tools that help clinicians determine disease presence and guide treatment decisions, but their utility depends on balancing sensitivity and specificity while understanding how test characteristics and biases affect clinical interpretation.

📌 Key points (3–5)

  • Purpose of screening: Identify unrecognized disease early (before symptoms) to alter its natural course and improve outcomes.
  • Test validity measures: Sensitivity (ruling out disease), specificity (ruling in disease), PPV, and NPV quantify how well tests perform.
  • Trade-offs in test design: Increasing sensitivity reduces specificity and vice versa; the optimal balance depends on disease consequences and prevalence.
  • Common confusion: Sensitivity/specificity are fixed test properties, but PPV/NPV vary with disease prevalence in the population being tested.
  • Screening biases: Lead-time bias, length bias, and selection bias can make screening appear more effective than it actually is.

🎯 Purpose and applications

🎯 What diagnostic and screening tests do

Diagnostic and screening tests are primary and secondary prevention tools.

  • Classify patients into categories:
    • Diseased vs. nondiseased
    • Positive vs. negative
    • High vs. low risk
    • Exposed vs. unexposed
  • Measure disease burden in populations (prevalence)
  • Answer three clinical questions:
    • How should we treat individual patients?
    • How does the test affect study results?
    • Did false results bias the study we're reviewing?

⚠️ Potential problems with tests

  • Tests can produce incorrect or false results
  • Consequences of errors:
    • Treating patients who don't need it
    • Withholding treatment from those who do need it
    • Irreversible decisions (e.g., selective abortion, suicide after false-positive HIV test)
    • Decisions that may not be acceptable to the population served

🔍 Screening programs

🔍 What screening means

Screening for disease is the presumptive identification of unrecognized disease or defects by the application of tests, examinations, or other procedures that can be applied rapidly.

  • Positive screening results are followed by diagnostic tests to confirm actual disease
  • Example: Newborn tests positive for PKU screening → phenylalanine loading test confirms PKU
  • Common screening tests: Pap smear, mammogram, blood pressure, cholesterol, vision tests, urinalysis

🏛️ Social considerations

  • The health problem should be important for individual and community
  • Diagnostic follow-up and intervention must be available to all who need them
  • Must have a favorable cost-benefit ratio
  • Public acceptance must be high

🔬 Scientific considerations

  • Natural history of the condition should be adequately understood
  • This knowledge permits identification of early stages and appropriate biomarkers
  • Knowledge base exists for prevention efficacy and side effects
  • Prevalence of the disease or condition is high

⚖️ Ethical considerations

  • The program can alter the natural history in a significant proportion of those screened
  • Always ask: Can you do anything about changing the course of disease? If not, screening may do more harm than good
  • Must have suitable and acceptable tests for screening and diagnosis
  • Must have acceptable, effective methods of prevention

✅ Characteristics of a good screening test

A screening test should be:

| Characteristic | Description |
| --- | --- |
| Simple | Easy to learn and perform |
| Rapid | Quick to administer; results available rapidly |
| Inexpensive | Good cost-benefit ratio |
| Safe | No harm to participants |
| Acceptable | To target group |

A test may not meet all five criteria, but this should be a goal.

📊 Validity and reliability

📊 Core definitions

Internal validity is accuracy. It describes the ability of a measuring instrument to give a true measure.

Reliability is precision. It describes the ability of a measuring instrument to give consistent results on repeated trials.

  • Internal validity can be evaluated only if an accepted and independent method (gold standard) exists for confirming the test
  • Reliability measures consistency among repeated measurements of the same individual on more than one occasion

🎯 Precision vs. accuracy visualized

Think of a dartboard:

  • Precision (reliability): Darts clustered together in the same place, thrown repeatedly to the same spot
  • Accuracy (validity): Darts hit the bull's-eye (the true target)
  • Ideal research: Precise AND accurate (darts clustered at bull's-eye)
  • Worst case: Neither precise nor accurate (darts scattered, missing bull's-eye)

Key relationships:

  • Random error decreases precision
  • Systematic error decreases accuracy
  • Higher precision → lower standard deviation → higher statistical power

🚫 What reduces reliability and validity

Measurement bias: Constant errors from a faulty measuring device (e.g., miscalibrated blood pressure manometer) → reduces validity, because readings are consistently wrong even when they are repeatable

Halo effect: Observer's perception of patient characteristics influences observations, including knowledge of previous findings

  • Example: Health provider rates sexual behavior based on opinion about patient characteristics without obtaining specific current information

Social desirability: Respondent answers in a manner that agrees with socially desirable norms

  • Example: Teenage boys exaggerate frequency of sexual activities because it's perceived as "cool" among peers

🧮 Measuring test performance

🧮 The 2×2 table structure

Tests are evaluated using a table:

  • Columns: Results from the gold standard (disease present or absent)
  • Rows: Results from the new/comparison test (positive or negative)
  • Cells: True positives (TP), false positives (FP), false negatives (FN), true negatives (TN)

🎯 Sensitivity (true-positive rate)

The percent of people who truly have the disease that the screening test correctly identifies as positive.

  • Calculation: TP / (TP + FN)
  • The higher this number, the more people correctly identified as having the outcome
  • High sensitivity helps rule OUT disease when test is negative
  • SnNOUT: When a highly Sensitive test is NEGATIVE, it rules OUT disease

🎯 Specificity (true-negative rate)

The percent of people who truly do not have the disease that the screening test correctly identifies as negative.

  • Calculation: TN / (FP + TN)
  • The higher this number, the more people correctly identified as not having the outcome
  • High specificity helps rule IN disease when test is positive
  • SpPIN: When a highly Specific test is POSITIVE, it rules IN disease

📈 Positive predictive value (PPV)

The percent of positive results from the screening test that are true positives.

  • Calculation: TP / (TP + FP)
  • Tells you: "If my patient tests positive, what's the chance they actually have the disease?"

📉 Negative predictive value (NPV)

The percent of negative results from the screening test that are true negatives.

  • Calculation: TN / (FN + TN)
  • Tells you: "If my patient tests negative, what's the chance they actually don't have the disease?"

⚖️ Fixed vs. variable properties

Sensitivity and specificity are FIXED properties:

  • They tell us how good a test is
  • They remain the same no matter when the test is run

PPV and NPV VARY depending on disease prevalence:

  • If prevalence is high: PPV is high, NPV is low
  • If prevalence is low (rare disease): PPV is low, NPV is high
  • They tell us how to interpret our patient's test results
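This dependence on prevalence can be demonstrated with a short Python sketch; the sensitivity and specificity values are hypothetical, and the formulas simply re-express the 2×2 cells in terms of prevalence:

```python
def ppv(sens, spec, prev):
    """P(disease | positive test) at a given disease prevalence."""
    tp = sens * prev
    fp = (1 - spec) * (1 - prev)
    return tp / (tp + fp)

def npv(sens, spec, prev):
    """P(no disease | negative test) at a given disease prevalence."""
    tn = spec * (1 - prev)
    fn = (1 - sens) * prev
    return tn / (tn + fn)

# Same hypothetical test (sensitivity = specificity = 0.90) in two populations:
low_prev  = ppv(0.90, 0.90, 0.01)   # rare disease:   PPV ~ 0.08
high_prev = ppv(0.90, 0.90, 0.30)   # common disease: PPV ~ 0.79
```

The test itself never changes; only the population does, yet a positive result means something very different in each setting.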

🔄 Optimizing sensitivity and specificity

Imagine two overlapping bell curves (diseased vs. non-diseased populations):

Moving the cutoff LEFT (increase sensitivity):

  • Identify more positive cases
  • Good at ruling OUT disease
  • Trade-off: More false positives, lower specificity
  • Use when: Disease has high consequences (HIV, cancer)—don't want to miss any cases
  • Follow up with confirmatory tests for false positives

Moving the cutoff RIGHT (increase specificity):

  • Fewer false positives
  • Good at ruling IN disease
  • Trade-off: More false negatives, lower sensitivity
  • Use when: Misdiagnosis is less acceptable or treatment has significant risks

Don't confuse: The same test can be adjusted for different purposes by changing the threshold for what counts as "positive."

📊 Likelihood ratios

Likelihood ratio: How much more likely a particular test result is for people with the disease compared to those without.

LR+ (positive likelihood ratio):

  • Calculation: Sensitivity / (1 - Specificity) = TP rate / FP rate
  • LR+ > 10: a positive result strongly argues for disease (behavior typical of a highly specific test)

LR- (negative likelihood ratio):

  • Calculation: (1 - Sensitivity) / Specificity = FN rate / TN rate
  • LR- < 0.1: a negative result strongly argues against disease (behavior typical of a highly sensitive test)

This is very important in clinical decision-making.
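A minimal sketch of both ratios, using a hypothetical test with sensitivity 0.95 and specificity 0.90:

```python
def likelihood_ratios(sens, spec):
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_pos = sens / (1 - spec)   # TP rate / FP rate
    lr_neg = (1 - sens) / spec   # FN rate / TN rate
    return lr_pos, lr_neg

lr_pos, lr_neg = likelihood_ratios(0.95, 0.90)
# lr_pos ~ 9.5 (near the "rule in" benchmark of 10)
# lr_neg ~ 0.056 (approaching the "rule out" benchmark of 0.1)
```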

🔧 How to improve sensitivity and specificity

  • Retrain people doing measurements (reduces misclassification in tests requiring human assessment)
  • Recalibrate the screening instrument (reduces imprecision in tools like scales)
  • Use a different test
  • Use more than one test
  • Use visuals to help participants choose valid answers (e.g., pain scales with faces or colors instead of subjective descriptions)

⚠️ Errors and biases

⚠️ Type I and Type II errors

Type I error (false positive):

  • Telling a patient they have the disease when they don't
  • Example: Telling a patient with an unbroken leg that it's broken
  • Consequences: Unnecessary treatment, anxiety, cost

Type II error (false negative):

  • Telling a patient they don't have the disease when they do
  • Example: Telling a patient with a broken leg that it's not broken
  • Consequences: Delayed treatment, potential for further harm, loss of trust
  • Often more egregious than Type I errors

🎭 Sources of bias in screening

Selection bias:

Incorrectly picked the population to study, resulting in a nonrepresentative study group.

  • Example: Women and men with family history are more likely to volunteer for breast cancer study than those without—these groups have differing risk levels
  • Strategies to reduce: Randomize; be strategic in recruitment; ensure study population is representative; include relevant factors (like family history) as study variables

Lead-time bias:

When earlier detection makes survival appear longer only because the clock starts at diagnosis sooner, not because death actually comes later.

  • Example: Two patients die at 68 of lung cancer. One diagnosed at 50 with screening, other symptomatic at 65. Earlier diagnosis creates illusion of longer survival.
  • Strategies to reduce: Adjust survival time based on disease severity at diagnosis; compare stage-matched patients

Length-time bias:

Screening preferentially detects slowly progressing disease (long latency), so screened cases appear to survive longer than cases overall.

  • Example: Patients with slow-growing tumors are overrepresented in screened populations, leading to overestimation of survival for patients with fast-growing tumors
  • Strategies to reduce: Randomize patients to screened vs. not screened groups; calculate survival time for both groups to find true differences

Don't confuse: Lead-time bias is about when you detect disease; length-time bias is about which cases you're more likely to detect.

🌱 Natural history of disease

🌱 Why natural history matters

A screening test is really useful only if there is something we can do to change the natural history of the disease:

  • Increase survival
  • Change quality of life
  • Eliminate disease better
  • Or something similar

⏱️ Lead time vs. actual survival

Scenario: Two patients, disease begins at age 50 for both

Patient A (screened):

  • Asymptomatic at 55, screened and diagnosed
  • Treatment begins immediately
  • Dies at 70
  • Survival time: 15 years (from diagnosis to death)
  • Had 5 years of lead time (early detection before symptoms)

Patient B (not screened):

  • Becomes symptomatic at 60, diagnosed then
  • Dies at 70
  • Survival time: 10 years (from diagnosis to death)

Analysis:

  • Both had disease at same point (age 50)
  • Both died at same age (70)
  • Adjusted survival is the same for both
  • Assuming Patient A lived longer would be lead-time bias
  • However, Patient A may have had better quality of life during illness due to early treatment

True benefit scenario: If Patient A died at 75 instead of 70, that would indicate:

  • Actual longer survival (not just lead time)
  • Screening genuinely changed natural history
  • Both longer survival AND better quality of life indicate screening is beneficial

Don't confuse: More years from diagnosis to death (survival time) is not the same as actually living longer (true survival benefit). Lead time can create the illusion of benefit without actual benefit.


5. The Wrecking Ball: Bias, Confounding, Interaction and Effect Modification

🧭 Overview

🧠 One-sentence thesis

Three major threats—bias, confounding, and effect modification—can distort study results and lead researchers to incorrect conclusions about the relationship between exposures and outcomes, but each can be identified and controlled through careful study design and analysis.

📌 Key points (3–5)

  • Three distinct threats: Bias is a systematic design error; confounding is a real third factor that distorts relationships for everyone equally; interaction/effect modification is a third factor that affects different groups differently.
  • Bias comes in two main forms: selection bias (error in choosing participants) and information bias (error in collecting or classifying data), both of which compromise validity.
  • Confounding requires three criteria: the third factor must not be in the causal pathway, must relate to both exposure and outcome, and must be unequally distributed across comparison groups.
  • Common confusion—confounding vs. effect modification: If stratum-specific measures are similar to each other but different from the crude measure, confounding is present; if stratum-specific measures differ from each other, effect modification is present.
  • Prevention is key: Most threats are best addressed through careful study design, clear protocols, validated measures, and appropriate analytic procedures rather than after-the-fact corrections.

🎯 What can explain an association

🎯 Four possible explanations

When a study finds an association (e.g., a relative risk of 4.1), four things might explain it:

| Explanation | What it means | How to address it |
| --- | --- | --- |
| Random variability | Chance fluctuation | Statistical precision estimates (p-value, confidence interval) |
| Causal relationship | True cause-and-effect | Bradford Hill criteria, randomization, regression, rule out alternatives |
| Bias | Systematic design error | Standardize questions, use objective data, validated measures |
| Confounding | Third factor distortion | Adjustment, restriction, randomization, matching, regression |

🔍 The validity connection

Validity: Whether study results are correct on average.

  • Valid study = designed and executed correctly → unbiased results that are correct on average
  • Invalid study = design or execution errors → biased results that are incorrect on average

Example: The landmark 1950 Doll and Hill case-control study linking smoking and lung cancer had to carefully rule out bias and confounding before a causal relationship could be argued.

🚨 Bias: systematic design errors

🚨 Core definition and prevention

Bias: A systematic error in how a study is designed.

Bias creates a false difference in relationships. Unlike confounding (which is real), bias is an artifact of poor design.

Three prevention strategies:

  • Use appropriate study design
  • Establish valid and reliable data collection methods
  • Use appropriate analytic procedures

Don't confuse: Bias cannot be confirmed in a single study (would require infinite studies to see the truth), so prevention through good design is essential.

🎭 Selection bias: who gets picked

Selection bias: When individuals have different probabilities of being selected according to their exposure and outcome status.

Impact:

  • Compromises external validity (results don't apply to other populations)
  • Compromises internal validity (results inaccurately represent the actual relationship)

Common types:

| Type | Definition | Example |
| --- | --- | --- |
| Volunteer bias | Volunteers differ clinically from non-volunteers | People with family history of cancer more likely to volunteer for breast cancer prevention study |
| Sampling bias | Certain people have greater selection chance | Picking friends as participants due to convenience |
| Survivorship bias | Only survivors are selected | COVID-19 study ten years later captures only those with milder disease who survived |
| Attrition bias | People who leave differ from those who stay | Participants with comorbidities leave early, leaving healthier participants |
| Non-response bias | Non-responders differ from responders | Older, less computer-savvy people less likely to respond to email invitations |

📊 Information bias: how data is collected

Information bias: When people systematically get placed in the wrong classification group for exposure and/or outcome (misclassification).

Two subtypes:

Non-differential misclassification: Misclassification is unrelated to disease/exposure status (affects everyone the same way regardless of group).

Differential misclassification: Misclassification is related to disease/exposure status (affects groups differently).

Example: If study interviewers know a participant is an athlete and select "yes, concussion" no matter what the participant says, this is measurement bias (a type of information bias).

Common information bias types:

| Type | What happens | Example |
| --- | --- | --- |
| Recall bias | Participants can't remember past information | Patients can't recall their blood type and randomly select an incorrect one |
| Measurement bias | Data not accurately measured or categorized | Interviewers who know participant status influence classification |
| Observer-expectancy bias | Researchers influence responses | Researcher suggests the "correct" answer to a participant |
| Response bias | Participants answer based on social acceptability | Patients overreport fruit/vegetable consumption |
| Procedure bias | Study administration creates pressure | Survey about supervisor completed in front of supervisor |

🛡️ Checking for bias: key questions

Selection bias questions:

  • Was the study population clearly defined?
  • What were inclusion and exclusion criteria?
  • Were refusals and losses kept to minimum?
  • In cohort studies: Were groups similar except for exposure? Was follow-up adequate and similar?
  • In case-control studies: Did controls represent the population cases came from?

Information bias questions:

  • Were measurements as objective as possible?
  • Were subjects and observers blind?
  • Were observers rigorously trained?
  • Were written protocols used to standardize data collection?
  • Was patient-provided information validated against records?
  • Were measurement methods validated?

🍎 Case-based control selection

Case-based control selection: Selecting controls from the same pool of people that cases come from.

Why it matters: People who participate in health screenings (e.g., mammograms) differ systematically from non-participants (more likely to have family history, different age profiles). Comparing them to the general population is like comparing apples to oranges.

Solution: Select controls from the same screening population (Granny Smith apples vs. Gala apples).

Example: For a study on long-term effects of ankle injuries from sport using ER records, reframe the research question to "long-term effects of ankle injuries from sport that seek treatment in the ER" to reduce selection bias. This is compensating bias—attempting to equalize bias across comparison groups.

🔄 Inclusion and exclusion criteria

Inclusion criteria: Definitive list of characteristics participants must have to enroll.

Exclusion criteria: List of characteristics participants must not have.

Minimal example:

  • Inclusion: Children who go to Alpha Elementary School
  • Exclusion: Children who attend any other school

Detailed example:

  • Inclusion: Children ages 5–7, in kindergarten or first grade, lived in town Alpha since birth, bring own lunch
  • Exclusion: Children younger than 5 or older than 7, not born in Alpha, not lived in Alpha entire life, eat school meals or don't eat meals

Clear criteria minimize selection bias by defining exactly who should and should not be in the study.

🌀 Confounding: the real third factor

🌀 Core definition and criteria

Confounding: A third factor that makes you misinterpret the relationship between an exposure and an outcome.

Key distinction: Unlike bias (a design error), confounding is a real factor in the relationship. The confounder is unequally distributed across the population and affects everyone the same way.

Three criteria for a confounder:

  1. Not in the causal pathway: The exposure does not lead to this factor, which then leads to the outcome
  2. Related to both exposure and outcome:
    • Relationship to exposure can be causal or non-causal
    • Relationship to outcome must be causal (confounder causes outcome)
  3. Unequally distributed: The factor's level differs across comparison groups (if distribution were equal, no confounding because influence would be identical)

Don't confuse: A confounder cannot be caused by the disease, doesn't have to be a causal risk factor, but must predict future disease development.

🔍 Finding confounders

Four strategies:

  • Find a subject matter expert
  • Review the literature
  • Think outside the box
  • Draw out possible causal relationships using:
    • Directed Acyclic Graph (DAG): conceptual representation of relationship series
    • Web of causation: conceptual representation of relationship series

Example: A web of causation for maternal and infant mortality shows how structural determinants (slavery, Jim Crow, redlining) shape social determinants (food stability, education, income, housing), which through multiple interconnected pathways lead to increased mortality rates and health inequities.

📏 Stratification: assessing and controlling confounding

Stratification: A tool that allows researchers to look at how results change depending on comparison groups.

Stratification can both assess (look for) and control for/adjust (take care of) confounding.

Other control methods:

  • Restriction (design phase)
  • Matching (design phase)
  • Regression (analysis phase)

Five stratification steps:

Step 1: Calculate crude measure

  • Calculate the measure of association (e.g., OR, RR) considering only exposure and outcome
  • This is the crude or unadjusted measure
  • Example: OR_crude = 3.2 for school sport participation and ankle sprains

Step 2: Calculate stratum-specific measures

  • Calculate the same measure separately for each level of the third factor
  • Example: OR_high income = 4.0; OR_low income = 3.8

Step 3: Compare stratum-specific measures to each other and to crude

  • Use the "eyeball method": Are they within 10% of each other?
  • Example: 10% of 4.0 = 0.4; range is 3.6 to 4.4; 3.8 falls within this range → stratum-specific measures are similar
  • Also check if crude OR is similar to both stratum-specific estimates
  • Example: 3.2 is outside both ranges → crude differs from both

Decision rules at Step 3:

  • If stratum-specific measures are not similar to each other (especially if on opposite sides of crude): STOP → effect modification/interaction likely; report only stratum-specific measures
  • If stratum-specific measures are similar to each other and to crude: confounding unlikely; STOP
  • If stratum-specific measures are similar to each other but different from crude: proceed to Step 4
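The eyeball method behind Step 3 and these decision rules reduces to a one-line comparison. A sketch using the numbers from the text (the function name and 10% threshold are this sketch's own):

```python
def within_10_percent(reference, other):
    """Eyeball method: is `other` within +/-10% of `reference`?"""
    return abs(other - reference) <= 0.10 * reference

# Step 3 example from the text: stratum ORs 4.0 and 3.8, crude OR 3.2
print(within_10_percent(4.0, 3.8))  # strata similar to each other
print(within_10_percent(4.0, 3.2))  # crude differs -> proceed to Step 4
```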

Step 4: Calculate adjusted measure

  • Calculate a pooled measure that accounts for the strata
  • Often use Mantel-Haenszel (M-H) estimate (for OR, RR, or other measures)
  • Example: M-H OR = 3.9
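The M-H estimate pools the strata by weighting each 2x2 table by its size. A minimal sketch, where the stratum counts are hypothetical, chosen so the stratum ORs come out to 4.0 and 3.8 as in the running example:

```python
def mantel_haenszel_or(strata):
    """M-H pooled odds ratio over strata of 2x2 tables (a, b, c, d),
    where each table's total is n = a + b + c + d."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Illustrative tables (e.g., high-income and low-income strata);
# their stratum ORs are 4.0 and 3.8, matching the text.
strata = [(40, 10, 50, 50), (38, 10, 50, 50)]
print(round(mantel_haenszel_or(strata), 1))  # pooled estimate near 3.9
```

Note the pooled value sits between the stratum-specific estimates, not between them and the crude OR.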

Step 5: Compare adjusted measure to crude

  • Use eyeball method again
  • Example: 10% of 3.9 = 0.39; range is 3.51 to 4.29; crude OR of 3.2 is outside this range
  • If similar: no confounding → report crude OR
  • If different: confounding likely present → report adjusted measure (M-H OR)

Example conclusion: "There is a positive association between school sport participation and ankle sprains when adjusting for family income (M-H OR = 3.9)."

⚡ Interaction and effect modification: different effects for different people

⚡ Core distinction

Effect modification: When the effect of the exposure on the outcome is modified by the level of a third factor (the effect modifier).

Interaction: When the observed joint effect of a risk factor and the third factor differs from the effect expected by combining their individual effects.

Similarity: Both refer to a third factor that influences the exposure-outcome relationship differently for different people.

Key difference:

  • Effect modification = biological interaction (antagonism and synergy); based on homogeneity and heterogeneity
  • Interaction = statistical interaction (additive and multiplicative)

In simple terms: Effect modification is about biology; interaction is about statistics.

Don't confuse: Journal articles often use these terms interchangeably, so interpret results carefully.

🔬 Why it matters

Three applications:

  1. Identify health disparities: Find which groups have much higher disease risk or poor outcomes
  2. Guide interventions: Different groups may need different prevention or treatment approaches
  3. Pharmacology: Understand how drugs or treatments interact with each other

🎯 Effect modification example: breast cancer

Scenario: Crude odds ratio for developing breast cancer, comparing people living in the United States with people living elsewhere = 8.3

After stratification by sex assigned at birth:

  • People assigned female at birth: OR = 12.3
  • People assigned male at birth: OR = 2.5

Interpretation: These are very different risk profiles. Following the Step 3 decision rules, the stratum-specific measures are not similar to each other → stop: effect modification is present, so report the stratum-specific measures rather than a pooled estimate.
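Applying the eyeball method from the stratification steps to these stratum-specific ORs shows at a glance why the groups should not be pooled (variable names are illustrative):

```python
def within_10_percent(reference, other):
    """Eyeball method: is `other` within +/-10% of `reference`?"""
    return abs(other - reference) <= 0.10 * reference

# ORs from the breast cancer example
or_female, or_male = 12.3, 2.5
print(within_10_percent(or_female, or_male))  # strata differ -> effect modification
```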

Reporting:

  • "The probability of developing breast cancer is higher for people assigned female at birth living in the United States than for those living elsewhere (OR_female = 12.3)."
  • "People assigned male at birth living in the United States also have a higher probability compared with those living elsewhere (OR_male = 2.5), though lower than people assigned female at birth."

Impact: Changes intervention approaches, treatment choices, prevention methods, and patient discussions. Do not combine these groups or analysis and interpretation will be erroneous.

📋 Summary and key takeaways

📋 Bias prevention

  • Set guidelines and stick to them
  • Be careful about participant selection
  • Be careful about comparison group selection
  • Be careful about information collection to avoid misclassification

📋 Confounding management

  • Confounding is real—think about all possible relationships during design and analysis
  • Failing to address confounding results in erroneous conclusions
  • Use stratification, restriction, matching, or regression to control

📋 Effect modification recognition

  • Can reveal which patients benefit from particular therapies
  • If effect modification exists, do not combine groups that are different
  • Report stratum-specific results separately to avoid errors in analysis and interpretation