What Are Statistics
🧭 Overview
🧠 One-sentence thesis
Statistics is not merely a collection of numerical facts but a discipline for analyzing, interpreting, displaying, and making decisions based on data. It demands careful attention to how numbers are chosen and interpreted so that conclusions are not misleading.
📌 Key points
- Statistics as facts vs. discipline: Statistics includes numerical facts (e.g., earthquake measurements, demographic data) but more broadly refers to techniques for analyzing and interpreting data.
- Numbers can be right, interpretations wrong: The same numerical data can lead to flawed conclusions if context, causation, or confounding factors are misunderstood.
- Common confusion: correlation vs. causation: A statistical relationship between two variables does not prove one causes the other; third variables or temporal effects may be responsible.
- Incomplete information misleads: Percentages and comparisons without baseline rates or full context can create false impressions about trends or societal changes.
- Why it matters: Understanding statistics properly is essential because misinterpretation leads to incorrect decisions and false beliefs about cause-and-effect relationships.
📊 Two faces of statistics
📊 Statistics as numerical facts
The excerpt opens with examples of statistics as concrete numbers:
- Earthquake magnitudes (9.2 on the Richter scale)
- Crime ratios (men commit murder at 10× the rate of women)
- Health statistics (1 in 8 South Africans HIV positive)
- Demographic projections (15 elderly per newborn by 2020)
These are descriptive facts and figures: quantitative statements about the world.
🔬 Statistics as a discipline
In the broadest sense, "statistics" refers to a range of techniques and procedures for analyzing, interpreting, displaying, and making decisions based on data.
- The study involves both mathematical calculations and critical thinking about how numbers are chosen and interpreted.
- It's not just computation; it requires understanding context, sources of bias, and alternative explanations.
- Don't confuse: Statistics (the discipline) with statistics (individual numbers). The former is the methodology for working with the latter.
⚠️ Three common interpretation errors
⚠️ History effects (temporal confounding)
Scenario from excerpt: Ice cream sales increased 30% in the three months after a new advertisement launched in late May, leading to the conclusion that the ad was effective.
The flaw:
- Ice cream consumption naturally rises in June, July, and August regardless of advertising.
- This is called a history effect: outcomes are attributed to one variable (the ad) when another variable related to the passage of time is actually responsible.
- Example: Any ice cream brand would likely see increased sales in summer months; the ad may have had no effect at all.
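One way to check for a history effect is to compare the lift seen after the ad with the seasonal lift in a year that had no ad. The sketch below assumes made-up monthly sales figures (the excerpt reports only the 30% increase, not raw data), and `percent_lift` is a hypothetical helper introduced for illustration.

```python
# Hypothetical monthly unit sales; the excerpt reports only a 30% rise,
# so every figure here is invented for illustration.
sales_no_ad_year = {"Mar": 100, "Apr": 100, "May": 100,
                    "Jun": 130, "Jul": 130, "Aug": 130}
sales_ad_year = {"Mar": 110, "Apr": 110, "May": 110,
                 "Jun": 143, "Jul": 143, "Aug": 143}

SPRING = ("Mar", "Apr", "May")  # three months before the ad
SUMMER = ("Jun", "Jul", "Aug")  # three months after the late-May launch


def percent_lift(sales, before, after):
    """Percent change in total sales from the `before` months to the `after` months."""
    base = sum(sales[m] for m in before)
    later = sum(sales[m] for m in after)
    return 100 * (later - base) / base


seasonal_lift = percent_lift(sales_no_ad_year, SPRING, SUMMER)
ad_year_lift = percent_lift(sales_ad_year, SPRING, SUMMER)

print(f"Summer lift without the ad: {seasonal_lift:.1f}%")
print(f"Summer lift with the ad:    {ad_year_lift:.1f}%")
print(f"Lift beyond the seasonal pattern: {ad_year_lift - seasonal_lift:.1f} points")
```

In this invented data set the summer lift is the same with and without the ad, so the 30% rise says nothing about the ad. Real data would also need to rule out other time-related changes, such as price or weather.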
🔗 Third-variable problem (spurious correlation)
Scenario from excerpt: Cities with more churches have more crime, leading to the conclusion that churches lead to crime.
The flaw:
- Both variables are caused by a third factor: population size.
- Bigger cities have both more churches and more crime simply because they have more people.
- Don't confuse: A correlation (two things occurring together) with causation (one causing the other).
- The excerpt notes that Chapter 6 discusses this third-variable problem in detail: people erroneously believe in a causal relationship between two variables when a third variable causes both.
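The church and crime scenario can be illustrated with a small simulation. This is a sketch under stated assumptions, not real data: the population range, the rates of one church per 1,000 residents and one crime per 100 residents, and the noise level are all invented, and `pearson` is a hand-written helper for the Pearson correlation coefficient.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated cities (all parameters are assumptions): population drives both
# counts, and churches and crimes never influence each other.
populations = [rng.uniform(10_000, 1_000_000) for _ in range(500)]
churches = [p / 1_000 * rng.uniform(0.8, 1.2) for p in populations]
crimes = [p / 100 * rng.uniform(0.8, 1.2) for p in populations]

raw_r = pearson(churches, crimes)

# Dividing by population removes the shared cause.
churches_per_capita = [c / p for c, p in zip(churches, populations)]
crimes_per_capita = [c / p for c, p in zip(crimes, populations)]
per_capita_r = pearson(churches_per_capita, crimes_per_capita)

print(f"Correlation of raw counts:       {raw_r:.2f}")
print(f"Correlation of per-capita rates: {per_capita_r:.2f}")
```

The raw counts correlate strongly even though neither causes the other in the simulation. Once both are expressed per resident, the correlation falls close to zero, which is what a third-variable explanation predicts.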
📉 Missing baseline information
Scenario from excerpt: Interracial marriages increased 75% compared to 25 years ago, leading to the conclusion that society now accepts interracial marriages.
The flaw:
- The percentage increase is meaningless without knowing the baseline rate.
- If only 1% of marriages were interracial 25 years ago, a 75% increase brings the rate to just 1.75% (1% × 1.75), which is hardly evidence of widespread acceptance.
- Additional missing information: Has the rate fluctuated over the years? Is this year actually the highest?
- Key lesson: Relative changes (percentages) can be dramatic even when absolute numbers remain very small.
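The arithmetic behind this lesson can be sketched in a few lines. The function name `apply_relative_increase` is hypothetical, and only the 1% baseline comes from the excerpt; the 10% and 40% baselines are assumptions added to show how much the same 75% increase depends on the starting point.

```python
def apply_relative_increase(baseline_pct, increase_pct):
    """Return the new rate (in percent) after a relative increase."""
    return baseline_pct * (1 + increase_pct / 100)


# Only the 1% baseline appears in the excerpt; the others are hypothetical.
for baseline in (1.0, 10.0, 40.0):
    new_rate = apply_relative_increase(baseline, 75)
    print(f"baseline {baseline:4.1f}% -> new rate {new_rate:5.2f}% "
          f"(+{new_rate - baseline:.2f} percentage points)")
```

The same relative increase adds less than one percentage point at a 1% baseline and tens of points at a 40% baseline, which is why a percentage change reported without its baseline is hard to interpret.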
🧮 What statistics requires
🧮 Math and calculation
- The study involves mathematical computations with numbers.
- Quantitative skills are necessary but not sufficient.
🤔 Critical interpretation
The excerpt emphasizes that statistics "relies heavily on how the numbers are chosen and how the statistics are interpreted."
What this means:
- Where did the data come from? (sampling, measurement)
- What context is needed to understand the numbers?
- What alternative explanations exist?
- What information is missing?
The central warning: "You will find that the numbers may be right, but the interpretation may be wrong."
| Aspect | What's needed | Why it matters |
|---|---|---|
| Calculation | Math skills | Get the numbers right |
| Selection | Understanding data sources | Avoid biased samples |
| Interpretation | Critical thinking | Avoid false conclusions |
| Context | Domain knowledge | Understand what numbers mean |