Confidence Intervals: The Most Important Number in the Paper
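A Confidence Interval (CI) is a range of plausible values for the true effect size, built so that it captures the true value in a stated share of studies (usually 95%). Unlike a p-value, it shows both how large the effect might be and how precisely the study measured it.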
P-values are a simple "Yes/No" switch. Confidence Intervals are the story.
The Symptom: The "Significant" Nothingburger
A new study claims a "statistically significant increase" in IQ from a brain training game.
- P-value: 0.04 (Significant! Hooray!).
- Conclusion: The game works!
But how well does it work? Does it raise IQ by 20 points (genius level) or 0.1 points (meaningless)? The p-value cannot tell you. It only says that a result like this would be unlikely if the true effect were exactly zero.
The Mechanism: The Fishing Net
Imagine the "True Effect" of a drug is a fish swimming in a murky lake. You can't see the fish directly. You have to estimate its location by throwing a net.
The Confidence Interval is the net.
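- 95% Confidence: If we repeated this experiment 100 times and threw 100 nets, we would expect about 95 of them to catch the fish. Any single net either caught the fish or missed it; the 95% describes how reliable the net-throwing method is.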
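The net analogy can be checked with a quick simulation. The sketch below is illustrative only: the true effect, spread, sample size, and the names `confidence_interval` and `coverage` are all assumptions chosen for the example, and the interval uses a simple normal approximation rather than any particular paper's method.

```python
import random
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, level=0.95):
    """Normal-approximation CI for the mean of `sample`."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    m = mean(sample)
    half_width = z * stdev(sample) / len(sample) ** 0.5
    return m - half_width, m + half_width


def coverage(true_effect=10.0, sd=5.0, n=50, experiments=2000, seed=0):
    """Fraction of simulated experiments whose CI contains the true effect."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(experiments):
        # One "experiment": n noisy measurements around the hidden true effect.
        sample = [rng.gauss(true_effect, sd) for _ in range(n)]
        lower, upper = confidence_interval(sample)
        if lower <= true_effect <= upper:
            hits += 1  # this net caught the fish
    return hits / experiments


print(f"{coverage():.1%} of nets caught the fish")
```

The printed share should land close to 95%, not exactly on it: the "95 out of 100" figure is a long-run average, and the normal approximation is slightly optimistic for small samples.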
Width Matters (Precision)
The width of the net tells you how much the study actually knows.
Scenario A: The Sniper (Narrow CI)
- Result: Drug lowers blood pressure by 10 mmHg.
- 95% CI: [9, 11]
- Diagnosis: This is a precise study. We are confident the true effect is very close to 10. This is useful data.
Scenario B: The Shotgun (Wide CI)
- Result: Drug lowers blood pressure by 10 mmHg.
- 95% CI: [1, 19]
- Diagnosis: This is imprecise. The drug might work a tiny bit (1 mmHg) or a huge amount (19 mmHg). The study is "significant," but it doesn't really pin down the answer.
- Cause: Usually a Sample Size that is too small.
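To see why sample size drives the width of the net, here is a minimal sketch using the textbook normal-approximation formula, half-width = z × SD / √n. The standard deviation of 20 mmHg and the sample sizes are made-up values for illustration, and `ci_half_width` is a hypothetical helper, not something taken from a study.

```python
from math import sqrt
from statistics import NormalDist


def ci_half_width(sd, n, level=0.95):
    """Half-width of a normal-approximation CI for a mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    return z * sd / sqrt(n)


# Same 10 mmHg estimate, same assumed spread, different sample sizes.
for n in (20, 80, 320, 1280):
    hw = ci_half_width(sd=20.0, n=n)
    print(f"n={n:>4}: 95% CI [{10 - hw:.1f}, {10 + hw:.1f}]")
```

Under these assumptions, each fourfold increase in sample size cuts the interval's width in half, which is why moving from a "Shotgun" to a "Sniper" result takes far more participants than intuition suggests.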
The Zero Line (Significance)
The CI also tells you whether the result is statistically significant, just like the p-value: a 95% CI that excludes the "no effect" value generally corresponds to p < 0.05.
- Difference: If the interval crosses 0 (e.g., [-2, +5]), it means the effect could be zero. Not significant.
- Ratio: If the interval crosses 1 (e.g., Risk Ratio [0.8, 1.2]), it means the risk could be the same. Not significant.
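The zero-line rule is simple enough to write down as a check. This is a small sketch; `excludes_null` is a hypothetical helper name, and the only thing that changes between the two cases is the "no effect" value (0 for differences, 1 for ratios).

```python
def excludes_null(lower, upper, null_value):
    """True if the interval does NOT contain the 'no effect' value."""
    return not (lower <= null_value <= upper)


# Differences: "no effect" is 0.
print(excludes_null(-2, 5, null_value=0))     # False: interval crosses 0
print(excludes_null(9, 11, null_value=0))     # True: interval stays above 0

# Ratios (e.g., a Risk Ratio): "no effect" is 1.
print(excludes_null(0.8, 1.2, null_value=1))  # False: interval crosses 1
```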
The Prescription: Read the Brackets
When you read a paper on PaperScores, ignore the p-value for a moment and look at the brackets.
- Is the range narrow? Good. The estimate is precise.
- Check the "Worst Case". Look at the end of the interval closest to zero.
  - If the CI for weight loss is [0.5 lbs, 20 lbs], the drug might only cause 0.5 lbs of weight loss. Is that worth the cost and side effects?
- Wide CI = Low Trust. A wide interval means the data is noisy or the sample is small. Be skeptical of the headline number.
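The "Worst Case" check can be sketched as a tiny helper. Everything here is an assumption for illustration: `worst_case` and `worth_it` are hypothetical names, and the 5 lb threshold for "worth the cost and side effects" is an arbitrary stand-in for a judgment that you, or a clinician, would have to make.

```python
def worst_case(lower, upper):
    """Return the end of the interval closest to zero.

    If the interval contains zero, the worst case is no effect at all.
    """
    if lower <= 0 <= upper:
        return 0.0
    return lower if abs(lower) < abs(upper) else upper


def worth_it(lower, upper, smallest_worthwhile):
    """True if even the worst-case effect clears the chosen threshold."""
    return abs(worst_case(lower, upper)) >= smallest_worthwhile


print(worst_case(0.5, 20))                         # 0.5
print(worth_it(0.5, 20, smallest_worthwhile=5.0))  # False
```

With the weight-loss interval from the example above, the worst case is 0.5 lbs, which falls well short of the assumed 5 lb threshold, so the headline number alone should not settle the question.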