The NNT-Detection Gap

 · 9 min read
 · Nulla Verba

How many people do you need to vaccinate to prevent one serious case of disease?

This number is called the Number Needed to Treat (NNT). It's the most important number for understanding whether a medical intervention is worth it. And it reveals a structural problem with how vaccine safety is evaluated.


What NNT Tells You

If a vaccine prevents one serious illness per 100 doses, the NNT is 100. If it prevents one per 10,000 doses, the NNT is 10,000.

A low NNT means the intervention helps a lot of people. A high NNT means you need to treat many people to help one.

Neither is inherently good or bad. Hand washing has a high NNT (you wash many times before preventing one infection) but the risks are negligible, so it's worth it. A treatment that prevents disease but kills 1 in 100 patients might have a low NNT and still be unacceptable.

The key insight: NNT sets the stakes for safety evidence.

If you need to vaccinate 5,000 people to prevent one death, you need trials large enough to detect if you're also causing one serious harm per 5,000 doses. Otherwise, you can't know if the vaccine does more good than harm.
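The arithmetic behind NNT is one line. A minimal sketch in Python (the function name and the example rates are mine, for illustration):

```python
def nnt(risk_untreated: float, risk_treated: float) -> float:
    """Number Needed to Treat = 1 / absolute risk reduction."""
    return 1 / (risk_untreated - risk_treated)

# A vaccine that eliminates a 1-in-100 risk entirely: NNT = 100.
print(round(nnt(1 / 100, 0)))
# A 90%-effective vaccine against a 1-in-1,000 disease: NNT ≈ 1,111.
print(round(nnt(1 / 1_000, 0.1 / 1_000)))
```

The second case is the one that matters for what follows: when the disease is rare, the NNT is large even for a highly effective vaccine.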


The Detection Problem

Clinical trials can only detect adverse events that occur frequently enough to show up in their sample size.

A trial of 300 people can reliably detect events occurring in roughly 1 in 100 participants. It cannot detect events occurring in 1 in 1,000, because you'd expect to see zero or one such events by chance.

The Rule of 3 gives a rough guide:

Trial size    Likely to notice     Easy to miss
50            1:10 (10%)           Anything rarer
300           1:100 (1%)           Anything rarer
3,000         1:1,000 (0.1%)       Anything rarer
30,000        1:10,000 (0.01%)     Anything rarer
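The table rounds to orders of magnitude. The exact Rule of 3 bound is easy to compute: if zero events are observed among n subjects, the 95% upper confidence bound on the true event rate is about 3/n. A quick sketch:

```python
def rule_of_three_bound(n: int) -> float:
    """95% upper bound on the true event rate when 0 events
    are observed among n subjects: roughly 3/n."""
    return 3 / n

for n in (50, 300, 3_000, 30_000):
    print(f"n={n:>6}: cannot rule out rates rarer than ~1:{n / 3:.0f}")
```

So even a perfectly clean trial of 300 leaves every rate rarer than roughly 1:100 on the table.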

This creates a problem. Most pre-licensure vaccine trials enroll hundreds to low thousands of participants. They're designed to detect common side effects and measure immune response. They're not designed to detect rare serious adverse events.

The problem scales with deployment. A 1-in-100,000 adverse event is invisible in a trial of 1,000. It's still invisible in a trial of 10,000. But vaccinate 10 million people and you get 100 cases. Vaccinate a billion and you get 10,000.

When you target a vaccine narrowly (high-risk groups only), rare events may never appear. When you mass-deploy to entire populations, every rare event that exists will manifest in absolute numbers. The same trial that's adequate for a targeted intervention becomes inadequate for universal policy.
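The scaling in the last two paragraphs is just multiplication, but it's worth seeing laid out (the 1:100,000 rate and the dose counts are the ones from the text):

```python
rate = 1 / 100_000  # an adverse-event rate invisible to any realistic trial

for doses in (1_000, 10_000, 10_000_000, 1_000_000_000):
    print(f"{doses:>13,} doses -> ~{rate * doses:,.0f} expected events")
```

A trial of 10,000 expects a fraction of one event; a billion doses expects 10,000.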


A Concrete Example: Pertussis Vaccine in Pregnancy

Pregnant women in the UK are recommended to receive Tdap (tetanus, diphtheria, pertussis) vaccination to protect their newborns from whooping cough. Studies report it reduces infant pertussis cases by 89% and deaths by about 91%.[2] That's the efficacy claim. But what does the safety evidence look like?

First, we need the NNT.

Using UKHSA 2024 surveillance data[1]:

During the 2024 outbreak (high disease incidence):

  • NNT to prevent one infant case: ~155
  • NNT to prevent one infant death: ~5,800

In a typical non-outbreak year (low disease incidence):

  • NNT to prevent one infant case: ~1,400
  • NNT to prevent one infant death: ~64,000
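Footnote 1 spells out the method behind these numbers: back out the unvaccinated baseline rate from the observed rate, then divide. Here is that calculation as code, with hypothetical inputs (the observed rate below is illustrative, not a UKHSA figure):

```python
def adjusted_nnt(observed_rate: float, coverage: float, ve: float) -> float:
    """Footnote 1's method: Baseline = Observed / (1 - Coverage * VE),
    then NNT = 1 / (Baseline * VE)."""
    baseline = observed_rate / (1 - coverage * ve)
    return 1 / (baseline * ve)

# Illustrative: observed infant case rate of 1 per 500 births,
# 64% maternal coverage, 89% effectiveness against infant cases.
print(round(adjusted_nnt(1 / 500, 0.64, 0.89)))
```

Skipping the baseline correction would overstate the NNT, because the observed rate is already suppressed by existing vaccine coverage.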

Now, what's the largest safety trial for this vaccine in pregnancy?

NCT02377349: 341 pregnant women received the vaccine, 346 received saline placebo. Follow-up for serious adverse events: 2 months post-delivery.

With 341 vaccinated subjects, this trial can reliably detect adverse events occurring at roughly 1 in 50 (2%). That's a coarser threshold than the single-arm Rule of 3 would suggest, because an excess of events must be distinguished from the placebo arm's background rate. It cannot detect anything rarer.


The Gap

Scenario                          NNT        Trial detects   Gap
Outbreak, case prevention         ~155       1:50            3x
Outbreak, death prevention        ~5,800     1:50            116x
Non-outbreak, case prevention     ~1,400     1:50            28x
Non-outbreak, death prevention    ~64,000    1:50            1,280x

The "gap" is the ratio between what the trial can detect and what it would need to detect to match the NNT.

A gap of 116x means the trial could miss adverse events up to 116 times more frequent than the death-prevention threshold, and still show "no signal."

A gap of 1,280x means the trial provides essentially no safety evidence at the relevant threshold. Any serious adverse event occurring between 1:50 and 1:64,000 would be invisible.
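The gap column is just the ratio of the two thresholds. Reproducing the table (scenario names and values from above):

```python
TRIAL_DETECTS = 50  # NCT02377349 can see events at ~1:50 or more common

def gap(nnt: float, trial_detects: float = TRIAL_DETECTS) -> float:
    """Ratio between the rate the trial can detect (1/trial_detects)
    and the rate the NNT requires ruling out (1/NNT)."""
    return nnt / trial_detects

scenarios = {
    "Outbreak, case prevention": 155,
    "Outbreak, death prevention": 5_800,
    "Non-outbreak, case prevention": 1_400,
    "Non-outbreak, death prevention": 64_000,
}
for name, nnt in scenarios.items():
    print(f"{name}: {gap(nnt):,.0f}x")
```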


What This Means in Practice

In a non-outbreak year, you need to vaccinate about 64,000 pregnant women to prevent one infant death from pertussis.

Suppose there's a serious adverse event[4] from the vaccine that occurs at 1 in 1,000. This is undetectable by the trial (which can only see events at 1:50 or more common).

If you vaccinate 64,000 women:

  • Deaths prevented: 1
  • Serious adverse events (at 1:1,000): 64

That's 64 serious harms for 1 death prevented. A proper analysis would weight these by severity: a death is worse than most serious adverse events. But even at 10:1 weighting, 6.4 severity-adjusted harms per death prevented is still net harm. The point isn't the exact ratio. The point is that the trial cannot rule out any rate in this range.
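The severity-weighted version of that arithmetic, with the 1:1,000 rate and the 10:1 weight both hypothetical, as in the text:

```python
def harms_per_death_prevented(nnt_death: int, sae_rate: float,
                              death_weight: float) -> float:
    """Severity-adjusted SAEs per death prevented: vaccinate NNT people
    (preventing ~1 death), count the expected SAEs, divide by the
    weight assigned to a death relative to one SAE."""
    return (nnt_death * sae_rate) / death_weight

# 64,000 vaccinated, hypothetical SAE rate 1:1,000, death weighted 10x:
print(harms_per_death_prevented(64_000, 1 / 1_000, 10))  # 6.4
```

Anything above 1.0 here is net harm on these assumptions, and the trial cannot distinguish any rate in this range from zero.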

At UK scale (~595,000 births/year),[3] the trial can rule out adverse events frequent enough to cause more than 11,900 annual cases (anything more common than 1:50). It cannot rule out 600, or 60, or 10. In a non-outbreak year, you're preventing about 9 deaths.


The Blind Zone

I call the range between what trials can detect and what the NNT requires the "blind zone."

For pertussis vaccine in pregnancy during non-outbreak years:

  • Detectable: Events more common than 1:50
  • NNT threshold: 1:64,000
  • Blind zone: 1:50 to 1:64,000

Any serious adverse event in this range could cause net harm, but would be invisible to the safety trials.

This doesn't mean such events exist. It means we don't have the evidence to rule them out.


This Isn't About Pertussis

I used pertussis as an example because the numbers are documented and the calculations are straightforward. But this pattern, the NNT-detection gap, applies to many vaccines.

Whenever:

  • Disease incidence is low (high NNT)
  • Pre-licensure trials are small (they can only detect common events)
  • The gap between them is large

...the safety evidence cannot support claims about net benefit.

This is a structural feature of how vaccines are evaluated. Pre-licensure trials are designed to show vaccines work (immunogenicity) and to catch common side effects. They are not designed to detect rare serious adverse events at the threshold that matters for individual benefit.


A Caveat: Infectious vs Non-Infectious

NNT for infectious diseases is more complicated than for, say, blood pressure medication.

When you vaccinate against an infectious disease, you don't just protect yourself. You reduce transmission, which protects others who can't be vaccinated or for whom the vaccine fails. This is the herd immunity effect.

The NNT I calculated above measures direct individual benefit only. It doesn't capture the indirect benefit of reduced community transmission. A full accounting would lower the effective NNT.

How much lower? It depends on transmission dynamics, coverage levels, and vaccine effectiveness against transmission (not just disease). These are harder to quantify, and the data are messier.

I focus on individual NNT because it's what the safety trials can be compared against, and it's what matters for individual decision-making. The population-level efficacy case is stronger than the individual NNT suggests. But the gap analysis still applies to the safety question, which is what trials are supposed to answer for your personal risk-benefit calculation.


What "No Evidence of Harm" Actually Means

When regulators say a vaccine showed "no evidence of harm" in trials, check the trial size and the NNT.

If the trial enrolled 500 people and the NNT is 10,000, "no evidence of harm" means: "We didn't see anything in 500 people, but we couldn't have ruled out anything rarer than 1:150, and we'd need to rule out events at 1:10,000 to know if benefits outweigh harms."

That's very different from "safe."


The Regulatory Response

Health Canada's regulatory decision for Boostrix in pregnancy cites DTPA-047 (N=341) and concludes the vaccine is "generally well tolerated."

The UK's SmPC states there is "no vaccine related adverse effect on pregnancy or on the health of the foetus/newborn child."

These statements are based on trials that can detect events at 1:50. The NNT for death prevention is 1:5,800 to 1:64,000.

The gap is 116x to 1,280x.

What about post-marketing surveillance? Can't we learn about rare events after millions of doses? Yes, and post-marketing data can be powerful, especially in active surveillance systems. But it's better at detecting big signals than ruling out small ones. A future post will examine what post-marketing surveillance and large observational studies can and cannot tell us. For now, the point stands: at the moment of licensure, the cleanest causal safety evidence is what trials provide.


What Should We Conclude?

I'm not saying the pertussis vaccine is harmful. I don't know if it is.

I'm saying the evidence cannot tell us whether it does more good than harm for the outcome that matters most (preventing death) in most years (non-outbreak).

For outbreak years and case prevention, the gap is smaller (3x). The evidence provides reasonable, though not complete, reassurance.

For death prevention, especially in non-outbreak years, we're operating on assumption, not evidence.


The Question to Ask

For any vaccine, ask:

  1. What's the NNT? How many people need to be vaccinated to prevent one serious outcome?
  2. What's the largest safety trial? How many subjects, and what could it detect?
  3. What's the gap? Is the trial large enough to detect harms at the NNT threshold?

If the gap is large, "no evidence of harm" tells you very little.

If you spot an error in my reasoning, data, or sources, tell me. I'll correct it publicly.


Next post: What "no evidence of harm" actually means, and the four different things that phrase can refer to.


  1. NNT calculation methodology: Observed 2024 disease rates come from a population already ~64% vaccinated. To get NNT, you need the unvaccinated baseline: Baseline = Observed / (1 - Coverage × VE). Then NNT = 1 / (Baseline × VE). This accounts for the fact that vaccination has already reduced disease incidence from what it would otherwise be. 

  2. The 91% figure for death prevention is unpublished UKHSA data cited in their 2024 surveillance report. The 89% figure for case prevention comes from Amirthalingam 2023, a peer-reviewed study. 

  3. England and Wales 2024: 594,677 live births (ONS). At 1:50 detection threshold: 595,000 ÷ 50 = 11,900 cases ruled out. At 1:1,000: 595 cases. At 1:10,000: 60 cases. Deaths prevented in non-outbreak year: 595,000 ÷ 64,000 NNT ≈ 9. 

  4. "Serious adverse event" (SAE) is a regulatory category, not a severity category. It typically means any event requiring ≥24 hours hospitalization, regardless of long-term outcome. Many conditions people would consider serious (e.g., some neurological injuries) may not meet this threshold. The term can obscure what's actually being measured.