A new cancer drug reduces mortality by 50% — sounds dramatic. But if the baseline risk was 2%, the absolute reduction is only 1 percentage point. Understanding the difference between relative and absolute risk is what separates critical reading of medical evidence from being misled by statistics.
A newspaper headline reads: "New cholesterol drug cuts heart attack risk by 40%." The drug costs $200/month and has moderate side effects in 10% of users.
The actual trial data: in the placebo group, 5 out of every 1,000 patients had a heart attack over 5 years. In the drug group, 3 out of every 1,000 patients had a heart attack over 5 years.
Before reading on:
Q1: The 40% figure is the relative risk reduction. Calculate the absolute risk reduction (the actual difference in risk between the two groups). How does the absolute figure compare to the relative figure in terms of what it means for an individual patient?
Q2: If 1,000 patients take this drug for 5 years, how many heart attacks are prevented? How does this change your assessment of the drug's value?
Connect this concept back to the broader homeostasis and disease framework you have built across the course.
Try this: Enter values into the 2×2 table or load an example, then observe how RR, attributable risk, and odds ratio change. Try RR = 1, RR > 1, and RR < 1 to see how the interpretation shifts.
This calculator demonstrates why relative risk alone can be misleading without absolute risk context.
Relative Risk tells you how many times more likely the exposed group is to develop the outcome. RR = 1 means no association. RR > 1 indicates increased risk. RR < 1 indicates protection. Always interpret RR alongside absolute risk reduction and NNT to understand real-world clinical significance.
Relative risk tells you how much more or less likely an outcome is in one group compared to another — expressed as a ratio. Absolute risk reduction tells you the actual size of that difference in real-world terms. Number needed to treat translates that difference into a clinically meaningful statement about how many patients benefit. All three are needed to evaluate a treatment honestly.
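The 2×2 calculation behind these measures can be sketched in a few lines of Python. This is a minimal illustration, not the lesson's calculator: the function names `two_by_two` and `interpret_rr` are hypothetical, and the demo reuses the heart-attack counts quoted earlier (3 in 1,000 on the drug, 5 in 1,000 on placebo).

```python
def two_by_two(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Summarise a 2x2 table as (relative risk, attributable risk, odds ratio)."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    relative_risk = risk_exposed / risk_unexposed
    # Attributable risk is the simple difference in risk between the groups.
    attributable_risk = risk_exposed - risk_unexposed
    # Odds compare events with non-events rather than events with everyone.
    odds_exposed = exposed_events / (exposed_total - exposed_events)
    odds_unexposed = unexposed_events / (unexposed_total - unexposed_events)
    return relative_risk, attributable_risk, odds_exposed / odds_unexposed


def interpret_rr(relative_risk, tolerance=1e-9):
    """Map a relative risk onto the three interpretations used in the text."""
    if abs(relative_risk - 1.0) <= tolerance:
        return "no association"
    return "increased risk" if relative_risk > 1.0 else "protection"


# Drug group treated as "exposed", placebo group as "unexposed".
rr, ar, odds_ratio = two_by_two(3, 1000, 5, 1000)
print(f"RR = {rr:.2f} ({interpret_rr(rr)})")
print(f"Attributable risk = {ar:.4f}")
print(f"Odds ratio = {odds_ratio:.2f}")
```

When the outcome is rare, as here, the odds ratio and the relative risk are close to each other; they drift apart as the outcome becomes more common.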
Epidemiology showing study types, measures and evaluation
Bradford Hill criteria for establishing causation
An RCT follows 10,000 patients (5,000 statin, 5,000 placebo) for 5 years. Results:
Statin group: 100 heart attacks out of 5,000 = risk of 0.02 (2%)
Placebo group: 150 heart attacks out of 5,000 = risk of 0.03 (3%)
RR = 0.02 ÷ 0.03 = 0.67 — the statin group has 67% of the risk of the placebo group (a 33% lower relative risk).
ARR = 0.03 − 0.02 = 0.01 (1%) — the statin reduces absolute heart attack risk by 1 percentage point over 5 years.
RRR = 0.01 ÷ 0.03 = 33% — the statin reduces relative risk by 33%.
NNT = 1 ÷ 0.01 = 100 — 100 patients must take the statin for 5 years to prevent one additional heart attack.
Interpretation: A headline saying "statins reduce heart attack risk by 33%" is technically accurate (RRR) but can be misleading — the absolute reduction is only 1%. Whether NNT = 100 is acceptable depends on the drug's cost, side effects, and the severity of the outcome prevented. For a condition as serious as heart attack, NNT = 100 may well be worthwhile. For a minor condition, it may not be.
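The statin calculation above can be checked with a short Python sketch. The helper name `treatment_effect` and its dictionary keys are assumptions made for illustration; the counts are the ones given in the worked example.

```python
def treatment_effect(treated_events, treated_total, control_events, control_total):
    """Return the risks, RR, ARR, RRR and NNT for a two-arm trial."""
    risk_treated = treated_events / treated_total
    risk_control = control_events / control_total
    arr = risk_control - risk_treated
    return {
        "risk_treated": risk_treated,
        "risk_control": risk_control,
        "rr": risk_treated / risk_control,
        "arr": arr,
        "rrr": arr / risk_control,
        # NNT only makes sense when the treated group does better;
        # otherwise this sketch reports infinity.
        "nnt": 1 / arr if arr > 0 else float("inf"),
    }


# Statin trial from the worked example: 100/5,000 vs 150/5,000 heart attacks.
statin = treatment_effect(100, 5000, 150, 5000)
for name, value in statin.items():
    print(f"{name}: {value:.4f}")
```

Keeping RR, ARR, RRR and NNT in one result makes it harder to quote the relative figure without the absolute one.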
Relative risk amplifies small effects in low-risk populations. "This supplement reduces cancer risk by 50%" sounds impressive — but if the baseline risk is 0.002% (2 in 100,000), a 50% reduction means an absolute reduction from 0.002% to 0.001%. NNT would be 100,000 — you would need to treat 100,000 people to prevent one cancer. The relative figure is truthful but decontextualised from real-world importance.
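Conversely, an ARR of 5% (NNT = 20) represents an effective treatment: for every 20 people treated, one extra bad outcome is prevented. As a rough rule of thumb in clinical medicine, NNT values below 10 are considered highly effective, 10–100 moderately effective, and above 100 marginal, although the acceptable NNT always depends on the severity of the outcome prevented and on the treatment's cost and side effects.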
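To see how the same relative risk reduction plays out at different baseline risks, the sketch below holds RRR at 50% and varies the baseline. The function names are hypothetical, the 10% and 1% baselines are illustrative values rather than trial data, and the bands simply encode the NNT ranges quoted above.

```python
def nnt_from_baseline(baseline_risk, rrr):
    """NNT implied by a baseline risk and a relative risk reduction."""
    arr = baseline_risk * rrr  # absolute reduction = baseline x relative reduction
    return 1 / arr


def nnt_band(nnt):
    """Classify an NNT using the rough bands quoted in the text."""
    if nnt < 10:
        return "highly effective"
    if nnt <= 100:
        return "moderately effective"
    return "marginal"


# Same 50% relative reduction, three different baseline risks.
# 0.00002 is the 0.002% (2 in 100,000) supplement example from the text.
for baseline in (0.10, 0.01, 0.00002):
    nnt = nnt_from_baseline(baseline, rrr=0.5)
    print(f"baseline {baseline:.5%}: NNT about {nnt:,.0f} ({nnt_band(nnt)})")
```

The headline "50% reduction" is identical in every row; only the baseline changes, and the NNT changes with it.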
A survival curve (Kaplan-Meier plot) shows the proportion of a study population that has not yet experienced the primary outcome (often death, but also disease recurrence, hospitalisation, or other events) over time. They appear in almost every major clinical trial and in many HSC exam questions about epidemiological data.
A melanoma trial shows two survival curves over 5 years. The immunotherapy group starts at 1.0 and falls gradually to 0.52 (52% surviving at 5 years). The chemotherapy group starts at 1.0 and falls more steeply to 0.28 (28% surviving at 5 years). The curves diverge from 6 months onward.
What can you conclude:
At 5 years, 52% of immunotherapy patients were still alive vs 28% of chemotherapy patients — a difference of 24 percentage points (absolute difference in 5-year survival).
The curves diverge from 6 months — suggesting immunotherapy benefit begins early and increases over time. This divergence pattern is consistent with immunotherapy's mechanism (stimulating durable immune responses that continue killing cancer cells).
You cannot conclude that all immunotherapy patients will survive long-term: the curves show that 48% of immunotherapy patients also died within 5 years. What the study shows is that immunotherapy nearly doubled the proportion surviving at 5 years compared to chemotherapy (52% vs 28%, a ratio of about 1.9).
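For readers curious how the points on a Kaplan-Meier curve are produced, here is a minimal from-scratch sketch in Python. The ten follow-up times and outcomes are invented for illustration and are not the melanoma trial data; `kaplan_meier` is a hypothetical helper, not a library function.

```python
def kaplan_meier(times, events):
    """Return the curve as a list of (time, proportion event-free) steps.

    times  -- follow-up time for each patient
    events -- True if the outcome occurred at that time, False if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = [(0, 1.0)]
    for t in sorted(set(times)):
        outcomes = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)
        if outcomes:
            # Multiply by the fraction of at-risk patients who got past time t.
            survival *= 1 - outcomes / at_risk
            curve.append((t, survival))
        # Censored patients leave the risk set without lowering the curve.
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months for ten patients (assumed data).
months = [6, 12, 18, 24, 30, 36, 48, 60, 60, 60]
had_event = [True, True, False, True, False, True, True, False, False, False]

for month, proportion in kaplan_meier(months, had_event):
    print(f"month {month:>2}: {proportion:.3f} event-free")
```

In this made-up data five of the ten patients never had the outcome recorded, yet the estimate at 60 months is below 0.5, because the patients censored at 18 and 30 months are treated as having unknown outcomes rather than as survivors.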
Try this: Select each study description card, then place it in the correct study type bin. Check your answers when all six cards are placed.
Recognising study designs from their description is an essential HSC skill tested directly in exam questions.
Cohort studies follow exposed groups forward in time. Case-control studies compare past exposures between cases and controls. Cross-sectional studies measure exposure and disease at one time point. Recognising these designs from their description is essential for evaluating epidemiological evidence.
In medicine and public health, evidence is graded by quality. Evidence from a single patient case report is informative but cannot establish general truths. Evidence from a well-conducted systematic review of dozens of RCTs provides the most reliable basis for clinical decisions. Understanding this hierarchy allows you to evaluate claims critically — and to recognise when media reports cherry-pick weak evidence to make strong claims.
| Level | Study type | Strength | Limitation | Example |
|---|---|---|---|---|
| 1 (strongest) | Systematic review and meta-analysis of RCTs | Pools results of multiple high-quality trials; greatest statistical power; controls for individual study quirks | Quality depends on quality of included studies; publication bias can distort results | Cochrane review of statin trials |
| 2 | Single well-designed RCT | Randomisation controls confounders; establishes causation | May not generalise to all populations; can be underpowered | UKPDS trial — metformin for Type 2 diabetes |
| 3 | Cohort study | Prospective; establishes temporal sequence; large populations possible | Observational — cannot control all confounders | Nurses' Health Study — diet and cancer |
| 4 | Case-control study | Efficient for rare diseases; retrospective | Recall bias; cannot establish incidence | Case-control study of HPV and cervical cancer |
| 5 | Cross-sectional study | Cheap; generates hypotheses | Cannot establish temporal sequence | National Health Survey — diet and diabetes prevalence |
| 6 (weakest) | Case report / expert opinion | Identifies novel phenomena; hypothesis-generating | No comparison group; no statistical analysis; highly susceptible to bias | "Patient who ate X recovered from Y" |
A p-value below 0.05 (the conventional threshold for 'statistical significance') means that, if the null hypothesis (no effect) were true, a result at least as extreme as the one observed would occur by chance less than 5% of the time. It does NOT mean the effect is clinically important. With very large sample sizes, even trivially small differences become statistically significant.
Example: A study of 500,000 patients finds that a new drug reduces blood pressure by an average of 0.3 mmHg compared to placebo (p = 0.001 — highly statistically significant). A 0.3 mmHg reduction in blood pressure is clinically meaningless — no patient would benefit detectably from such a small change. The study found a real effect, but not a useful one. Statistical significance tells you whether an effect exists; clinical significance (effect size, ARR, NNT) tells you whether it matters.
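The effect of sample size on the p-value can be sketched with a simple two-sample z-test. This is an illustration under stated assumptions: the standard deviation of 15 mmHg and the equal split into two arms are invented, so the p-values printed will not match the figure quoted in the example, and `two_sided_p` is a hypothetical helper.

```python
import math


def two_sided_p(mean_difference, sd, n_per_group):
    """Two-sided p-value for a difference in means (normal approximation)."""
    standard_error = sd * math.sqrt(2 / n_per_group)
    z = abs(mean_difference) / standard_error
    # For a standard normal variable, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))


ASSUMED_SD = 15.0  # assumed spread of blood pressure in mmHg, not from the study

# The same 0.3 mmHg difference tested in a small and a very large trial.
for n in (200, 250_000):
    p = two_sided_p(0.3, ASSUMED_SD, n)
    print(f"{n:>7} patients per group: p = {p:.3g}")
```

The effect size is the same in both runs. Only the sample size changes, which is why a small p-value on its own says nothing about whether 0.3 mmHg matters to a patient.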
The HSC regularly asks students to evaluate study quality. This is not about finding flaws for the sake of it — it is about identifying what a study can and cannot establish, so that claims based on its results can be appropriately qualified.
Study: A 6-week RCT of 200 patients found that a new anti-inflammatory drug reduced self-reported knee pain by 35% more than placebo (p = 0.03). The study was single-blind (patients did not know which group they were in, but researchers did). Patients with severe kidney disease were excluded.
Strengths: RCT design — randomisation controls for most confounders. Appropriate study design for testing a new treatment.
Limitations to note: (1) Single-blind — researchers who knew treatment allocation could unconsciously bias their assessments of patient-reported pain (assessment bias). Double-blinding would be stronger. (2) 6 weeks is short — many musculoskeletal conditions improve spontaneously over 6 weeks (regression to the mean). A longer trial would be more convincing. (3) Self-reported pain is subjective — placebo effect is substantial for pain outcomes even with blinding. (4) Excluded severe kidney disease patients — results may not generalise to this group who may have different drug metabolism. (5) p = 0.03 is statistically significant but close to the threshold — with a small sample (200), there is more risk this reflects sampling variation.
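Limitation (5) turns on sampling variation in a small trial. One way to make that concrete is a 95% confidence interval for a difference in risk, sketched below using the simple Wald formula. The counts are hypothetical and are not taken from the knee-pain study, and `risk_difference_ci` is an illustrative name.

```python
import math


def risk_difference_ci(treated_events, treated_total,
                       control_events, control_total, z=1.96):
    """Return (difference, lower, upper) for control risk minus treated risk."""
    p_treated = treated_events / treated_total
    p_control = control_events / control_total
    difference = p_control - p_treated
    # Wald standard error: adequate for a sketch, rough for very small counts.
    standard_error = math.sqrt(
        p_treated * (1 - p_treated) / treated_total
        + p_control * (1 - p_control) / control_total
    )
    return difference, difference - z * standard_error, difference + z * standard_error


# Same 30% vs 40% event rates in a small and a large hypothetical trial.
for label, counts in (("40 per group", (12, 40, 16, 40)),
                      ("4,000 per group", (1200, 4000, 1600, 4000))):
    diff, low, high = risk_difference_ci(*counts)
    print(f"{label}: ARR = {diff:.3f}, 95% CI {low:.3f} to {high:.3f}")
```

With 40 patients per group the interval spans zero, so the data are compatible with no benefit as well as with a large one. With 4,000 per group the same observed difference gives a much narrower interval.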
The Heart Protection Study (HPS), published in 2002, was one of the largest cardiovascular trials ever conducted: 20,536 patients with existing cardiovascular disease or at high risk of it, followed for 5 years. It found that simvastatin reduced the relative risk of major vascular events (heart attacks, strokes, revascularisation procedures) by about 24% compared to placebo.
The headline figure — 24% relative risk reduction — was used extensively to promote statin prescribing. But the absolute figures were equally important: the event rate fell from approximately 25.2% in the placebo group to 19.8% in the statin group — an ARR of 5.4 percentage points, giving an NNT of approximately 19 over 5 years. This means treating 19 high-risk patients with simvastatin for 5 years prevents one additional major vascular event.
For high-risk patients with existing cardiovascular disease, NNT = 19 is considered highly clinically significant — statins rapidly became standard of care for this group. But when the same relative risk reduction (24%) was applied to lower-risk primary prevention populations (people without existing CVD), the absolute event rate in the placebo group was much lower (~5% over 5 years), producing an ARR of only ~1.2% and an NNT of ~83. The same drug, the same relative risk reduction, but very different absolute benefit — which is why prescribing decisions for primary prevention are more nuanced than for secondary prevention. This is precisely why NNT matters.
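The two NNT figures in this case study follow directly from the numbers quoted. A short Python check, with hypothetical helper names, using only the approximate rates given above:

```python
def nnt_from_risks(control_risk, treated_risk):
    """NNT from the observed event rates in the two arms."""
    return 1 / (control_risk - treated_risk)


def nnt_from_rrr(baseline_risk, rrr):
    """NNT when a relative risk reduction is applied to a baseline risk."""
    return 1 / (baseline_risk * rrr)


# Secondary prevention: event rates quoted for the high-risk HPS population.
secondary = nnt_from_risks(0.252, 0.198)

# Primary prevention: the same 24% RRR applied to a ~5% five-year baseline.
primary = nnt_from_rrr(0.05, 0.24)

print(f"Secondary prevention NNT: about {secondary:.1f}")
print(f"Primary prevention NNT:   about {primary:.1f}")
```

The drug and the relative effect are unchanged between the two lines; the roughly fourfold difference in NNT comes entirely from the baseline risk.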
"A large relative risk reduction means the treatment is very effective." — Relative risk reduction must always be considered alongside the absolute baseline risk. A 50% relative risk reduction from 0.002% to 0.001% (ARR = 0.001%, NNT = 100,000) is far less clinically meaningful than a 25% relative risk reduction from 20% to 15% (ARR = 5%, NNT = 20).
"A p-value below 0.05 means the result is important." — Statistical significance (p < 0.05) means the result is unlikely to be due to chance. It says nothing about clinical importance. With large enough samples, trivially small and meaningless differences become statistically significant. Always assess clinical significance (effect size, ARR, NNT) alongside statistical significance.
"A survival curve that falls to zero means all patients died." — A survival curve that reaches zero means all patients in the study eventually experienced the primary outcome within the follow-up period. However, studies usually end before all participants experience the outcome — a curve that plateaus does not mean those patients are cured; it means the follow-up period ended. Censoring marks indicate patients who left the study before the outcome — their fate is unknown, not assumed to be survival.
"A systematic review is just a literature review." — A systematic review uses pre-specified, reproducible methods to identify and critically appraise ALL relevant studies on a question, minimising selection bias in which studies are included. A narrative (regular) literature review is selective — the author chooses which studies to discuss, which can introduce bias. Meta-analysis within a systematic review statistically pools study results. These methodological distinctions place systematic reviews at the top of the evidence hierarchy.
"If a study found no effect (p > 0.05), the treatment definitely doesn't work." — A non-significant result means there was insufficient evidence to reject the null hypothesis — not that the null hypothesis is true. A small study may be underpowered to detect a real but modest effect. The absence of evidence is not evidence of absence. Confidence intervals around the null result tell you more than the p-value alone — a wide confidence interval crossing zero indicates uncertainty, not definitive null effect.
Image Slot 1: Side-by-side comparison of relative risk reduction vs absolute risk reduction using two scenarios: (A) High-risk population — 20% vs 15% event rate — ARR = 5%, NNT = 20; (B) Low-risk population — 0.4% vs 0.3% event rate — ARR = 0.1%, NNT = 1000. Both have the same RRR = 25%. Visual should show the same headline ("25% risk reduction") but starkly different real-world impact.
Image Slot 2: Annotated Kaplan-Meier survival curve with labels for: y-axis (proportion event-free), x-axis (time in months/years), two diverging lines (treatment vs control), the gap between lines at specific time points (annotated with values), censoring tick marks, and a shaded region showing the difference in 5-year survival. Arrows pointing to key features with explanatory notes.
1 A clinical trial tests a new Type 2 diabetes drug in 2,000 patients. After 3 years:
| Group | Patients | Progressed to T2D |
|---|---|---|
| Drug group | 1,000 | 60 |
| Placebo group | 1,000 | 100 |
Calculate: (a) Risk in each group; (b) Relative Risk (RR); (c) Absolute Risk Reduction (ARR); (d) Relative Risk Reduction (RRR); (e) Number Needed to Treat (NNT). Then interpret what NNT means in plain language.
2 A melanoma immunotherapy trial produces the following 5-year survival data: Immunotherapy group — 55% surviving at 5 years. Chemotherapy group — 25% surviving at 5 years. The curves diverge from month 4 and continue to separate throughout follow-up. (a) Calculate the absolute difference in 5-year survival. (b) Interpret what the diverging curves suggest about how immunotherapy works over time. (c) State two limitations of this data that prevent you from concluding immunotherapy cures melanoma.
3 A pharmaceutical company conducts a 12-week RCT of a new antidepressant. 120 patients with moderate depression are recruited from a single clinic. 60 receive the drug, 60 receive placebo. The study is single-blind (patients do not know which group they are in, but clinicians do). The primary outcome is a 50% reduction in a self-reported depression scale score. Results: 42% of drug patients achieved the primary outcome vs 28% of placebo patients (p = 0.04). The company concludes: "This drug is significantly more effective than placebo for treating depression."
(a) Calculate the ARR and NNT for this study. (b) Identify two methodological limitations and explain how each could affect the conclusions. (c) Evaluate whether the company's conclusion is fully justified by the data. (d) What would need to happen before this drug could be recommended for widespread prescribing?
1. In a vaccine trial, 2% of the unvaccinated group developed the disease compared to 0.5% of the vaccinated group. What is the Number Needed to Vaccinate (equivalent to NNT) to prevent one case?
2. A Kaplan-Meier survival curve shows two lines that converge after year 3, having diverged in years 1–3. What does the convergence most likely indicate?
3. A very large study of 1 million people finds that people who drink 3+ cups of coffee per day have a statistically significantly lower rate of Type 2 diabetes (p = 0.001, RR = 0.97). The ARR is 0.3%. Which statement best evaluates this finding?
4. Why are systematic reviews and meta-analyses considered the highest level of evidence for treatment efficacy?
5. A 4-week trial of a new pain medication finds a statistically significant reduction in pain scores (p = 0.02, NNT = 8). However, the trial was single-blind, recruited only 80 patients from one hospital, excluded patients over 70, and had a 25% dropout rate. A doctor argues: "The NNT of 8 is excellent — I should prescribe this drug." Evaluate this reasoning.
6. A newspaper headline reads: "New cancer drug slashes tumour recurrence by 45%." The underlying trial data shows: recurrence rate in placebo group = 20%; recurrence rate in drug group = 11%. Calculate the absolute risk reduction and NNT for this drug. Then explain why the headline's "45%" figure, while mathematically accurate, could mislead a patient trying to understand their personal benefit from the drug. 4 MARKS
7. A researcher presents survival curve data from a lung cancer trial showing that a new targeted therapy group has significantly better 3-year survival than standard chemotherapy (60% vs 35%, p < 0.001). A colleague argues: "These results prove the targeted therapy should immediately replace chemotherapy for all lung cancer patients." Evaluate this claim by discussing what the survival curve data does and does not show, and what additional information is needed before making the recommendation. 5 MARKS
8. "A single well-designed RCT showing a positive result is sufficient to change clinical practice." Evaluate this claim by discussing the strengths and limitations of individual RCTs, the role of replication and systematic review, and when it might be appropriate to act on a single trial versus waiting for more evidence. 6 MARKS
Return to your Think First responses at the start of this lesson.
1. T2D drug trial. (a) Risk (drug) = 60/1000 = 0.06 (6%); Risk (placebo) = 100/1000 = 0.10 (10%). (b) RR = 0.06 ÷ 0.10 = 0.60. The drug group has 60% of the risk of the placebo group — i.e. 40% lower relative risk. (c) ARR = 0.10 − 0.06 = 0.04 (4%). (d) RRR = 0.04 ÷ 0.10 = 0.40 = 40%. (e) NNT = 1 ÷ 0.04 = 25. Plain language: to prevent one extra person progressing to Type 2 diabetes over 3 years, 25 patients with insulin resistance must take this drug for 3 years. Whether this is clinically worthwhile depends on the drug's cost, side effects, and the severity/cost of Type 2 diabetes if it develops. Given the serious long-term complications of T2D (blindness, kidney failure, cardiovascular disease), NNT = 25 over 3 years could well be considered clinically meaningful.
2. Melanoma survival curve. (a) Absolute difference in 5-year survival = 55% − 25% = 30 percentage points: for every 100 patients treated, 30 more were alive at 5 years in the immunotherapy group than in the chemotherapy group. (b) The curves diverge from month 4 and continue to separate throughout follow-up. This diverging pattern (rather than parallel or converging) suggests the treatment benefit grows over time, consistent with immunotherapy's mechanism: it stimulates the patient's own immune system to recognise and kill melanoma cells. This immune response, once established, can continue killing cancer cells and providing durable disease control even after the treatment has been given. This contrasts with chemotherapy, which kills cancer cells directly but does not establish immunological memory. (c) Limitation 1: the follow-up is only 5 years, so we cannot conclude whether immunotherapy confers long-term or permanent benefit; curves may converge after 5 years as patients relapse. Limitation 2: 45% of immunotherapy patients also died within 5 years, so the drug clearly does not 'cure' all patients; it prolongs survival for a proportion. The study shows improved survival probability, not a cure. Additionally: no information is given about side effects, patient selection criteria, or whether results apply to all melanoma subtypes (e.g. BRAF-mutated vs non-mutated).
3. Antidepressant trial evaluation. (a) ARR = 42% − 28% = 14 percentage points; NNT = 1 ÷ 0.14 ≈ 7. For every 7 patients treated with this drug, one extra patient achieves a meaningful reduction in depression symptoms compared to placebo. NNT = 7 is clinically meaningful for a condition as disabling as depression. (b) Limitation 1, single-blind design: clinicians who know which patients are in the drug group may unconsciously rate those patients more favourably on depression scales (assessment bias). For a subjective outcome like depression (assessed by clinician interview or self-report), this is a significant limitation. Double-blinding (where both patients and assessors are unaware of treatment allocation) would produce more reliable results. Limitation 2, small sample from one clinic: 120 patients from a single clinic is a relatively small, geographically limited sample. The patient population at one clinic may not be representative of all patients with moderate depression, who differ in age, ethnicity, comorbidity and medication history. Single-site studies are also more susceptible to local biases in patient selection and management. (c) Evaluation of the company's conclusion: the conclusion is partially justified but overstated. p = 0.04 is statistically significant and NNT = 7 is clinically meaningful, so there is reasonable evidence of a real short-term effect. However, 'significantly more effective' is too broad a claim based on this single study because: (1) the single-blind design introduces potential bias in outcome measurement; (2) 12 weeks is a short follow-up; depression is often a long-term condition, and a short trial may capture only an initial response or natural fluctuation rather than sustained benefit; (3) the trial compared drug vs placebo and did not compare with existing antidepressants, so relative advantage over standard treatment is unknown; (4) the report does not state how many patients dropped out or whether the analysis was by intention-to-treat; if patients who left because of side effects or lack of efficacy were excluded from the analysis, the results would be biased in the drug's favour. (d) Before widespread prescribing: a larger (500+ patient), multi-site, double-blind RCT with longer follow-up (6–12 months minimum); comparison against existing first-line antidepressants (not just placebo); independent replication by researchers without financial ties to the company; safety data on longer-term use; intention-to-treat analysis accounting for all dropouts; ideally inclusion in a systematic review or meta-analysis.
1. C — ARR = 2% − 0.5% = 1.5% = 0.015. NNT = 1 ÷ 0.015 = 66.7 ≈ 67. Option A incorrectly uses the relative risk ratio directly. Option B uses the RRR (75%) as if it were the NNT denominator. Option D uses an incorrect calculation.
2. B — Converging curves after year 3 mean the survival advantage established in years 1–3 is diminishing — both groups are experiencing similar event rates after year 3. This could mean the treatment benefit doesn't persist, or responders have already been identified and those remaining have similar prognosis regardless of treatment. Option A is wrong — convergence doesn't necessarily mean harm. Option C is wrong — convergence means similar rates, not zero survival. Option D is wrong — convergence is a data observation, not an extrapolation artefact.
3. D — RR = 0.97 = only a 3% relative risk reduction. ARR = 0.3%, NNT ≈ 333. The p = 0.001 result is almost certainly driven by the massive sample size (1 million) detecting a trivially small real effect. Clinical significance is negligible. Option A mistakes statistical significance for clinical importance. Option B ignores the tiny effect size and observational design. Option C misinterprets RR close to 1.0 — this doesn't disprove a link but shows the link, if real, is tiny.
4. A — Systematic reviews use pre-specified methods to minimise selection bias and pool results for greater power. Option B is wrong — researcher experience is not the basis for the hierarchy. Option C is wrong — systematic reviews typically focus on high-quality studies and assess quality explicitly. Option D is wrong — large combined samples increase power but don't guarantee significance, and the value is in quality not just size.
5. B — NNT = 8 is promising but the trial limitations (single-blind, 80 patients, one site, exclusion of over-70s, 25% dropout) substantially reduce confidence in the result. The doctor's reasoning acknowledges the NNT but ignores the methodological caveats. Option A ignores the limitations. Option C is wrong — 4-week trials can produce valid evidence for acute conditions. Option D overstates the impact of dropout — it is a limitation, not an automatic invalidation, especially if an intention-to-treat analysis was done.
Q6 (4 marks): ARR = 20% − 11% = 9 percentage points (0.09) [1 mark]. NNT = 1 ÷ 0.09 = 11.1 ≈ 11. For every 11 patients treated with this drug, one extra tumour recurrence is prevented [1 mark]. Verification of headline: RRR = 9% ÷ 20% = 45% — confirming the headline figure is the relative risk reduction. Why the headline misleads: the 45% figure is a relative measure — it expresses the reduction as a proportion of the baseline risk (20%). A patient reading "slashes risk by 45%" is likely to interpret this as their personal risk dropping by 45 percentage points — e.g. from 20% to near zero — which is incorrect. The actual personal risk drops from 20% to 11% — a 9 percentage point absolute reduction [1 mark]. The NNT of 11 means that out of 11 patients treated, 10 experience the same outcome regardless of treatment. Only 1 in 11 patients benefits specifically from the drug vs placebo. This is still a clinically useful NNT — but the patient's understanding of benefit is accurately framed as "an 11 in 100 chance rather than 20 in 100" not "45% better" [1 mark — 4 marks total].
Q7 (5 marks): What the survival curve data shows: at 3 years, 60% of targeted therapy patients remained alive compared to 35% of chemotherapy patients, an absolute difference of 25 percentage points. This is both statistically significant (p < 0.001) and clinically meaningful. The proportion surviving to 3 years was about 1.7 times higher with the targeted therapy, which is a substantial improvement for a disease as serious as lung cancer [1 mark]. What the data does NOT show: (1) Survival beyond 3 years: the trial follow-up is 3 years, so we do not know if the curves converge later, whether all responders eventually relapse, or what the 5-year survival rates are. (2) The side effect and quality-of-life profile of the targeted therapy: a treatment with dramatically better survival but severe chronic toxicity may not be preferable for all patients. (3) Whether these results apply to all lung cancer patients: targeted therapies (e.g. EGFR inhibitors, ALK inhibitors) typically only benefit patients whose tumours carry specific genetic mutations. The trial may have enrolled a mutation-selected population, making results non-generalisable to unselected patients [2 marks]. Additional information needed: (1) mutation profiling to show which patients carry the targetable mutation; (2) longer follow-up data (5-year, 10-year survival curves); (3) toxicity data to compare quality-adjusted survival; (4) cost-effectiveness data, since targeted therapies are typically very expensive; (5) head-to-head comparison in both mutation-positive and mutation-negative populations to define which patients actually benefit [1 mark].
Conclusion: the claim that this therapy should "immediately replace chemotherapy for all lung cancer patients" is not supported — the 3-year survival data is promising and justifies prioritising this therapy for mutation-positive patients, but widespread adoption across all patients requires evidence of benefit in unselected populations, longer-term data, and consideration of toxicity and cost [1 mark — 5 marks total].
Q8 (6 marks): Strengths of individual RCTs: randomisation distributes known and unknown confounders equally, establishing that any observed difference between groups is due to the intervention rather than pre-existing differences. Double-blinding reduces both performance bias (participants behaving differently if they know their allocation) and detection bias (assessors rating outcomes differently). A well-powered, well-designed RCT with a pre-specified primary outcome is the strongest single study for establishing efficacy and causation of benefit [1.5 marks]. Limitations of individual RCTs: (1) Chance variation: if the treatment truly has no effect, a trial tested at the p < 0.05 threshold will still produce a false positive result about 5% of the time, so a single positive trial may reflect sampling variation. (2) Publication bias: negative RCTs are less frequently published, so the literature may systematically overestimate treatment efficacy if only positive trials are reported. (3) Limited generalisability: trials often use narrow patient eligibility criteria (excluding elderly, multi-morbid, pregnant patients) that may not reflect real-world prescribing populations. (4) Potentially underpowered: a small RCT may miss a real effect, and a statistically significant result in a small trial or in a single subgroup is more likely to be a chance finding [2 marks]. Role of systematic review and replication: systematic reviews pool results from multiple independent RCTs, dramatically increasing statistical power to detect true effects while averaging out chance findings in individual trials. Pre-specified inclusion criteria minimise selection bias. Publication bias assessment (e.g. funnel plots) partially addresses the overestimation of positive results. If an effect is genuine, it should appear consistently across multiple trials in different populations; consistency is a key Bradford Hill criterion for causation [1.5 marks].
When single trial may justify action vs when more evidence is needed: a single RCT may appropriately change practice when: the disease is severe and life-threatening with no existing effective treatment; the trial is large, well-powered, double-blinded, and shows a very large effect size (NNT very low); the biological mechanism is clearly understood; and the result is biologically plausible. It may be appropriate to wait for replication when: existing effective treatments are available; the effect size is modest; the trial population is narrow; there are concerns about funding bias; or the disease is relatively minor and the drug has significant side effects [1 mark — 6 marks total].