How to Read a Research Study, Part 2 of 7
The A-to-D evidence grades
How Grace classifies every clinical claim, why each tier earns the qualifier it carries, and why Grade C is the tier most often confused with Grade A.
The taxonomy in one picture
GNL uses an A-to-D taxonomy across all clinical content. A claim's grade is determined by three things: study design, risk of bias, and replication. The pyramid below is the at-a-glance version; the four sections that follow set out what sits in each tier and the qualifier Grace surfaces alongside any claim drawn from that tier.
Grade A, the load-bearing tier
Three categories of evidence earn the top tier:
- Well-conducted systematic reviews with low risk of bias across included studies (Cochrane Handbook methodology). These draw on multiple independent trials, with meta-analytic or network meta-analytic confidence intervals that do not cross the line of no effect.
- Pivotal randomised controlled trials independently replicated, registered before enrolment, reported per CONSORT, with effect sizes that survive sensitivity analysis.
- International consensus guidelines (ADA, EASD, NICE, ISPAD, IDF) that synthesise the above with named methodology.
Grace uses Grade A claims without caveat.
“Systematic reviews seek to collate evidence that fits pre-specified eligibility criteria in order to answer a specific research question. They aim to minimize bias by using explicit, systematic methods documented in advance with a protocol.”
Source: Cochrane Handbook for Systematic Reviews of Interventions v6.5 (2024), Chapter I, Key Points. The “pre-specified” and “documented in advance” clauses are load-bearing; a “systematic review” without a registered protocol is a literature review with delusions of grandeur.
Grade B, the working tier
Two categories of evidence sit one rung down from the top:
- Single well-conducted RCTs before independent replication.
- Large prospective cohort studies with pre-specified hypotheses and low loss to follow-up.
Grade B claims earn an explicit single-trial qualifier (“the trial showed X, replication pending”). Grace always surfaces them with that qualifier.
Grade C, the qualified tier
Four categories of evidence sit at the qualified tier:
- Retrospective cohort and case-control studies with adjustment for known confounders.
- Mechanistic studies in humans that establish the why but do not quantify the clinical effect.
- Real-world evidence without comparator, useful for hypothesis generation, not for recommendation.
- Industry pilot or pivotal trials that have NOT been independently replicated. This is the category most often confused with Grade A.
Grade C claims earn the line “evidence base is moderate; recommendation should be made jointly with the diabetes care team”. Grace uses Grade C claims for context, never for action.
Grade C is the tier most often confused with Grade A. Industry pilot or pivotal trials without independent replication live here, not at the top of the pyramid.
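The boundary between Grades A, B and C for trials can be sketched in a few lines of Python. This is a minimal illustration of one reading of the rules above, not Grace's actual implementation. The `Trial` fields and the `grade_trial` function are hypothetical names, the `well_conducted` flag collapses pre-registration, CONSORT reporting and sensitivity analysis into a single assumption, and any case the text does not cover returns `None` rather than a guessed grade.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Trial:
    """Hypothetical description of a single trial (illustrative fields only)."""
    randomised: bool
    well_conducted: bool            # assumption: registered, CONSORT-reported, robust
    independently_replicated: bool
    industry_run: bool


def grade_trial(trial: Trial) -> Optional[str]:
    """Return "A", "B" or "C" for a trial, or None if the text gives no rule."""
    # Industry pilot or pivotal trials without independent replication sit at
    # Grade C, however strong the headline result looks.
    if trial.industry_run and not trial.independently_replicated:
        return "C"
    # A well-conducted RCT is Grade A once independently replicated,
    # and Grade B while replication is still pending.
    if trial.randomised and trial.well_conducted:
        return "A" if trial.independently_replicated else "B"
    # Everything else is outside the scope of this sketch.
    return None


vendor_pivotal = Trial(
    randomised=True,
    well_conducted=True,
    independently_replicated=False,
    industry_run=True,
)
print(grade_trial(vendor_pivotal))  # C
```

Checking the industry rule first is a deliberate choice in this sketch: it means a randomised, well-run vendor trial still lands at Grade C until someone independent replicates it, which is the confusion the tier is meant to guard against.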
Grade D, the educated-opinion tier
Four categories of evidence sit at the educated-opinion tier:
- Expert opinion without underlying systematic synthesis.
- Case series and case reports.
- Mechanistic studies in animal models or in vitro pending human evidence.
- Educational synthesis (the AID Optimiser ladder, for instance, is a Grade D synthesis on a Grade A/B evidence base).
Grade D claims are always paired with the line that they are educational only, that the underlying evidence has not yet quantified the effect for a person, and that the diabetes care team is the right place to take the decision.
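The pairing of each grade with its qualifier can be sketched as a simple lookup. The snippet below is an illustration under stated assumptions, not the way Grace is built: `QUALIFIERS` and `surface` are hypothetical names, the Grade B and Grade C strings are taken from the text above, and the Grade D string is a paraphrase because the exact wording is not given.

```python
from typing import Dict, Optional

# Qualifier attached to a claim at each grade. None means "no caveat".
QUALIFIERS: Dict[str, Optional[str]] = {
    "A": None,
    "B": "the trial showed X, replication pending",
    "C": (
        "evidence base is moderate; recommendation should be made "
        "jointly with the diabetes care team"
    ),
    # Paraphrased: the source describes this line but does not quote it.
    "D": (
        "educational only; the underlying evidence has not yet quantified "
        "the effect for a person; take the decision with the diabetes care team"
    ),
}


def surface(claim: str, grade: str) -> str:
    """Return the claim with the qualifier for its grade appended, if any."""
    qualifier = QUALIFIERS[grade]  # raises KeyError for an unknown grade
    if qualifier is None:
        return claim
    return f"{claim} ({qualifier})"


print(surface("Example claim.", "A"))
print(surface("Example claim.", "C"))
```

An unknown grade raises `KeyError` instead of falling back to an unqualified claim, so a mislabelled claim cannot quietly be presented as if it were Grade A.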
