How to Read a Research Study, Part 2 of 7

The A-to-D evidence grades

How Grace classifies every clinical claim, why each tier earns the qualifier it carries, and why Grade C is the tier most often confused with Grade A.

The taxonomy in one picture

GNL uses an A-to-D taxonomy across all clinical content. Grade flows from study design plus risk of bias plus replication. The pyramid below is the at-a-glance version; the four sections that follow set out what sits in each tier and the qualifier Grace surfaces alongside any claim drawn from that tier.

Source: Cochrane Handbook v6.5 (2024) Chapter I, plus the GRADE Working Group Handbook. The qualifier on the right is the language Grace surfaces with every claim at that tier.

Grade A, the load-bearing tier

Three categories of evidence earn the top tier:

Well-conducted systematic reviews with low risk of bias across included studies (Cochrane Handbook methodology), drawing on multiple independent trials, with network meta-analytic confidence intervals that do not cross the line of no effect.
Pivotal randomised controlled trials independently replicated, registered before enrolment, reported per CONSORT, with effect sizes that survive sensitivity analysis.
International consensus guidelines (ADA, EASD, NICE, ISPAD, IDF) that synthesise the above with named methodology.

Grace uses Grade A claims without caveat.

“Systematic reviews seek to collate evidence that fits pre-specified eligibility criteria in order to answer a specific research question. They aim to minimize bias by using explicit, systematic methods documented in advance with a protocol.”
Source: Cochrane Handbook for Systematic Reviews of Interventions v6.5 (2024), Chapter I, Key Points. The “pre-specified” and “documented in advance” clauses are load-bearing; a “systematic review” without a registered protocol is a literature review with delusions of grandeur.

Grade B, the working tier

Two categories of evidence sit one rung down from the top:

Single well-conducted RCTs before independent replication.
Large prospective cohort studies with pre-specified hypotheses and low loss to follow-up.

Grade B claims earn an explicit single-trial qualifier (“the trial showed X, replication pending”). Grace always surfaces them with that qualifier.

Grade C, the qualified tier

Four categories of evidence sit at the qualified tier:

Retrospective cohort and case-control studies with adjustment for known confounders.
Mechanistic studies in humans that establish the why but do not quantify the clinical effect.
Real-world evidence without comparator, useful for hypothesis generation, not for recommendation.
Industry pilot or pivotal trials that have NOT been independently replicated. This is the category most often confused with Grade A.

Grade C claims earn the line “evidence base is moderate; recommendation should be made jointly with the diabetes care team”. Grace uses Grade C claims for context, never for action.

Grade C is the category most often confused with Grade A. Industry pilot trials live here, not at the top of the pyramid.
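
A minimal Python sketch of one reading of how replication status separates the trial tiers described so far. The names Trial and grade_trial are hypothetical and not part of any GNL tooling. Collapsing "registered before enrolment, reported per CONSORT, survives sensitivity analysis" into a single well_conducted flag is a simplification, and the sketch returns None wherever the text does not place a trial.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Trial:
    # All field names are illustrative assumptions.
    well_conducted: bool            # low risk of bias, pre-registered, CONSORT reporting
    independently_replicated: bool  # confirmed by a separate, independent trial
    industry_run: bool              # industry pilot or pivotal trial


def grade_trial(trial: Trial) -> Optional[str]:
    """Return 'A', 'B' or 'C' for a trial, or None if the tiers do not cover it."""
    # Industry pilot or pivotal trials without independent replication sit at
    # Grade C, however strong they look on their own.
    if trial.industry_run and not trial.independently_replicated:
        return "C"
    # The text does not say where a poorly conducted trial lands.
    if not trial.well_conducted:
        return None
    # Well-conducted and replicated earns the top tier; a single
    # well-conducted RCT awaiting replication is Grade B.
    return "A" if trial.independently_replicated else "B"


if __name__ == "__main__":
    pivotal = Trial(well_conducted=True, independently_replicated=False, industry_run=True)
    print(grade_trial(pivotal))
```

The ordering of the checks carries the point of this section: an unreplicated industry trial is caught before its conduct is even considered, so it cannot be mistaken for Grade A.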

Grade D, the educated-opinion tier

Four categories of evidence sit at the educated-opinion tier:

Expert opinion without underlying systematic synthesis.
Case series and case reports.
Mechanistic studies in animal models or in vitro pending human evidence.
Educational synthesis (the AID Optimiser ladder, for instance, is a Grade D synthesis on a Grade A/B evidence base).

Grade D claims are always paired with the line that they are educational only, that the underlying evidence has not yet quantified the effect for a person, and that the diabetes care team is the right place to take the decision.
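
The pairing of grade and qualifier across the four tiers can be sketched as a lookup table. This is an illustrative Python sketch only: QUALIFIERS and surface are hypothetical names, the Grade B and C strings are the lines quoted above, and the Grade D string is a paraphrase of the description above rather than Grace's actual wording.

```python
from typing import Optional

# Grade -> qualifier surfaced alongside a claim. None means "no caveat".
QUALIFIERS: dict[str, Optional[str]] = {
    "A": None,
    "B": "the trial showed X, replication pending",
    "C": ("evidence base is moderate; recommendation should be made "
          "jointly with the diabetes care team"),
    # Paraphrased from the Grade D description; exact wording is an assumption.
    "D": ("educational only; the underlying evidence has not yet quantified "
          "the effect for a person; take the decision with the diabetes care team"),
}


def surface(claim: str, grade: str) -> str:
    """Attach the tier's qualifier to a claim; raises KeyError for unknown grades."""
    qualifier = QUALIFIERS[grade]
    if qualifier is None:
        return claim
    return f"{claim} ({qualifier})"


if __name__ == "__main__":
    print(surface("Example claim", "C"))
```

Failing loudly on an unknown grade, rather than defaulting to "no caveat", keeps an ungraded claim from being presented as if it were Grade A.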
