CGM Data Sufficiency

How Much Evidence Is Enough?

A statistician sat in a clinic teaching session, listened to two CGMs being compared as if their 5/5 scores meant the same thing, then asked a small question: how big were the studies behind those scores? One was a study of three hundred people across multiple sites. The other was a study of twelve. Both scored 5/5 on the original framework, and the room had been treating them as equivalent. The data sufficiency upgrade came directly out of that question. Answering yes to all five questions is necessary; the size of the study behind those answers is what makes them trustworthy.

The loophole the upgrade closes

The original DSN Forum UK CGM Comparison Chart scored devices on five criteria. A 5/5 score meant the accuracy data had addressed each of the five risk areas. That was meaningful progress in a market where regulatory approval did not require any of them. But a loophole emerged: a device could score 5/5 with twelve people in the study; another could score 5/5 with three hundred. The label looked the same. The generalisability did not. The January 2026 framework update added a data sufficiency requirement to close that gap.

Five criteria, plus enough people for the figures to mean something. A ±20/20 agreement of 94% in a study of twelve does not carry the same confidence as the same figure in a study of three hundred. Sufficiency is the second filter behind every device that earns its place in the GNL CGM Guide.

Two routes to data sufficiency

To meet the threshold, an accuracy study has to satisfy at least one of two conditions. Both aim at the same underlying property: accuracy figures stable enough to generalise.

Route A, minimum participant count

At least 50 participants in the accuracy study. The clearest route: enough people that the figures can be considered reasonably generalisable to a broader population. Most pivotal CGM trials in the cluster sit well above this floor.

Route B, high data-point density

Fewer than 50 participants, but with a very high number of paired CGM-to-reference data points per participant. Some intensive study designs achieve this with tight sensor-paired sampling and multiple sensors per person. If the total matched pairs are sufficient to produce stable accuracy statistics, a study with around 40 participants can meet the threshold.

A small study with sparse sampling produces accuracy numbers that can shift significantly with a handful of outliers. A study that meets either route produces figures that are less sensitive to which individuals happened to be in the cohort. For a device used by hundreds of thousands of people to drive insulin dosing, the evidence base needs to be big enough that the numbers describe the population, not just the sample.
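The two routes can be read as a simple decision rule. The sketch below is illustrative only: the 50-participant floor comes from the text, but the matched-pair cut-off for Route B is not stated here, so `MIN_MATCHED_PAIRS` is a hypothetical placeholder, and the `AccuracyStudy` class and `sufficiency_route` function are names invented for this example. The last two studies in the list are made-up inputs, not real trials.

```python
from dataclasses import dataclass
from typing import Optional

MIN_PARTICIPANTS = 50        # Route A floor, as stated in the text
MIN_MATCHED_PAIRS = 10_000   # HYPOTHETICAL placeholder: the real Route B cut-off is not given


@dataclass
class AccuracyStudy:
    name: str
    participants: int
    matched_pairs: Optional[int] = None  # None when the pair count is not reported


def sufficiency_route(study: AccuracyStudy,
                      min_participants: int = MIN_PARTICIPANTS,
                      min_pairs: int = MIN_MATCHED_PAIRS) -> Optional[str]:
    """Return "A" or "B" for the route a study satisfies, or None if neither."""
    # Route A: enough participants on its own.
    if study.participants >= min_participants:
        return "A"
    # Route B: fewer participants, but dense paired sampling (assumed cut-off).
    if study.matched_pairs is not None and study.matched_pairs >= min_pairs:
        return "B"
    return None


studies = [
    AccuracyStudy("Dexcom G7 (Garg et al. 2022)", 316, 77_774),
    AccuracyStudy("Eversense 365 (ENHANCE)", 110, 40_497),
    AccuracyStudy("hypothetical dense small study", 40, 12_000),
    AccuracyStudy("hypothetical sparse small study", 12, 900),
]
for s in studies:
    print(s.name, "->", sufficiency_route(s) or "not met")
```

Route A is checked first because it is the clearer test; Route B only matters for studies that fall under the participant floor. A study with an unreported pair count cannot be assessed on Route B in this sketch, so it returns "not met" rather than guessing.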

Current sufficiency status by device

The five mainstream devices in the GNL CGM cluster all meet the sufficiency threshold. The two devices in the watching list are pending: peer-reviewed publication and the wider evidence base needed for confident generalisability are the gating step. Numbers below reflect published evidence as of April 2026.

Sufficiency status, mainstream devices

Dexcom G7. Met. Garg et al. 2022, n=316, 619 sensors, 77,774 matched reference pairs.

FreeStyle Libre 3 (and 3 Plus). Met. Abbott pivotal 2022 (Libre 3), n=72; Vaughan et al. 2025 meta-analysis pooling a decade of Libre studies provides the wider population layer.

Roche Accu-Chek SmartGuide. Met. Mader et al. 2024, n=48 with three sensors per participant (139 sensors analysed), three German and Austrian sites. n=48 sits just below the Route A floor of 50; the three-sensor design supplies the data-point density that supports sufficiency under Route B.

MiniMed Simplera Sync. Met. CIP330 pivotal trial, n=243, ages 2 to 80 years, FDA submission data; peer-reviewed publication anticipated.

Senseonics Eversense 365. Met. Bailey et al. 2025 ENHANCE trial, n=110 adults, 40,497 matched YSI reference pairs.

Sufficiency status, watching list

CareSens Air (Spirit Health). Pending. CE marked; manufacturer-published 15-day accuracy data on file. Pivotal-trial publication and the wider evidence base needed to clear sufficiency are the next required step. Once peer-reviewed accuracy lands, the page is rebuilt to flagship format and the device joins the mainstream row.

GlucoMen iCan (Menarini Diagnostics). Pending. CE marked; published data is pending or under review. Same gating conditions as CareSens Air.

Why sufficiency matters clinically

CGM accuracy claims from small studies are real claims from real measurements. But sampling effects bite hardest at small numbers. If the participants happen to have more stable glucose patterns, the device looks better than it would across a broader population. If they happen to have unusually variable glucose, it looks worse. Only larger studies smooth this out. Sufficiency is a way of saying: the figure on the page is the figure you can trust to describe how the device behaves in the people who will actually use it.
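A small simulation makes the sampling effect concrete. It is a sketch under loud assumptions, not a model of any real device: each participant is given their own "true" agreement rate drawn from a Beta(47, 3) distribution (mean 0.94, chosen only to echo the 94% example), every participant is weighted equally, and measurement noise within a participant is ignored. The function name `cohort_agreement_spread` is invented for this illustration.

```python
import random
import statistics


def cohort_agreement_spread(cohort_size: int,
                            n_cohorts: int = 1000,
                            seed: int = 0) -> float:
    """Standard deviation of the cohort-level agreement rate across
    many simulated studies of the same size."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = []
    for _ in range(n_cohorts):
        # ASSUMPTION: per-participant agreement rates follow Beta(47, 3), mean 0.94.
        rates = [rng.betavariate(47, 3) for _ in range(cohort_size)]
        # The study's headline figure is taken as the simple average.
        estimates.append(statistics.fmean(rates))
    return statistics.pstdev(estimates)


for n in (12, 300):
    spread = cohort_agreement_spread(n)
    print(f"cohort of {n:>3}: study-to-study spread = {spread:.4f}")
```

Because the headline figure is an average over independent participants, its study-to-study spread shrinks roughly with the square root of cohort size, so about fivefold between a cohort of 12 and a cohort of 300 under these assumptions. The same true device can look noticeably better or worse in a small study purely because of who was enrolled.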
