Mean Absolute Relative Difference
Mean Absolute Relative Difference (MARD) is a statistical metric used to evaluate the accuracy of glucose-monitoring technology, especially continuous glucose monitors (CGMs). It is the average of the absolute differences between CGM readings and a reference measurement, taken relative to the reference value and typically expressed as a percentage. A lower MARD indicates closer agreement with the reference and, all else equal, greater trust in readings for making day-to-day decisions about care.
In practice, MARD is a widely used yardstick for comparing device performance, calibrations, and algorithms across studies and product generations. It sits alongside other clinical and analytical measures, such as time in range and standards-based benchmarks like ISO 15197, as part of a broader toolkit for assessing how well a CGM performs in real life. The discussion below explains what MARD captures, what it does not, and the debates surrounding its use, approached from a market-oriented, innovation-friendly perspective.
Definition and calculation
Calculation
MARD is computed by taking the mean of the absolute relative differences between CGM readings and corresponding reference measurements. A common formulation is:

$$\mathrm{MARD} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{\mathrm{CGM}_i - \mathrm{Ref}_i}{\mathrm{Ref}_i} \right|$$

where $\mathrm{CGM}_i$ is the i-th CGM reading, $\mathrm{Ref}_i$ is the paired reference measurement, and $n$ is the number of paired observations. The reference is typically a lab-grade or clinic-grade measurement of blood glucose, such as venous plasma glucose or a calibrated point-of-care method, collected contemporaneously with the CGM readings.
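The formula can be sketched in a few lines of Python. This is a minimal illustration, not a validated implementation: the function name `mard`, the input checks, and the sample readings (in mg/dL) are assumptions made for the example and do not come from any study.

```python
from typing import Sequence


def mard(cgm: Sequence[float], ref: Sequence[float]) -> float:
    """Return MARD in percent for paired CGM and reference readings."""
    if len(cgm) != len(ref):
        raise ValueError("cgm and ref must have the same length")
    if len(cgm) == 0:
        raise ValueError("at least one paired observation is required")
    if any(r <= 0 for r in ref):
        # The relative difference divides by the reference value.
        raise ValueError("reference values must be positive")
    # Sum of |CGM_i - Ref_i| / Ref_i over all pairs, then average and scale.
    total = sum(abs(c - r) / r for c, r in zip(cgm, ref))
    return 100.0 * total / len(cgm)


# Hypothetical paired readings in mg/dL (illustrative values only).
cgm_readings = [110.0, 90.0, 210.0, 60.0]
ref_readings = [100.0, 100.0, 200.0, 50.0]
mard_percent = mard(cgm_readings, ref_readings)
print(f"MARD = {mard_percent:.2f}%")
```

Every pair in this sample differs by the same 10 mg/dL, yet the relative differences are 10%, 10%, 5%, and 20%, giving a MARD of about 11.25%. The same absolute error weighs more heavily when the reference value is low.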
Interpretation
Because MARD is an aggregate statistic expressed as a percentage, smaller values imply that, on average, CGM readings track the reference more closely. Reported MARD values for modern CGMs commonly fall in the single-digit to low-double-digit percentage range, depending on device, study protocol, and whether calibration is used. However, a single number does not reveal how errors are distributed across the measurement range or how they could affect clinical decisions in real life. For that reason, researchers supplement MARD with other analyses, such as the Clarke Error Grid or the more recent Parkes Error Grid, to understand the clinical impact of errors.
Limitations and criticisms
What MARD does and does not capture
- What it captures: the average relative error magnitude across paired data, giving a straightforward sense of overall agreement with the reference.
- What it misses: the distribution of errors (e.g., whether errors cluster at clinically important ranges), the direction of bias, and whether errors would lead to mismanagement of hypoglycemia or hyperglycemia in real-world use. MARD alone may not reflect how an error translates into time-in-range outcomes or into patient risk.
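The point about direction of bias can be made concrete with a small sketch. The two devices below are hypothetical, and the helper names `relative_differences` and `summarize` are assumptions introduced for illustration. Device A always reads 10% high, while device B alternates between 10% high and 10% low.

```python
from statistics import mean


def relative_differences(cgm, ref):
    """Signed relative differences in percent, one per paired reading."""
    return [(c - r) / r * 100.0 for c, r in zip(cgm, ref)]


def summarize(cgm, ref):
    """Return (MARD, mean signed relative difference), both in percent."""
    rel = relative_differences(cgm, ref)
    mard_value = mean([abs(d) for d in rel])  # magnitude only, sign discarded
    bias_value = mean(rel)                    # sign kept, so errors can cancel
    return mard_value, bias_value


# Hypothetical reference values in mg/dL (illustrative only).
ref = [80.0, 100.0, 150.0, 200.0]
device_a = [r * 1.10 for r in ref]  # consistently 10% high
device_b = [r * f for r, f in zip(ref, [1.10, 0.90, 1.10, 0.90])]  # alternating

mard_a, bias_a = summarize(device_a, ref)
mard_b, bias_b = summarize(device_b, ref)
print(f"Device A: MARD {mard_a:.1f}%, signed bias {bias_a:+.1f}%")
print(f"Device B: MARD {mard_b:.1f}%, signed bias {bias_b:+.1f}%")
```

Both devices have a MARD of about 10%, but only the signed mean shows that device A overestimates systematically while device B has roughly zero net bias.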
Practical concerns
- Sensitivity to outliers: a few large measurement mismatches can disproportionately affect MARD, even if most readings are accurate.
- Dependence on the reference method: differences in reference measurement technique, timing, and sample handling influence MARD and can complicate cross-study comparisons.
- Range and context: device performance can vary across glucose ranges (hypoglycemia, normoglycemia, hyperglycemia) and across user populations (adults vs. children) in ways that MARD alone may not reveal.
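Two of these concerns, outlier sensitivity and range dependence, can be explored with a short sketch. The 70 and 180 mg/dL cut-offs are commonly used thresholds but are treated here as assumptions, and the function names and paired readings are hypothetical.

```python
from statistics import mean, median

# Assumed range boundaries in mg/dL; studies may use different cut-offs.
LOW_MG_DL = 70.0
HIGH_MG_DL = 180.0


def ard_percent(cgm, ref):
    """Absolute relative difference of one pair, in percent."""
    return abs(cgm - ref) / ref * 100.0


def glucose_range(ref):
    """Classify a pair by its reference value."""
    if ref < LOW_MG_DL:
        return "hypoglycemia"
    if ref > HIGH_MG_DL:
        return "hyperglycemia"
    return "normoglycemia"


def mard_by_range(pairs):
    """Return a dict mapping each glucose range to the MARD of its pairs."""
    buckets = {}
    for cgm, ref in pairs:
        buckets.setdefault(glucose_range(ref), []).append(ard_percent(cgm, ref))
    return {name: mean(values) for name, values in buckets.items()}


# Hypothetical (CGM, reference) pairs in mg/dL.
pairs = [(66.0, 60.0), (45.0, 50.0), (105.0, 100.0),
         (114.0, 120.0), (210.0, 200.0), (237.5, 250.0)]

by_range = mard_by_range(pairs)
overall = mean([ard_percent(c, r) for c, r in pairs])

# Add one gross mismatch and compare the mean with the median.
with_outlier = pairs + [(200.0, 100.0)]
ards = [ard_percent(c, r) for c, r in with_outlier]
mean_with_outlier = mean(ards)
median_with_outlier = median(ards)
```

In this sample the overall MARD is about 6.7%, which hides a hypoglycemic-range MARD of about 10% against about 5% elsewhere. One gross mismatch lifts the mean to about 20% while the median absolute relative difference stays near 5%, which is why some analyses report both.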
Controversies and debates
Clinically meaningful versus purely numerical accuracy
Critics note that two devices with similar MARD values can perform differently in clinically important situations. To address this, evaluators increasingly pair MARD with clinical risk assessments like the Clarke or Parkes error grids and with time-in-range data. From a market-friendly viewpoint, combining these metrics provides a more complete picture of a device’s value to patients and clinicians without over-reliance on a single statistic.
Regulation, innovation, and the weight given to MARD
Some observers argue that heavy emphasis on MARD in regulatory or reimbursement pathways could slow innovation or raise barriers to entry for new technologies. Proponents of a pragmatic, innovation-oriented approach argue that MARD remains a useful, transparent benchmark among several, and that it should be complemented by real-world evidence and usability data rather than used as a single gatekeeper.
Real-world relevance versus controlled studies
Questions persist about how well MARD measured in tightly controlled studies translates to everyday use, where factors like sensor placement, user calibration, physical activity, and environmental conditions come into play. Advocates for a practical stance emphasize the importance of post-market surveillance and real-world performance metrics (such as time in range and user satisfaction) alongside MARD.
Alternatives and complements
- Error-grid analyses: Clarke Error Grid and Parkes Error Grid categorize readings by their potential clinical impact, offering context that a percentage value alone cannot provide.
- Time in range (TIR): reflects the proportion of time glucose levels stay within target bounds, aligning more directly with patient outcomes than a single accuracy statistic.
- Bland-Altman analysis: examines the agreement between two measurement methods across the measurement spectrum and can reveal systematic bias or limits of agreement.
- Regulatory standards: ISO 15197, written for blood glucose monitoring systems used for self-testing, and similar standards define accuracy requirements and testing protocols that influence device development and conformity assessment.
- Clinical outcome metrics: measures of variability, hypo-/hyperglycemia frequency, and patient-reported outcomes can illuminate how device accuracy translates into real-world benefit.
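Two of the complements listed above are simple enough to sketch. The example below computes Bland-Altman bias with 95% limits of agreement (mean difference plus or minus 1.96 sample standard deviations) and a basic time-in-range percentage. The function names, the 70 to 180 mg/dL target bounds, and the readings are assumptions for illustration, and the time-in-range calculation assumes evenly spaced readings.

```python
from statistics import mean, stdev


def bland_altman(cgm, ref):
    """Return (bias, lower limit, upper limit) of agreement in mg/dL."""
    diffs = [c - r for c, r in zip(cgm, ref)]
    bias = mean(diffs)   # systematic offset between the two methods
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def time_in_range(readings, low=70.0, high=180.0):
    """Percentage of readings within [low, high].

    Treating the share of readings as a share of time assumes the
    readings are evenly spaced.
    """
    in_range = sum(1 for g in readings if low <= g <= high)
    return 100.0 * in_range / len(readings)


# Hypothetical paired readings in mg/dL (illustrative only).
cgm = [105.0, 95.0, 190.0, 65.0, 150.0]
ref = [100.0, 100.0, 180.0, 60.0, 155.0]

bias, lower, upper = bland_altman(cgm, ref)
tir = time_in_range(cgm)
print(f"Bias {bias:+.1f} mg/dL, limits of agreement [{lower:.1f}, {upper:.1f}]")
print(f"Time in range: {tir:.0f}%")
```

Unlike MARD, the Bland-Altman summary keeps the sign of the differences and reports their spread in absolute units, while time in range describes glucose control rather than measurement accuracy.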
Regulatory and industry context
Regulators and manufacturers use MARD as a practical, communicable metric to compare devices and to validate performance claims. It forms part of the evidentiary mix used to assess device readiness for clinical use and for market clearance in many jurisdictions. Critics caution that a narrow focus on MARD can obscure broader considerations such as cost, reliability, calibration burden, user experience, and accessibility. Balancing precision with usability and affordability remains a central concern in the industry, as innovators push toward more accurate sensors, longer wear times, and reduced calibration requirements.