The main goal of clinical orthopaedic research is determining how to best treat patients. Studies are designed to answer the following questions: Do patients improve with a particular intervention? Or do the results indicate that the intervention makes things worse, even to the point of inflicting harm on the patient?
Improvement is measured by many outcomes, including decreased pain, better function, and enhanced quality of life. In determining whether a particular intervention is beneficial, studies that compare the treatment to a placebo are preferred whenever possible. Randomized controlled trials are the research gold standard because they minimize the confounding and bias that might lead to faulty conclusions about the treatment of interest.
In developing evidence-based clinical practice guidelines (CPGs), organizations such as the AAOS rely on an evaluation of the evidence by clinical physician experts and methodologists. The inclusion criteria for the literature are specified before any search is conducted to ensure that the CPGs do not address only articles that support a particular point of view.
The research analysts who extract data from included studies never read an author’s conclusions. They read the methods and statistics sections of the articles and extract the data applicable to a given recommendation. Based on all of the available statistical data from all of the studies included for a recommendation, the AAOS then conducts its own de novo analysis.
Determining “significance”
By convention, “statistical significance” is established when the P-value of a sample statistic is less than 0.05 or when the 95 percent confidence interval for a difference does not include zero. The P-value is the probability of observing a difference at least as large as the one found if, in actuality, no difference between the groups exists.
A statistically significant finding may or may not indicate that the treatment effect or potential harm is clinically meaningful, because even a minuscule and unimportant difference will be statistically significant if the sample size is sufficiently large.
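The dependence of statistical significance on sample size can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions: it uses a two-sided normal (z) approximation for the difference between two equal-sized groups with a common standard deviation, and the 0.5-point difference, the standard deviation of 15, and the sample sizes are hypothetical values that do not come from any study discussed here.

```python
import math


def two_sample_p_value(mean_diff, sd, n_per_group):
    """Two-sided P-value for a difference in means between two equal-sized
    groups, using a normal (z) approximation with a common SD.

    This is a simplified illustration, not the analysis used in any CPG.
    """
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = abs(mean_diff) / standard_error
    # erfc(z / sqrt(2)) is the two-sided tail area of the standard normal.
    return math.erfc(z / math.sqrt(2.0))


# Hypothetical values: a 0.5-point difference on a 100-point scale, SD of 15.
MEAN_DIFF = 0.5
SD = 15.0

for n in (50, 500, 5000, 50000):
    p = two_sample_p_value(MEAN_DIFF, SD, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n per group = {n:>6}: P = {p:.4g} ({verdict})")
```

The observed difference is identical in every row; only the sample size changes. With these assumed numbers, the P-value shrinks steadily as the groups grow and drops below 0.05 only at the largest sample size.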
“Clinical significance” means that the effect size is large enough to be important to patients. What constitutes “clinical significance” can be influenced by the views of various parties interested in health care.
Some clinically important effects are defined by physicians. For example, orthopaedic reconstructive surgeons define a pseudotumor associated with metal-on-metal implants as clinically meaningful, even if the patient is asymptomatic.
On the other hand, national or systems health policy makers might base clinical importance on the ability to determine value across a population of patients. Discrete or binary measures of clinical significance are typically easier to understand than continuous measures.
A study evaluating the difference in blood loss between standard and minimally invasive total hip arthroplasty can be used to illustrate the difference between statistical and clinical significance. Even though the reported difference in blood loss between the two patient groups was statistically significant (P < 0.001), the actual difference was 52 mL, an amount that made no clinical difference to the patients or their care.
Similarly, a study comparing single-injection and continuous peripheral nerve blocks with patient-controlled analgesia for total knee arthroplasty showed a statistically significant (P < 0.05) reduction in pain on the Visual Analog Scale (VAS) for continuous nerve block compared with a single injection. But the difference in VAS was just 0.57, hardly likely to be clinically important. Although statistical significance is important in determining the effectiveness of a treatment, it does not address the clinical relevance of any changes measured.
MCII
A number of methods for determining clinical significance have been described, and the AAOS has adopted the minimum clinically important improvement (MCII) when it is available. MCII is conceptually similar to minimally important difference (MID), which generally refers to the smallest amount of change that matters to a patient. The MCII has most commonly been evaluated for the Western Ontario and McMaster Universities Arthritis Index (WOMAC) patient-reported outcome (PRO) tool, although it has been computed for other PROs as well (eg, Oswestry Disability Index).
PROs and health-related quality-of-life assessments attempt to score patients’ conditions and outcomes from the patients’ perspective. Part of the allure of these tools is the ability to evaluate the aggregate of patient-reported outcomes of treatment. However, disagreement exists as to the most meaningful approach for combining MCII with PROs.
Two options are the anchor- and the distribution-based methods. In an anchor-based method, patients are asked to report whether they feel “better” or “worse” after treatment. The weakness of this approach is that it relies on the patients’ recall of their condition prior to treatment. The distribution-based approach calculates MCII according to group score changes in outcomes. However, this approach has also been criticized because it trades one statistical approach to measure significance for another.
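The two families of methods can be contrasted in a minimal sketch. Everything specific here is an assumption made for illustration: the article does not prescribe formulas, so the distribution-based estimate uses the common "half a standard deviation of baseline scores" rule of thumb, the anchor-based estimate uses the mean score change among patients who report feeling "better", higher change scores are taken to mean improvement, and all patient data are hypothetical. Published MCII studies use a variety of other definitions.

```python
import statistics


def distribution_based_threshold(baseline_scores, sd_fraction=0.5):
    """Distribution-based estimate: a fixed fraction of the baseline SD.

    The default of 0.5 is the 'half a standard deviation' rule of thumb,
    chosen here as an assumption for illustration.
    """
    return sd_fraction * statistics.stdev(baseline_scores)


def anchor_based_threshold(changes, anchors, improved_label="better"):
    """Anchor-based estimate: mean score change among patients whose
    self-reported anchor answer matches `improved_label`."""
    improved = [c for c, a in zip(changes, anchors) if a == improved_label]
    if not improved:
        raise ValueError("no patients reported improvement")
    return statistics.mean(improved)


# Hypothetical cohort of eight patients (positive change = improvement).
baseline = [40, 55, 60, 45, 50, 65, 35, 50]
change = [12, 3, 15, -2, 10, 1, 14, 9]
anchor = ["better", "same", "better", "worse",
          "better", "same", "better", "better"]

print("distribution-based:", distribution_based_threshold(baseline))
print("anchor-based:", anchor_based_threshold(change, anchor))
```

Even on the same hypothetical cohort, the two estimates differ, which reflects the disagreement described above: the anchor-based value depends on what patients recall and report, and the distribution-based value depends only on the spread of scores.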
Intuitively, MCII—because it is based on changes that matter to patients—is a more accurate approach than statistical significance in determining the true clinical meaningfulness of a treatment. It gives weight to statistical significance, but acknowledges that statistical significance alone is not enough for a patient or a clinician to make an informed decision.
Recognizing the importance of assessing practical relevance to care, the AAOS uses MCII (or MID when MCII is not available) where possible in its evidence-based CPGs. This approach was first applied to the guideline development program in 2008 and has been used since then whenever an MCII or MID is included as part of the evidence.
Critics of the MCII highlight the following two concerns with its use:
- It sets a higher bar in determining a difference between two or more treatments.
- It cannot be applied to between-group analyses (treatment versus placebo).
The AAOS strives to be a leader in defining quality, efficacy, and effectiveness in musculoskeletal health care. The trend toward comparative effectiveness and patient-centered outcomes research mandates that clinically important outcomes in care processes be defined. Although calculation of an MCII begins at the individual level, so do most outcome measures, including pain, implant loosening, and fracture healing. Outcome measures such as WOMAC are routinely applied to patient populations.
MCIIs are calculated as a function of a distribution of patient scores rather than on individuals. The average of changes relative to the MCII is similar to average changes in any other individual item.
Clinical significance in CPGs
The AAOS adheres to the expectation that if a treatment is effective in the population, its effect should both reach statistical significance and be larger than the MCII. This use of MCII might lead reviewers of the literature to overlook some worthwhile effects in undifferentiated subpopulations, but this issue can be addressed through proper subgroup analyses. As long as the populations of interest are comparable to those analyzed in the MCII validation studies, evaluating treatment effects on the basis of the MCII to determine clinical relevance in CPGs is appropriate. Clinical significance must be considered unless it is impossible to make such comparisons.
The best opportunity for determining clinical significance lies in conducting studies that calculate the MCII (or MID) for acceptable orthopaedic measures and then putting them to consistent use in outcomes research. Because much of the evidence currently available evaluates only statistical significance, methodology needs to be added for determining clinical relevance. This will enable practical comparisons of overall effect size when clinical studies are examined cumulatively. The AAOS evidence-based guideline program will continue to integrate techniques for assessing clinical meaningfulness when such information is available in orthopaedic literature that meets the highest standards.
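The two-part expectation described above can be written as a simple decision rule. The sketch below is an illustration only, not the AAOS analysis procedure: the `EffectEstimate` class, the `classify` function, the category labels, and all numbers are hypothetical, and statistical significance is judged by whether the 95 percent confidence interval excludes zero.

```python
from dataclasses import dataclass


@dataclass
class EffectEstimate:
    """A treatment effect with its 95 percent confidence interval."""
    mean_difference: float
    ci_lower: float
    ci_upper: float


def is_statistically_significant(effect):
    # Significant when the confidence interval does not include zero.
    return effect.ci_lower > 0 or effect.ci_upper < 0


def classify(effect, mcii):
    """Label an effect by whether it is statistically significant and
    whether its magnitude is larger than the MCII threshold."""
    if not is_statistically_significant(effect):
        return "not statistically significant"
    if abs(effect.mean_difference) > mcii:
        return "statistically and clinically significant"
    return "statistically significant only"


# Hypothetical effects on a scale with an assumed MCII of 10 points.
MCII = 10.0
examples = [
    EffectEstimate(4.0, 1.0, 7.0),
    EffectEstimate(12.0, 8.0, 16.0),
    EffectEstimate(12.0, -1.0, 25.0),
]
for effect in examples:
    print(effect.mean_difference, "->", classify(effect, MCII))
```

This rule compares only the point estimate with the MCII. A stricter variant, also an assumption rather than anything stated here, would require the whole confidence interval to lie beyond the threshold.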
David S. Jevsevar, MD, MBA, chairs the AAOS Committee on Evidence-Based Quality and Value.
References
- Vavken P, Heinrich KM, Koppelhuber C, Rois S, Dorotka R: The use of confidence intervals in reporting orthopaedic research findings. Clin Orthop Relat Res 2009;467(12):3334–3339.
- Vavken P, Kotz R, Dorotka R: Minimally invasive hip replacement: A meta-analysis. Z Orthop Unfall 2007;145(2):152–156.
- Chan EY, Fransen M, Sathappan S, Chua NH, Chan YH, Chua N: Comparing the analgesia effects of single-injection and continuous femoral nerve blocks with patient controlled analgesia after total knee arthroplasty. J Arthroplasty 2013;28(4):608–613. doi: 10.1016/j.arth.2012.06.039.