In scientific medicine, randomized controlled trials (RCTs) are the highest level of evidence used to evaluate the most effective interventions. These studies are designed to reduce bias and provide reliable comparisons between treatment groups. However, the interpretation of RCT results typically rests on statistical significance, most commonly expressed as a P value. As RCTs often set gold standards, examine new techniques, and assess outcomes in orthopaedic surgery, it is essential that orthopaedic surgeons be able to critically assess the results of these trials, which commonly shape clinical practice. If the results of landmark trials are judged solely on a significant or nonsignificant P value, there may be direct consequences for the treatments selected for patients.
P values offer insights into significant differences between two treatment groups; however, the clinical applicability of these findings has come into question. Because significance depends on a fixed threshold (P < 0.05), the conclusions of these studies can be vulnerable to minor changes in the data or endpoints. This limitation has led to the development of supplementary tools, such as the fragility index (FI), that aim to evaluate the reliability of statistically significant findings (P < 0.05) in these types of studies.
The FI provides a tangible way to assess how “fragile” or “stable” a reported result may be. For example, an FI of 1 indicates that changing the outcome for just one patient would render the result nonsignificant. Such a score raises concerns if more than one patient was lost to follow-up. Fragility indices have gained early traction in orthopaedic research, where RCTs frequently report binary outcomes such as revision or complication rates. These trials commonly guide clinical decision making, yet their findings may be sensitive to small shifts in data. In this context, FI emerged as a potentially valuable tool to help clinicians interpret whether a statistically significant result is also clinically reliable.
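To make the calculation concrete, the following Python sketch counts how many patients in one arm would need a different outcome before a two-sided Fisher exact test stops being significant. It is a minimal illustration and not a validated statistical tool. The function names, the choice of Fisher's exact test, the rule of changing outcomes in the arm with the lower event rate, and the trial numbers in the usage line are all assumptions made for this example.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x events in the first row
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Add up every possible table that is no more likely than the observed one
    p = sum(prob(x) for x in range(lo, hi + 1)
            if prob(x) <= observed * (1 + 1e-9))
    return min(1.0, p)


def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Return the FI, or None when the starting result is not significant."""

    def p_value(ea, eb):
        return fisher_exact_p(ea, n_a - ea, eb, n_b - eb)

    if p_value(events_a, events_b) >= alpha:
        return None  # traditional FI is undefined for nonsignificant results

    # Assumption: outcomes are changed in the arm with the lower event rate
    modify_a = events_a / n_a <= events_b / n_b
    flips = 0
    while p_value(events_a, events_b) < alpha:
        if modify_a:
            if events_a == n_a:
                return None  # safety guard: no non-events left to change
            events_a += 1
        else:
            if events_b == n_b:
                return None
            events_b += 1
        flips += 1
    return flips


# Hypothetical trial: 2 of 100 revisions in one arm vs 12 of 100 in the other
print(fragility_index(2, 100, 12, 100))
```

Published FI calculators may differ in the test used or in how the arm to modify is chosen, so a value from this sketch should be read as illustrative only.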
Limitations of traditional FI
Traditional FI can be calculated only for statistically significant binary outcomes. It cannot be applied to continuous outcomes or nonsignificant findings, which limits its utility for key clinical metrics and the most common patient-reported outcome measures (PROMs).
Additionally, FI depends entirely on the P value, which has long been criticized for its lack of clinical relevance. Recent editorials have raised concerns about the overuse and potential misinterpretation of FI. Critics argue that it adds little beyond conventional statistical reporting and reinforces the false dichotomy between “significant” and “nonsignificant” results.
Another major limitation is the absence of standardized thresholds for what constitutes a fragile or robust trial. Although some suggest that an FI <3 indicates fragility and an FI >10 suggests robustness, these benchmarks are arbitrary and lack empirical validation. Without defined criteria, interpretation of fragility indices remains relative and dependent on comparisons with existing literature, making it difficult to assess findings in isolation. Despite these concerns, FI studies continue to be widely published. This trend reflects a persistent need for tools that extend beyond the P value to assess true clinical ramifications and statistical relevance, and it underscores the importance of refining how FI is used and interpreted.
Expanding the fragility index
Many studies reporting FI are titled “Fragility of RCTs on…,” implying a comprehensive evaluation of study reliability. This is misleading.
FI only assesses a narrow subset of outcomes and excludes continuous measures that often reflect clinically meaningful change. PROMs, functional scores, and range of motion are all central to orthopaedic decision making but are omitted from the traditional, dichotomized FI analyses.
To address this shortcoming, Caldwell et al. introduced a method for calculating the continuous fragility index (CFI), which expands the fragility framework to include continuous outcomes. CFI estimates how many changes in continuous outcome data would be required to reverse statistical significance. This approach allows for a more complete analysis of trial robustness, particularly for clinically meaningful endpoints. CFI is especially useful in trials with substantial loss to follow-up. Just as FI highlights whether small changes could overturn binary outcomes, CFI allows for a similar evaluation of continuous metrics. For example, if a statistically significant improvement in a PROM or functional outcome is based on a small CFI and a large number of patients were lost to follow-up, the significant results may warrant closer evaluation. This approach is particularly important in orthopaedic trials, where PROMs are often the primary outcome and binary endpoints such as complications or mortality may not fully capture treatment efficacy.
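The general idea of counting changes to continuous data can be illustrated with a short Python sketch. This is not the algorithm published by Caldwell et al.; it is a simplified stand-in that assumes patient-level scores are available, replaces the largest scores in the higher-scoring arm with the other arm's mean one at a time, and uses a normal approximation to Welch's test. The function names and that replacement rule are assumptions made for illustration.

```python
from statistics import NormalDist, mean, variance


def welch_p_normal_approx(x, y):
    """Two-sided P value for a difference in means.

    Uses Welch's standard error with a normal approximation, an assumption
    that is only reasonable for larger samples.
    """
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    if se == 0:
        return 1.0 if mean(x) == mean(y) else 0.0
    z = (mean(x) - mean(y)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def continuous_fragility_sketch(arm_x, arm_y, alpha=0.05):
    """Count replaced scores needed to lose significance (illustrative only).

    Returns None when the starting comparison is not significant, or when
    significance is never lost under this replacement rule.
    """
    if welch_p_normal_approx(arm_x, arm_y) >= alpha:
        return None

    # Identify the arm with the higher mean score
    if mean(arm_x) > mean(arm_y):
        high, low = arm_x, arm_y
    else:
        high, low = arm_y, arm_x

    values = sorted(high)
    target = mean(low)
    for k in range(1, len(values) + 1):
        # Assumption: overwrite the k-th largest score with the other arm's mean
        values[-k] = target
        if welch_p_normal_approx(values, low) >= alpha:
            return k
    return None
```

A small count from a sketch like this would carry the same caution described above: if more patients were lost to follow-up than the number of changes needed, the significant result deserves a closer look.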
Clinical implications and recommendations
Fragility indices are not replacements for traditional statistical measures such as P values and confidence intervals but should be used to complement them. When applied thoughtfully, they can help clinicians identify meaningful results and recognize when study design, follow-up, or statistical analyses need further evaluation. For instance, a trial with a CFI of 1 and a substantial loss to follow-up should prompt critical appraisal before its findings influence clinical practice.
These tools may be particularly valuable in areas of persistent clinical uncertainty, such as proximal humerus fracture management or treatment strategies for rotator cuff tears, where optimal surgical treatment options are not clear. Ultimately, FI and CFI may help clinicians make better-informed decisions by highlighting trials with statistically significant yet potentially unstable findings. Familiarity with FI and CFI can support a more nuanced appraisal of even level one evidence and a more responsible integration of trial results into clinical care.
Aghdas Movassaghi, BS, is a medical student at Michigan State University College of Human Medicine, East Lansing, Michigan.
Matthew T. McKinley, MBA, is a medical student at Nova Southeastern University Dr. Kiran C. Patel College of Osteopathic Medicine in Davie, Florida.
Jocelyn Lubert, MD, is a research fellow at the Orthopaedic Center of Palm Beach County in Atlantis, Florida.
Vani J. Sabesan, MD, is a board-certified orthopaedic surgeon and shoulder and elbow specialist in Palm Beach, Florida. She served as chief of quality and held the Lang Family Endowed Chair in Orthopaedic Research at Cleveland Clinic Florida. Before that, she served as program director of the Orthopaedic Residency Training Program at HCA Florida JFK/University of Miami.
References
- Tignanelli CJ, Napolitano LM. The fragility index in randomized clinical trials as a means of optimizing patient care. JAMA Surg. 2019;154(1):74-79. doi:10.1001/jamasurg.2018.4318
- Walsh M, Srinathan SK, McAuley DF, Mrkobrada M, Levine O, Ribic C, et al. The statistical significance of randomized controlled trial results is frequently fragile: A case for a Fragility Index. J Clin Epidemiol. 2014;67(6):622-628. doi:10.1016/j.jclinepi.2013.10.019
- Cote MP, Lubowitz JH, Rossi MJ, Matzkin E. The fragility index is typically misinterpreted and of low value: Clinical trials are designed to be fragile. Arthroscopy. Published online August 15, 2024.
- Cote MP, Mazzocca AD, Warner JP. The fragility index minimally improves interpretation of the medical literature: A boat made of bricks in a sea of uncertainty. Arthroscopy. Published online October 16, 2024.
- Oeding JF, Krych AJ, Camp CL, Varady NH. The number of patients lost to follow-up may exceed the fragility index of a randomized controlled trial without reversing statistical significance: A systematic review and statistical model. Arthroscopy. Published online May 21, 2024.
- Caldwell JE, Youssefzadeh K, Limpisvasti O. A method for calculating the fragility index of continuous outcomes. J Clin Epidemiol. 2021;136:20-25. doi:10.1016/j.jclinepi.2021.02.023
- Abesteh J, Al-Asadi M, Khalik HA, et al. The continuous fragility index of outcomes in rotator cuff repair augmentation randomized trials: A systematic review. Shoulder Elbow Surg. Published online December 30, 2024.
- McKinley MT, Movassaghi A, Burzynski C, Smith L, Zhou G, Childers JT, Lubert J, Jackson GR, Sabesan VJ. Evaluating the fragility and robustness of randomized controlled trials in proximal humerus fracture management. JSES International. Published online September 17, 2025.