Darrel S. Brodke, MD, explores the pros and cons of using patient-reported outcome tools to measure outcomes after spine surgery.

AAOS Now

Published 5/1/2015 | Jennie McKee

Measuring Patient-Reported Outcomes in Spine Surgery

Experts explore advantages and limitations of outcome measurement tools

“It’s an evolving science,” said Mark F. Kurd, MD, of collecting data on patient-reported outcomes after spine surgery.

“There are many outcome measures out there, and we are still figuring out which ones are valid,” he continued. “But the bottom line is that, as spine surgeons, we need to measure outcomes. If we don’t, someone else is going to measure them.”

Dr. Kurd was one of several orthopaedic surgeons specializing in spine care who participated in “Patient-Reported Outcomes in Spine Surgery,” a symposium presented by the AAOS and the Cervical Spine Research Society at the 2015 AAOS Annual Meeting. Dr. Kurd and his colleagues explored Patient-Reported Outcome Measures (PROMs) as they relate to spine surgery, outlined the benefits and limitations of various commonly used outcome tools, and offered tips for implementing PROMs into clinical practice.

The case for PROMs
According to Kern Singh, MD, spine surgeons should track PROMs to provide evidence that spine surgery often leads to positive outcomes.

“We have to justify what we do, not just to ourselves, but to our payers, or we risk not being reimbursed for the procedures we perform. We treat patients who have quality-of-life issues, as opposed to life-or-death issues, and we’re showing very little in the way of science to validate that we’re improving long-term outcomes associated with these individuals,” he said.

Dr. Singh noted that in 2007, the Centers for Medicare & Medicaid Services stated that lumbar artificial disk replacement (LADR) “is not reasonable and necessary for the Medicare population over 60 years of age,” therefore leading to noncoverage of the LADR procedure for these individuals.

“This decision memo helped spur enactment of patient-measured outcomes in our practices,” said Dr. Singh. “The ultimate conclusion is that we need better evidence to justify our life’s work by demonstrating conclusively that spine care improves health outcomes.”

Spine surgeons may use various measurement tools to evaluate back-related function, generalized well-being, disability in a social role, and patient satisfaction with care.

“Pain is probably the most common symptom to measure,” said Dr. Singh, adding that many spine surgeons use the Visual Analog Scale (VAS). “It’s a numerical scale that’s easy to administer quickly. But the problem is that it is unclear what is actually being measured.

“Does a VAS score indicate a patient’s pain on that particular day or in general?” he wondered aloud. “What about the impact of comorbidities, peripheral nerve entrapment, diabetes, and psychosocial factors on back pain?”

Spine surgeons also use the Short-Form 36 Health Questionnaire (SF-36), the Roland-Morris Disability Questionnaire, the North American Spine Society (NASS) Lumbar Spine Outcome Assessment, and the Oswestry Disability Index (ODI).

General health and outcome measures such as the SF-36 are beneficial because they enable comparisons of patients’ health across different medical conditions and provide a comprehensive view of a patient’s health.

However, general health measures are less responsive to changes in specific conditions, are based more commonly on the lower extremity than the upper extremity, and often do not measure factors related to basic quality of life.

PROMs, said Dr. Singh, “capture data on certain things, but not on others, leaving orthopaedists with a myriad of options.”

Using PROMs
When choosing an outcome measure, orthopaedists should first determine their goals and define the scope of the testing—for example, whether a single surgeon will collect the data or whether the entire practice will take part in collecting the data—according to Alpesh A. Patel, MD, symposium moderator. Other factors to consider are the types of patients from whom to obtain data, the duration of data collection, and how data will be captured, entered, and extracted.

Dr. Patel advised the audience to “periodically check the validity of entered data, follow up on missing data in real time,” and to remember the principle of “quality in, quality out.”

Responsiveness, the ability of a measure to detect changes in a patient’s health status, is an ideal characteristic of PROMs, noted Dr. Singh.

“A patient’s baseline may change as a function of time,” he said, explaining that no currently available tools show a patient in dynamic form. As a result, measurements may not reflect changes in a patient’s pain before treatment, after treatment, and at long-term follow-up.

“The smallest change a patient perceives as beneficial should likely be the baseline for determining how to score an outcome measure,” said Dr. Singh. “But there’s no empirical data for most of the scales to validate that difference.”

Like the ability to capture response shift, reliability (meaning whether a consistent, reproducible result can be achieved) is an important characteristic of an outcome measure, noted Dr. Kurd.

“Validity is a little more confusing,” he said. “Validity relates to whether the outcome measure is assessing what we want it to assess. This is more difficult to establish, particularly when we lack a gold standard.”

Questions to ask related to validity involve whether the measure can predict future events and whether the tool correlates with other clinical measures and outcome tools.

Dr. Kurd also noted that it is important to use measures that provide a utility score—the measure of the preference of an individual, or society, for a given state of health, using the same scale for all health problems—so that the cost-effectiveness of a treatment can be calculated.
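To illustrate why a common utility scale matters, the sketch below shows one conventional way utility scores feed a cost-effectiveness calculation. Utilities are multiplied by time to give quality-adjusted life years (QALYs), and the extra cost of a treatment is divided by the extra QALYs it yields. The sketch assumes the common convention of 0 for death and 1 for full health; the function names and all numbers are hypothetical and were not presented at the symposium.

```python
def qalys(utility: float, years: float) -> float:
    """Quality-adjusted life years: utility (0 = death, 1 = full health) times duration."""
    if not 0.0 <= utility <= 1.0:
        raise ValueError("this sketch assumes utility between 0 and 1")
    return utility * years


def cost_per_qaly_gained(cost_new, cost_old, qalys_new, qalys_old):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    gained = qalys_new - qalys_old
    if gained <= 0:
        raise ValueError("new treatment yields no QALY gain")
    return (cost_new - cost_old) / gained


# Hypothetical numbers for illustration only.
surgery = qalys(utility=0.75, years=4)       # 3.0 QALYs
nonoperative = qalys(utility=0.5, years=4)   # 2.0 QALYs
icer = cost_per_qaly_gained(30_000, 10_000, surgery, nonoperative)
print(icer)  # 20000.0
```

Because the utility scale is the same for all health problems, a figure such as cost per QALY gained can be compared across very different treatments.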

“When we talk about sensitivity and responsiveness of an outcome measure, we have to bring in the concept of floor and ceiling effects,” explained Dr. Kurd. “It is critical to choose the correct outcome measure to assess a particular disease process.

“For example,” he said, “if the scale goes from 1 to 5, and the baseline population has a mean score of 4.75, that’s probably not the right outcome measure for that patient population because there’s very little room for improvement.”
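Dr. Kurd’s example can be made concrete with a small sketch. It computes how much of the scale remains above the baseline mean and what share of patients already sit at the top score. The function names and sample scores are hypothetical.

```python
from statistics import mean


def headroom_fraction(scores, scale_min, scale_max):
    """Share of the scale's range that lies above the mean baseline score."""
    return (scale_max - mean(scores)) / (scale_max - scale_min)


def ceiling_fraction(scores, scale_max):
    """Share of patients who already score the maximum at baseline."""
    return sum(1 for s in scores if s == scale_max) / len(scores)


# Hypothetical baseline scores on a 1-to-5 scale with a mean of 4.75.
baseline = [5, 5, 5, 4]
print(mean(baseline))                      # 4.75
print(headroom_fraction(baseline, 1, 5))   # 0.0625
print(ceiling_fraction(baseline, 5))       # 0.75
```

In this made-up sample, only about 6% of the scale is left for improvement and three of four patients cannot score any higher, which is the ceiling effect he describes.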

The promise of PROMIS
Darrel S. Brodke, MD, noted that outcome measures that assess a specific domain (such as physical function or pain) across various kinds of conditions and diseases are “the future of PROMs.” One such domain-specific outcome measure is the Patient-Reported Outcomes Measurement Information System (PROMIS), funded by the National Institutes of Health (NIH). He estimated that more than $150 million in funding has gone into developing PROMIS.

“PROMIS measures are divided among broad domains—physical, mental, and social health—and, within those broad domains, there are measures for specific domains. In physical health, for example, there is a measure specifically for physical functioning,” he said.

The PROMIS Physical Function measure uses a bank of 124 questions to assess a patient, comparing the patient to mean scores from the U.S. patient population.

Physical functioning is scored as a T-score, normalized to the general population, from 1 to 100, with someone who is bedridden scoring a 1, and someone such as a decathlete scoring 100. The mean physical function of someone in our population would be indicated by a score of 50.
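A T-score is conventionally a standardized score rescaled so that the reference population has a mean of 50 and a standard deviation of 10. The sketch below shows only that rescaling step. The function name and sample values are hypothetical, and actual PROMIS scoring relies on item response models that are not shown here.

```python
def t_score(raw: float, population_mean: float, population_sd: float) -> float:
    """Rescale a score so the reference population has mean 50 and SD 10."""
    if population_sd <= 0:
        raise ValueError("population_sd must be positive")
    z = (raw - population_mean) / population_sd  # standard (z) score
    return 50 + 10 * z


# Hypothetical reference population: mean 20, standard deviation 4.
print(t_score(20, 20, 4))  # 50.0 -> exactly average
print(t_score(28, 20, 4))  # 70.0 -> two SDs above average
print(t_score(14, 20, 4))  # 35.0 -> one and a half SDs below average
```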

“Even better is use of Computerized Adaptive Testing (CAT),” said Dr. Brodke, which provides “the accuracy of the whole bank but only requires a few questions to be answered. PROMIS Physical Function Computer Adaptive Test (PF CAT) is the preferred method of delivery.”

When a patient takes the CAT questionnaire, “he or she answers the first question and is asked a second question based on the answer to the first question, and then a third question appears that is based on the first two answers, and so on,” explained Dr. Brodke. “The computer uses an algorithm that provides a result that is nearly as accurate as if the patient had answered all 124 questions. The algorithm limits the questions to only those that are relevant to the patient, so it only takes a minute to complete the questionnaire.”
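The adaptive idea can be sketched in a few lines, assuming a toy item bank and a simple staircase rule. This is not the PROMIS algorithm, which relies on item response theory; every question, difficulty value, and function name below is hypothetical.

```python
def adaptive_test(bank, can_do, max_items=5, start=50.0, step=16.0):
    """Toy adaptive questionnaire (a simple staircase, not the PROMIS algorithm).

    bank: dict mapping question text to a difficulty on a 0-100 scale.
    can_do: callable that takes a question and returns True or False.
    Returns the final estimate and the list of questions asked.
    """
    estimate = start
    remaining = dict(bank)
    asked = []
    for _ in range(min(max_items, len(bank))):
        # Pick the unanswered question whose difficulty is closest to the estimate.
        question = min(remaining, key=lambda q: abs(remaining[q] - estimate))
        del remaining[question]
        asked.append(question)
        # Move the estimate up or down, then halve the step to narrow in.
        estimate += step if can_do(question) else -step
        step /= 2
    return estimate, asked


# Hypothetical item bank; the difficulties are made up for illustration.
bank = {
    "Can you get out of bed?": 10,
    "Can you walk a block?": 30,
    "Can you climb a flight of stairs?": 50,
    "Can you jog a mile?": 70,
    "Can you run five miles?": 90,
}
true_ability = 62  # simulated patient: answers yes when ability >= difficulty
estimate, asked = adaptive_test(bank, lambda q: true_ability >= bank[q], max_items=3)
print(estimate)  # 62.0
print(asked)
```

Each answer determines which question comes next, so the simulated patient is asked three of the five questions and never sees the easiest or the hardest one.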

According to Dr. Brodke, PROMIS PF CAT will likely be one of several measures used increasingly in spine care to assess outcomes, given its many benefits, including ease of collecting, storing, and using data.

“PROMIS is the 21st century way to obtain patient-reported outcome data, as the focus in medicine continues to shift from volume to value,” he added.

The presenters’ disclosure information, including potential conflicts of interest, can be viewed at www.aaos.org/disclosure

Jennie McKee is a senior science writer for AAOS Now. She can be reached at mckee@aaos.org

Bottom Line

  • Patient-reported outcome measures (PROMs) can be used to justify orthopaedic surgery to payers to ensure continuing reimbursement for procedures focused on quality-of-life issues.
  • PROMs should cover both general health and specific conditions and should provide a utility score that enables the calculation of the cost-effectiveness of a treatment.
  • Instruments such as the PROMIS Physical Function Computer Adaptive Test (PF CAT), which uses a computer algorithm to limit questions to only those relevant to the patient, are efficient and will likely be increasingly used in spine care to measure outcomes.