The value-based care movement has fueled the push to better measure healthcare quality. Appropriate measurement using risk adjustment (RA) models is essential to improving the quality of care. RA allows hospitals to account for the severity of their patients’ conditions when comparing quality performance across populations and peer facilities.
While many Quality Administrators have a strong understanding of the mechanics of RA and its implications for quality measurement, others perceive RA as a mysterious process and may misinterpret its value and use.
While the underlying statistics of RA involve a learning curve, acquiring a basic understanding of the topic is relatively simple and pays dividends when interpreting real-world results.
Let’s start with the rationale behind RA as a necessary part of a quality measurement program. To measure variation in quality (the difference between observed and expected performance), a benchmark is needed to judge whether the number of observed adverse events is higher or lower than what would be expected. Each healthcare facility has a unique mix of patients with a range of characteristics, such as diagnoses, age, sex and socioeconomic factors. Those characteristics help estimate each patient’s risk of a particular outcome, allowing analysts to assess the care provided in light of what is known about similar patients in existing data. A comprehensive dataset is important to successful RA, as it allows each patient to be assigned a risk of a particular outcome based on the RA methodology.
It is natural to want to select a custom peer group for benchmarking, made up of facilities with similar patients; however, a comparator that relies on reviewing observed rates across a curated set of facilities is not viable, especially when evaluating clinical subpopulations or individual physicians. Gaps in patient mix between peer facilities grow as observed rates are examined at different levels, such as disease group and service line.
Gaps in patient mix between peers drive the need for a more robust comparator for quality measures. RA methods are designed to mitigate these challenges by producing an expectation of quality given the unique clinical and demographic characteristics of the patient population. By controlling for those patient characteristics, deviations between observed and expected performance can be used to uncover systematic variations in care quality.
Not all quality measures should be risk adjusted. For example, certain evidence-based process measures are always expected to be followed, so RA is not needed. But when the likelihood of outcomes such as mortality, readmissions and complications fluctuates with the types of patients being cared for, RA is a necessary measurement tool.
There are a variety of industry RA models, many of which are used within regulatory incentive and public reporting programs. For example, the AHRQ Patient Safety Indicators; NHSN healthcare-associated infection measures; and Yale CORE 30-day mortality, readmission and complication measures are used within programs such as the Hospital Value-Based Purchasing Program, the Hospital-Acquired Condition Reduction Program, the Hospital Readmissions Reduction Program and the CMS Overall Star Rating Program. Other quality improvement-focused RA methods are available as well. For example, the CareScience® Analytics model maintained by Premier is used by more than 1,300 facilities and is included in programs such as the Premier QUEST® collaborative and in PINC AI™ Quality Enterprise solutions.
You may now ask: How does an RA model produce an expected value? RA models designed for the hospital setting typically produce an expected value for each inpatient stay. For binary outcomes such as mortality, readmissions and complications, the expected value is a probability of the outcome between 0 and 1. For continuous outcomes, such as length of stay, costs and charges, the expected value is continuous. In each scenario, the expected value should represent the cumulative risk based on the clinical and demographic characteristics of the patient. Often, RA models are based on a form of multiple regression, where risk is identified through the linear relationship between a set of patient characteristics and the outcome of interest.
For example, as a patient’s age increases, the likelihood of a readmission is expected to increase, while a diabetic patient might carry a heightened risk of a complication. Regression models measure the increased or decreased risk associated with each individual patient characteristic to produce a cumulative degree of risk in the form of an expected value. Together, those expected values represent the unique risk of the evaluated patient population and serve as a tailored benchmark against which observed performance can be compared.
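To make this concrete, below is a minimal sketch of the general idea in Python. The data, features and model are purely hypothetical; this is not the CareScience, AHRQ, NHSN or Yale CORE methodology, only an illustration of how a regression model can turn patient characteristics into per-patient expected probabilities.

```python
# Minimal, illustrative sketch of risk adjustment for a binary outcome.
# The data and features below are hypothetical; real RA models use far
# richer clinical and demographic inputs and rigorous validation.
import numpy as np
from sklearn.linear_model import LogisticRegression

# One row per inpatient stay: age (years), diabetes flag (0/1), severity score.
X = np.array([
    [54, 0, 2],
    [71, 1, 4],
    [63, 0, 3],
    [80, 1, 5],
    [45, 0, 1],
])
y = np.array([0, 1, 0, 1, 0])  # observed readmission (1 = yes, 0 = no)

model = LogisticRegression(max_iter=1000).fit(X, y)

# Each stay's expected value is the model's predicted probability of the
# outcome: the cumulative risk implied by that patient's characteristics.
expected = model.predict_proba(X)[:, 1]
print(expected)  # one value between 0 and 1 per patient
```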
The resulting observed and expected values can be expressed as a standardized ratio (observed/expected), such that a ratio less than 1 indicates performance better than expected, while a ratio greater than 1 indicates performance worse than expected. Performance can also be measured as “excess events,” defined as the raw difference between observed and expected events.
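As a quick, purely illustrative calculation (the counts here are made up):

```python
# Illustrative numbers only: 42 observed events against an expected 50.
observed = 42                        # observed adverse events
expected = 50.0                      # sum of per-patient expected probabilities

oe_ratio = observed / expected       # 0.84: performance better than expected
excess_events = observed - expected  # -8.0: eight fewer events than expected
print(oe_ratio, excess_events)
```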
While these types of models are designed to be population-based, the benefit of patient-level expected values is that both observed and expected performance can be aggregated to various levels. The observed/expected ratio or excess events can be measured at any level of interest (like service line, Medicare Severity Diagnosis Related Group [MS-DRG], physician, facility or system), provided there are enough patients at that level to make the comparison statistically meaningful.
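Because expected values exist at the patient level, rolling them up is a simple aggregation. The sketch below uses hypothetical column names and physician as the level of interest:

```python
# Sketch of rolling patient-level observed/expected values up to a chosen
# level of interest (here, physician). Column names are hypothetical.
import pandas as pd

stays = pd.DataFrame({
    "physician": ["A", "A", "B", "B", "B"],
    "observed": [1, 0, 0, 1, 1],            # outcome occurred (1) or not (0)
    "expected": [0.6, 0.3, 0.2, 0.4, 0.5],  # model-predicted probability
})

by_physician = stays.groupby("physician")[["observed", "expected"]].sum()
by_physician["oe_ratio"] = by_physician["observed"] / by_physician["expected"]
by_physician["excess"] = by_physician["observed"] - by_physician["expected"]
print(by_physician)
```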
Once the resulting risk-adjusted performance metric is obtained, a natural question is: At what point is there a meaningful gap in performance? Seemingly small differences between observed and expected values can be quite meaningful, while large differences in another measure might be insignificant. Whether a deviation between observed and expected events is significant depends on the size of the evaluated population and the width of the data distribution itself (i.e., the natural error).
Statistical significance can be expressed in a variety of ways, but in each case it quantifies the confidence that the observed and expected values are truly different, while accounting for the natural error in the data and the predictive models. It is important to evaluate quality in the context of statistical significance to ensure that variation is meaningful. To do this, it can be helpful to report multiple confidence levels (e.g., 75 percent, 95 percent, 99 percent) so that analysts can identify variation that is more likely to be a true difference.
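One common approach, assumed here for illustration rather than drawn from any specific program’s methodology, is to treat the observed event count as Poisson-distributed with a mean equal to the expected count and ask how surprising the observed count is:

```python
# Illustrative significance test for an observed vs. expected event count.
# Assumption: the observed count is approximately Poisson with mean equal
# to the expected count. Real programs may use different methods.
from scipy.stats import poisson

observed, expected = 42, 50.0

# Two-sided exact Poisson p-value.
p_low = poisson.cdf(observed, expected)      # P(X <= observed)
p_high = poisson.sf(observed - 1, expected)  # P(X >= observed)
p_value = min(1.0, 2 * min(p_low, p_high))

# Check the deviation against several confidence levels, as described above.
for conf in (0.75, 0.95, 0.99):
    flag = "significant" if p_value < (1 - conf) else "not significant"
    print(f"At {conf:.0%} confidence: {flag} (p = {p_value:.3f})")
```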
In short, RA provides a tailored performance expectation based on the unique nature of the evaluated patient population. RA measures are frequently coupled with a measure of statistical significance to determine whether quality variation is meaningful and beyond normal variation in the data. Together, these tools are a crucial part of a continuous quality improvement program and can help focus resources on implementing improvement processes and procedures that will maximize the benefit to the patient.