Can Social Determinants of Health Predict Your Patients’ Futures?

By: Tim Suther, Nicole Hobbs, Jeff McGinn, and Matt Turner with Change Healthcare; John Halamka, MD, MS, president of Mayo Clinic Platform; and Paul Cerrato, senior research analyst and communications specialist, Mayo Clinic Platform


Learn how better health outcomes require a better understanding of life outside the doctor’s office.

The evidence is mixed but suggests that these overlooked variables have a profound impact on each patient’s journey

By one estimate, social determinants of health (SDoH) influence up to 80% of health outcomes. Although estimates like this suggest that social factors have a major impact, thought leaders continue to debate whether they can also improve the accuracy of predictive models. Resolving that debate is far from simple because the answer depends on the type, source, and quality of the data, and on the design of the model under consideration.

In general, we derive SDoH from subjective and objective sources. Subjective data includes self-reported or clinician-collected data such as patient-reported outcomes, Z codes from ICD-10-CM that report factors that influence health status and interactions with health service providers, and other unstructured EHR data. Objective data encompasses individual-level and community-level data from government, public, and private (including consumer behavior) sources, which is usually more structured and often derived from national-level datasets.

Unfortunately, research findings on the value of SDoH in predictive models vary widely. Some studies report no appreciable difference when SDoH are added to models, while others report significant gains in predictive power. Unsurprisingly, these differences depend in part on how heavily a study relies on traditional clinical models and, most importantly, on the types and sources of SDoH data it uses.

For example, a group from Johns Hopkins Bloomberg School of Public Health demonstrated that SDoH predictive models can fail in part because of model design and in part because EHR-level data is unstructured and collected inconsistently. They also demonstrated that relying on EHR-derived population health databases for SDoH can be problematic because that data tends to be used as a proxy for individual-level social factors, and such proxies are often based on assumptions, not evidence. Other research supports these findings and highlights the challenges of using SDoH data from sources that have traditionally struggled to collect and standardize these data types comprehensively.

On a more positive note, several studies and healthcare articles have reported success by relying on objectively collected and/or highly structured and consistent data. For example, one study that used EHR-derived SDoH data sources found that the addition of structured data on median income, unemployment rate, and education from trustworthy non-EHR sources enhanced their model’s health prediction granularity for some of the most vulnerable subgroups of patients. In another study, a collaboration among Stanford, Harvard, and Imperial College London found that adding structured SDoH data from the U.S. Census, along with using machine-learning techniques, improved risk prediction model accuracy for hospitalization, death, and costs. They also showed that their models based on SDoH alone, as well as those based on clinical comorbidities alone, could predict health outcomes and costs. Similarly, researchers at The Ohio State University College of Medicine added community-level and consumer behavior data not available in standard EHR data and found it enhanced the study of, and impact on, obesity prevention. Juhn et al. at Mayo Clinic tapped telephone survey data and appended housing and neighborhood characteristic data from local government sources to create a socioeconomic status index (HOUSES). They first showed that HOUSES correlated well with outcome measures and later showed that HOUSES could even serve as a predictive tool for graft failure in patients.

Patient-Level SDoH + Clinical Data = Predictive Power

Incorporating social factors into the healthcare equation can fill information gaps at the point of care. It also generates better healthcare predictions, but only when these determinants are patient-level and linked to robust clinical data. Change Healthcare, for example, has curated such an integrated national-level dataset, linking billions of historical, de-identified, distinct medical claims with patient-level social, physical, and behavioral determinants of health. One of this dataset’s most important uses is to understand the relative weight of specific patient SDoH factors, in comparison to clinical factors alone, for various therapeutic conditions, including COVID-19. For example, across Change Healthcare’s research, economic stability repeatedly ranks as the highest or among the highest predictors of the healthcare experience. Despite this finding, most end users, including providers and payers, lack such visibility (or rely on geographic averages that are unhelpful in building accurate predictive models).

Incorporating SDoH data into predictive models holds much promise. Given the relative newness of SDoH data in predictive analytics, along with a lack of data standardization and scale, it’s not surprising to find varying degrees of success in using it to improve predictive health models. But as researchers learn more about the best types and sources of SDoH data to use, and develop models better suited to these types of data, we’re likely to see significant advances in healthcare predictive models. When the right data is combined with the right models, SDoH can be a powerful asset in predicting health, outcomes, and potential health disparities.
