AHOSKIE, N.C. — The railroad tracks cut through Weyling White’s boyhood backyard like an invisible fence. He would play there on sweltering afternoons, stacking rocks along the rails under the watch of his grandfather, who established a firm rule: Weyling wasn’t to cross the right of way into the white part of town.
The other side had nicer homes and parks, all the medical offices, and the town’s only hospital. As a consequence, White said, his family mostly got by without regular care, relying on home remedies and the healing hands of the Baptist church. “There were no health care resources whatsoever,” said White, 34. “You would see tons of worse health outcomes for people on those streets.”
The hard lines of segregation have faded in Ahoskie, a town of 5,000 people in the northeastern corner of the state. But in health care, a new force is redrawing those barriers: algorithms that blindly soak up and perpetuate historical imbalances in access to medical resources.
A STAT investigation found that a common method of using analytics software to target medical services to patients who need them most is infusing racial bias into decision-making about who should receive stepped-up care. While a study published last year documented bias in the use of an algorithm in one health system, STAT found the problems arise from multiple algorithms used in hospitals across the country. The bias is not intentional, but it reinforces deeply rooted inequities in the American health care system, effectively walling off low-income Black and Hispanic patients from services that less sick white patients routinely receive.
These algorithms are running in the background of most Americans’ interaction with the health care system. They sift data on patients’ medical problems, prior health costs, medication use, lab results, and other information to predict how much their care will cost in the future and inform decisions such as whether they should get extra doctor visits or other support to manage their illnesses at home. The trouble is, these data reflect long-standing racial disparities in access to care, insurance coverage, and use of services, leading the algorithms to systematically overlook the needs of people of color in ways that insurers and providers may fail to recognize.
“Nobody says, ‘Hey, understand that Blacks have historically used health care in different patterns, in different ways than whites, and therefore are much less likely to be identified by our algorithm,’” said Christine Vogeli, director of population health evaluation and research at Mass General Brigham Healthcare in Massachusetts, and co-author of the study that found racial bias in the use of an algorithm developed by health services giant Optum.
The bias can produce huge differences in assessing patients’ need for special care to manage conditions such as hypertension, diabetes, or mental illness: In one case examined by STAT, the algorithm scored a white patient four times higher than a Black patient with very similar health problems, giving the white patient priority for services. In a health care system with limited resources, a variance that big often means the difference between getting preventive care and going it alone.
There are at least a half dozen other commonly used analytics products that predict costs in a similar way as Optum’s does. The bias results from the use of this entire generation of cost-prediction software to guide decisions about which patients with chronic illnesses should get extra help to keep them out of the hospital. Data on medical spending is used as a proxy for health need — ignoring the fact that people of color who have heart failure or diabetes tend to get fewer checkups and tests to manage their conditions, causing their costs to be a poor indicator of their health status.
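The proxy problem described above can be sketched in a few lines of Python. This is a hypothetical illustration, not any vendor's actual model: the function, its coefficients, and the patients are invented. The point is structural — if recorded spending dominates the prediction, a patient whose access barriers suppressed past spending is scored as lower-need even at an identical documented illness burden.

```python
# Toy cost predictor: past spending dominates the forecast.
# Illustrative coefficients only -- not from any real product.
def predicted_cost(chronic_conditions: int, past_annual_spend: float) -> float:
    """Predict next year's cost from illness count and prior spending."""
    return 0.8 * past_annual_spend + 500 * chronic_conditions

# Two patients with the same illness burden (4 chronic conditions), but
# different recorded costs because one faced barriers to getting care.
patient_a = predicted_cost(chronic_conditions=4, past_annual_spend=9000.0)
patient_b = predicted_cost(chronic_conditions=4, past_annual_spend=3000.0)

# Ranking by predicted cost puts patient A far ahead of patient B, even
# though their documented illness burdens are identical: lower recorded
# spending reads as lower "need."
print(patient_a)  # 9200.0
print(patient_b)  # 4400.0
```

Because cost is the label being predicted, the model can be perfectly accurate by its own standard and still rank the under-served patient last.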
No two of these software systems are designed exactly alike. They primarily use statistical methods to analyze data and make predictions about costs and use of resources. But many software makers are also experimenting with machine learning, a type of artificial intelligence whose increasing use could perpetuate these racial biases on a massive scale. The automated learning process in such systems makes them particularly vulnerable to recirculating bias embedded in the underlying data.
Race, however, is entirely absent from the discussion about how these products are applied. None of the developers of the most widely used software systems warns users about the risk of racial disparities. Their product descriptions specifically emphasize that their algorithms can help target resources to the neediest patients and help reduce expensive medical episodes before they happen. Facing increasing pressure to manage costs and avoid government penalties for readmitting too many patients to hospitals, providers have adopted these products for exactly that purpose, and failed to fully examine the impact of their use on marginalized populations, data science experts said.
The result is the recycling of racism into a form that is less overt but no less consequential in the lives of patients who find themselves without adequate care at crucial moments, when access to preventive services or a specialist might have staved off a serious illness, or even death.
The failure to equitably allocate resources meant to avert health crises is evident in a large body of research. One study found Black patients in traditional Medicare are 33% more likely to be readmitted to hospitals following surgery than white patients; they are also more frequently re-hospitalized for complications of diabetes and heart failure, and are significantly less likely to get referred to specialists for heart failure treatment. These disparities would be mitigated if analytics software were really identifying the neediest patients.
The fallacy of these tools can be seen in places like Ahoskie, an agricultural community framed by fields of soybeans and cotton whose high rates of poverty and unemployment put health care out of reach for many residents. Large segments of its majority-Black population don’t have regular primary care doctors, so the health system lacks the data on their medical problems and prior treatment needed to accurately assess their needs or compare them to other patients.
White said he did not start getting regular doctor visits until his late 20s, in part because his family was distrustful of local health care providers and defaulted to using the emergency department for any significant problems. As he grew older, he learned of a history of chronic illnesses in his family that had gone untreated, including his own high blood pressure.
“A lot of my family members struggled,” he said. “My aunt was in a wheelchair and I didn’t realize until I was older that it was because she had suffered a stroke. Many family members suffered from diabetes and hypertension. It was rampant on my mother’s side.”
White, a father of three, said he has dedicated himself to improving the health of his family and the broader community. Last fall, he was elected mayor of Ahoskie, and he works a day job as practice administrator of the community health center, where he monitors productivity and manages daily operations. The health center collects data on patients’ social challenges, such as food and housing insecurity, to help counteract nonmedical problems that contribute to poor health outcomes.
While those data are hard to collect and are not consistently factored into cost prediction algorithms, White said they weigh heavily on the use of health care services in Ahoskie. “We see people coming in sweaty and out of breath because they had to walk here from Ward B,” he said, referring to the historically Black section of town. “Those [disparities] are definitely apparent. People here are extremely sick, and it’s because of chronic illnesses that are out of control.”
Algorithms used to assess patients’ health needs churn in the back offices of health systems nationwide, out of view of patients who are not privy to their predictions or how they are being applied. But a recent study of software built by Optum offered a rare look under the hood.
In crunching the data, researchers found, the software was introducing bias when patients’ health costs were used as a filter for identifying those who would benefit from extra care. STAT obtained previously undisclosed details from the researchers that show how the system’s cost predictions favored white patients.
In one example, a 61-year-old Black woman and 58-year-old white woman had a similar list of health problems — including kidney disease, diabetes, obesity, and heart failure. But the white patient was given a risk score that was four times higher, making her more likely to receive additional services.
In another, a 32-year-old white man with anxiety and hypothyroidism was given the same risk score as a 70-year-old Black man with a longer list of more severe health problems, including dementia, kidney and heart failure, chronic lung disease, high blood pressure, and a prior stroke and heart attack.
Certain details of those cases were altered to protect the privacy of the patients. But authors of the study said they are emblematic of bias that arose from the use of Optum’s cost predictions to measure patients’ health needs. Of the patients it targeted for stepped-up care, only 18% were Black, compared to 82% white. When they revised the algorithm to predict the risk of illnesses instead of cost, the percentage of Black patients more than doubled, to 46%, while the percentage of white patients dropped by a commensurate amount.
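The researchers' label swap can be illustrated with a toy example (hypothetical patients and numbers; only the idea — rank by a health-based label instead of a cost-based one — comes from the study): sorting the same cohort by active chronic conditions rather than predicted cost selects a different set of patients for extra care.

```python
# Hypothetical cohort: same four patients, two ranking rules.
patients = [
    # (id, chronic_conditions, predicted_annual_cost)
    ("P1", 5, 4000),  # high illness burden, low recorded cost
    ("P2", 2, 9000),  # lower burden, high cost from frequent utilization
    ("P3", 4, 3500),  # high burden, low recorded cost
    ("P4", 1, 8000),  # low burden, high cost
]

# Select the top two patients under each label.
top2_by_cost = sorted(patients, key=lambda p: -p[2])[:2]
top2_by_health = sorted(patients, key=lambda p: -p[1])[:2]

print([p[0] for p in top2_by_cost])    # ['P2', 'P4']
print([p[0] for p in top2_by_health])  # ['P1', 'P3']
```

Under the cost label, the two sickest patients are passed over entirely; under the health label, they are the first selected — the same reversal, in miniature, that moved the share of Black patients from 18% to 46% in the study.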
The problem wasn’t that the algorithm was inaccurate. Viewed from an actuarial standpoint, it performed exactly as intended and predicted costs reliably. The bias arose because the care of Black patients costs $1,800 per year less on average than the care of white patients with the same number of chronic illnesses. In essence, the algorithm was hampered by a very human flaw — that it was blind to the experience of being a Black patient in a health care system where people of color face racism, tend to be lower-income, and have less insurance coverage and fewer providers in their neighborhoods.
“The core of this bias is that people who need health care for whatever reason don’t get it, which means their costs don’t get recorded, they don’t make it into the system,” said Ziad Obermeyer, a physician and professor at the University of California, Berkeley, and co-author of the study. “These small technical choices make the difference between an algorithm that reinforces structural biases in our society, and one that fights against them and gives resources to the people who need them.”
The study did not identify Optum as the maker of the analytics software; the company’s role was later revealed in a story published by the Washington Post.
Executives at Optum have begun to aggressively push back against the study’s findings. Initially, the company issued a statement expressing appreciation for the researchers’ work, while pointing out that the company’s product — called Impact Pro — collects and analyzes hundreds of data points on patients that can be used to provide a clearer picture of their health needs.
But in recent months, as the study attracted more media coverage amid protests against racism, the company’s tone shifted: “The algorithm is not racially biased,” a company spokesman, Tyler Mason, wrote in a statement emailed to STAT in July. “The study in question mischaracterized a cost prediction algorithm used in a clinical analytics tool based on one health system’s incorrect use of it, which was inconsistent with any recommended use of the tool.”
Optum executives also said they do not plan to make any changes to the product, because they believe the 1,700 measures embedded in it provide enough information to eliminate bias that arises from isolated use of the cost-prediction algorithm.
The study was conducted based on the use of Impact Pro by Mass General Brigham, a health system affiliated with Harvard University. The health system was using the tool to help identify patients who would benefit from referral to programs designed to avert costly medical episodes by delivering more proactive care.
Optum advertises the product’s use for that purpose. A prospectus posted on its website says the software can “flag individuals for intervention using Impact Pro’s predictive modeling technology … and identify individuals with upcoming evidence-based medicine gaps in care for proactive engagement.”
A sample profile of a patient with diabetes is included in the document, which contains information on diagnoses and prior care, as well as specific treatment gaps, such as the lack of a physician visit in the prior six months and no evidence of testing for liver function or lipid levels. A summary at the top of the document highlights the patient’s prior annual costs as well as predicted future costs and probability of hospitalization.
Mass General Brigham was using the cost information as a basis for screening patients for intervention programs: Those in the top 97th percentile were automatically referred, while patients in the 55th percentile and above were referred to primary care physicians, who were supplied additional data about the patient and prompted to consider whether they would benefit from more intensive services.
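The two-tier screening rule described above can be sketched in Python. Only the two percentile thresholds come from the reporting; the scoring, the cohort, and the function names are hypothetical.

```python
def percentile_rank(scores, score):
    """Percentile of `score` within a cohort: share of scores strictly below it."""
    below = sum(1 for s in scores if s < score)
    return 100.0 * below / len(scores)

def triage(scores, score):
    """Two-tier rule: top 3% auto-referred, 55th percentile and up flagged
    for primary care review, everyone else passed over."""
    pct = percentile_rank(scores, score)
    if pct >= 97:
        return "auto-refer to care management"
    if pct >= 55:
        return "flag for primary care review"
    return "no action"

cohort = list(range(100))  # toy cohort of risk scores 0..99
print(triage(cohort, 99))  # auto-refer to care management
print(triage(cohort, 60))  # flag for primary care review
print(triage(cohort, 10))  # no action
```

Because the score being ranked is a cost prediction, any group whose costs are systematically suppressed sits lower in the percentile ordering and falls below both thresholds more often.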
Vogeli, the health system’s director of evaluation and research for population health, said several factors were considered in addition to cost, including prior hospitalizations and whether a patient had shown up for appointments and taken medicines as prescribed. She said she doubts any providers are making referral decisions solely based on cost, but that cost-based risk scores are core features of these products that introduce money — and consequently race — into deliberations where they shouldn’t hold sway.
“The risk score isn’t telling you the [health] risk of the population,” Vogeli said. “It’s telling you how expensive that population is going to be, and that’s related to who they are, what their background is, what their economic situation is, and unfortunately, what the color of their skin is.”
STAT found that none of the top products for analyzing patient populations explicitly warn users that racial differences in access to care could skew their referral decisions. Instead, their online brochures promise to accurately identify high-cost patients who could benefit from more proactive care.
In the years following passage of the Affordable Care Act in 2010, their pitches found a newly receptive audience. The law prevented insurers from using data on costs to deny coverage to people with preexisting conditions. But it created incentives for health providers to identify and intervene in the care of high-cost patients, through new arrangements that shared financial responsibility for runaway medical expenses between insurers and hospitals.
By 2019, these algorithms were being used in the care of more than 200 million Americans — essentially applying an actuarial concept of risk to decisions about who might benefit from additional doctor visits or outreach to help manage their blood pressure or depression.
Even as uptake increased, most providers were still learning how to apply the software in clinical settings and eliminate bias. In many cases, key data are missing from records or are simply not collected, leaving the providers with varying levels of information on patients beyond the core data on their costs, said Jonathan Weiner, co-developer and scientific director of the Johns Hopkins ACG predictive modeling system, another product widely used by hospitals to predict costs and health needs of patients.
He described the Science paper as a “wake-up call” and said it prompted ACG to audit its algorithms and consider updates to training materials to inform users about the potential for racial bias. “The bias is embedded in the data that we analyze,” Weiner said. “How do you solve that? You get new data, improve the data we have, and you modify your conceptual framework.”
Weiner added that efforts to apply machine learning in these systems raise particular concerns, because of the lack of granularity in the data. “In radiology, AI will move more rapidly, but that’s because everything has been computerized to the micro pixel,” he said. “But your medical record and my medical record and the medical records of a million people in inner-city Baltimore have not been pixelated to that level yet.”
One software maker whose products were referenced in the study said its tools are configured in a way that prevents racial bias from arising. L. Gordon Moore, a physician and senior director of clinical strategy and value-based care at 3M, said the company’s analytics software groups patients based on illnesses reported in insurance claims data to help predict costs, and helps providers identify patients who may be disconnected from care. He said the company did not audit its algorithms in response to the study, though he acknowledged the paper issued a crucial warning to the industry.
“The Science paper did a really important thing, which is to call out the absolute imperative to understand bias as we’re doing this work,” Moore said. “We’re very thoughtful about separating cost from what we’re saying about a person’s place on a hierarchy of burden of illness.”
Other software makers said the study has prompted them to take a close look at their products.
“We immediately started examining our models to see if we could determine if there were problems that would need to be corrected,” said Hans Leida, a principal and consulting actuary at Milliman, which makes analytics software called Advanced Risk Adjusters.
He said that auditing its algorithms for racial bias was challenging, however, because of gaps in the underlying data: “A lot of the data that is available that includes health insurance claims and demographic information doesn’t include racial indicators,” Leida said.
He added, however, that the company did not find evidence that care management referrals were biased when it examined a Medicare dataset that included information on race. Leida suggested bias may have been mitigated in the product because its algorithm does not rely on historic costs incurred by patients to predict future costs.
He said the company, and the industry as a whole, is beginning to try to address inherent racial biases in the data by collecting information on patients’ barriers to getting care, such as a lack of transportation or housing insecurity. But such information is time-consuming and costly to gather, and isn’t reported in a standard format.
“Until there is a real demand from society to collect that data, we may not see it in a usable and consistent way across the board,” he said.
If algorithms cannot yet clearly see racial disparities in health care access, they are impossible for Black residents of Ahoskie to miss. The hospital was once segregated into units that served white and Black patients, a common practice during the Jim Crow era that continued here until the 1970s.
The neighborhood around it includes homes with manicured lawns and a large recreational complex with a running trail, soccer and baseball fields, a dog park, and basketball courts with fresh pavement and fiberglass backboards.
It contrasts sharply with streets just a couple miles away, where ripped basketball nets dangle from the rims in a park devoid of children. Instead of a wellness center, there are three tobacco shops — two on the same block — and rows of ranch and shotgun houses are interrupted by rotting structures and overgrown lots. There are no walking trails or stores selling fresh food.
These disparities are also reflected in higher mortality rates and higher rates of chronic disease among Black residents, who comprise about 66% of the community’s population. State data show that Black people in North Carolina are 2.3 times more likely to die from diabetes and kidney disease than white people, 1.4 times more likely to die from stroke, and 9.4 times more likely to die from HIV infection.
“We’ve always known in our area the impact of structural racism and bias,” said Kim Schwartz, chief executive of the Roanoke Chowan Community Health Center, a federally qualified health clinic in Ahoskie. “The system isn’t broken. It was set up this way.”
The health center serves a population of about 20,000 patients. When its staff typed in the federal government’s formula for identifying patients at the highest risk of Covid-19, its software system flagged 17,000 of those patients. The formula uses criteria that would pick out discrete segments of most communities: people of color with chronic health conditions who hold jobs that do not allow them to work from home during the pandemic.
In Ahoskie and its surrounding communities, that is not a small subgroup. It’s a large chunk of the population.
Schwartz said the health center has been collecting data on social determinants — factors such as housing status and access to food and transportation — for more than five years, but it is still hard to see a future in which that information is incorporated into algorithms and decision-making processes. Unlike the population health analytics used by large health systems, the social determinants data are not reported in a standard format across providers or tied to any services that could help these patients in North Carolina.
State health officials proposed a referral system based on social determinants data in the Medicaid program, but the policies were never enacted. “All that’s out the window because the budget didn’t get approved, Medicaid transformation didn’t happen, and then Covid hit,” Schwartz said. As a result, she added, the state’s effort to counteract disparities “is a house of cards, and it’s never been anything but a house of cards.”
Meanwhile, residents struggle to get care in a place where bad experiences remain at the front of their minds.
Deborah Morrison, 59, still seethes at the memory of her mother’s death more than a decade ago. Her mother, a 74-year-old former pastor, had begun to suffer from extreme fatigue and shortness of breath, symptoms uncharacteristic of a woman who did yoga and had always exercised.
Morrison said her mother’s primary care doctor diagnosed her with depression, prescribed her medication, and sent her home without referring her to a cardiologist. “She knew it wasn’t right,” Morrison said of her mother. “She knew something was going on in her body. She kept saying to me, ‘I’m not depressed. I know I’m not depressed. My legs are swelling. But I know I’m not depressed.’”
Morrison said her mother did not know where to go for a second opinion, but friends convinced her to visit another clinic in a neighboring county. A physician assistant there told her she needed to see a cardiologist, but that no appointments were available for three weeks.
Morrison, who herself was undergoing treatment for breast cancer at the time, intervened. She pushed to get her mother an appointment a week later in Greenville, a city 90 miles away. The news was terrible: Her mother had undiagnosed congestive heart failure. “The doctor told her, ‘You could go at any time,’” Morrison recalled. “‘We cannot put a stent in, we cannot operate, we cannot do anything — your heart is operating at 30 percent.’”
She collapsed and died about one month later, on Deborah’s last day of radiation treatment.
“We lost precious time with her doctor not referring her immediately to a cardiologist,” Morrison said. “Every time I think about it, it makes me angry.”
This is part of a yearlong series of articles exploring the use of artificial intelligence in health care that is partly funded by a grant from the Commonwealth Fund.