Inflatable colon showing different types of colon conditions. Photo by A Healthier Michigan (CC BY-SA 2.0)
Colorectal cancer is one of the leading causes of cancer-related deaths in the United States, and preventive screening is a critical tool for early detection. Doctors often use risk prediction tools to identify people at high risk for colorectal cancer. Some of these algorithms include race as a variable, which has sparked intense debate in recent years due to concerns that it could exacerbate health disparities.
Now, a new study suggests that race variables could help reduce health disparities under certain conditions. Anna Zink, a researcher who studies machine learning and health at the University of Chicago, and her colleagues focused on a familiar aspect of algorithm construction that is often omitted in discussions of race: the quality of the underlying data.
Age, sex, medical history, and family history of cancer are all important in determining cancer risk. However, black populations are more likely to have incomplete or uncertain family history data due to historical disparities in access to health care. Including a race variable can compensate for this gap in data quality, the researchers found. Zink and colleagues tested the effect in a calculator they created to predict colorectal cancer risk.
New research findings
Zink and colleagues used data from 77,836 adults who participated in the Southern Community Cohort Study and who were cancer-free at the start of the study. About two-thirds of the participants identified as black, and the rest identified as white. Black participants were more likely to have an unknown family history of cancer and were less likely to report a positive family history. However, black participants also experienced higher cancer rates during the study period, suggesting that their family history data were less complete.
Using the NIH colon cancer risk calculator as a basis, the researchers built two models, one with and one without race as a variable. They fit the models on about half of the data and tested the resulting equations on the other half.
When they tested both, the race-adjusted equation was able to predict cancer risk for black participants more accurately than an equation that didn't take race into account. “If you remove the race variable, you can't predict the colorectal cancer outcome for participants,” Zink said.
In other words, excluding race would result in fewer black participants being identified as high risk. The calculator developed in this study was just an experiment. But “if one of these risk calculators were used to determine who should be recommended for testing, fewer black participants would be recommended for testing,” Zink said.
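The mechanism described above can be illustrated with a small simulation. The Python sketch below is not the researchers' model: it uses entirely synthetic data and a simple lookup table of event rates rather than a regression based on the NIH calculator. The group labels, the prevalence and risk figures, and the function names `simulate`, `fit_rates`, and `calibration_gap` are all assumptions made for illustration. It assumes both groups have the same true risk, but that family history is recorded far less reliably for group B, then compares predicted and observed risk for group B with and without a group variable.

```python
import random
from collections import defaultdict

def simulate(n, seed=0):
    """Build a synthetic cohort. Every number here is invented for illustration."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        group = "A" if rng.random() < 0.5 else "B"
        true_fh = rng.random() < 0.30              # true family history, same in both groups
        p_recorded = 0.95 if group == "A" else 0.30  # assumption: group B's history is often missed
        recorded = "yes" if (true_fh and rng.random() < p_recorded) else "no_or_unknown"
        cancer = rng.random() < (0.10 if true_fh else 0.02)
        rows.append((group, recorded, cancer))
    return rows

def fit_rates(rows, use_group):
    """'Train' a toy risk calculator: the observed event rate in each cell."""
    counts = defaultdict(lambda: [0, 0])  # key -> [events, total]
    for group, recorded, cancer in rows:
        key = (group, recorded) if use_group else recorded
        counts[key][0] += cancer
        counts[key][1] += 1
    return {key: events / total for key, (events, total) in counts.items()}

def calibration_gap(rates, rows, use_group, target_group):
    """Observed event rate minus mean predicted risk within one group."""
    subset = [r for r in rows if r[0] == target_group]
    predicted = sum(rates[(g, f) if use_group else f] for g, f, _ in subset) / len(subset)
    observed = sum(c for _, _, c in subset) / len(subset)
    return observed - predicted

rows = simulate(400_000)
half = len(rows) // 2
train, test = rows[:half], rows[half:]  # fit on one half, evaluate on the other

gap_without = calibration_gap(fit_rates(train, False), test, False, "B")
gap_with = calibration_gap(fit_rates(train, True), test, True, "B")
print(f"group B underprediction without group variable: {gap_without:.4f}")
print(f"group B underprediction with group variable:    {gap_with:.4f}")
```

In this toy setup, a positive gap means the calculator underestimates risk for group B. The group variable carries no biological meaning in the simulation; it only stands in for the difference in how completely family history was recorded.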
Considering race in clinical algorithms may reinforce the mistaken idea that race is biological, exacerbating racial disparities. Still, some clinicians argue that race should not be removed from risk prediction tools without assessing the impact of doing so. In this example, considering race improved the algorithm's ability to identify high-risk black participants, which in practice could mean earlier detection and treatment of their cancer.
“The debate about whether to include or exclude race is very important,” Zink said, noting that it requires considering data quality for a range of variables that go into the equation, not just the race variable. “That's part of the discussion that needs to be had.”
Key takeaways
The report emphasizes that using race in algorithms does not imply biological differences between races, but may help reduce inequities that are the result of historical or ongoing racism.
While the current study focuses on family history data, similar racial disparities may exist in other types of medical information. “We focus on family history because it's been well studied, but we don't think this is an isolated issue,” Zink says. Other data types distorted by historical racism could also provide fodder for stories.
Considering how the algorithm is used to screen for disease or assign treatments, and how that might affect disparities in access, can help with the decision to include race. “It's helpful to really think through the context in which the algorithm is being used,” Zink says.
Ask whether the race variable in the algorithm is intended to compensate for specific data quality issues that are the result of inequities in health care and research. “If you have patients who are certain they have a family history, or you have a dataset where you know the quality of the family history data is high, then racial adjustment may not be necessary,” Zink says. “But if you're dealing with data where you know there's a high rate of missed racial adjustments, then racial adjustment may be helpful.”