Social media has become the latest method to monitor the spread of diseases such as influenza or coronavirus. However, machine learning algorithms trained to classify tweets have an inherent bias because they do not account for how minority groups potentially communicate health information.
These are the findings made by UTSA researchers in one of the first studies of bias conducted on biomedical content on the microblogging and social networking service Twitter.
The researchers found that simple models offer the fairest system to survey how minority groups communicate health behaviors, such as vaccine adoption or incidence of flu. Without a fair natural language processing system, governments and other organizations that rely on social media may limit vaccines and other resources used to tackle disease within certain populations.
“The problem is that if machine bias is left unchecked, it can aggravate health disparities instead of improving them,” said Anthony Rios, assistant professor in the Department of Information Systems and Cyber Security in UTSA’s College of Business.
According to Rios, computers are used to monitor and classify millions of tweets to track how disease content spreads. There are many advantages to the use of machine learning, primarily that health organizations can deploy the algorithms quickly and at large geographic scales. Yet surveillance systems are based mostly on one dialect and, in essence, don’t account for how a minority group might use different terms or a specific communicative style. Therefore, organizations can assume incorrectly that healthy behaviors or enough medical supplies exist within certain regions.
In this study, the UTSA scientists analyzed two data sets to examine both bias and fairness on influenza-related tasks, including identifying influenza-related tweets, detecting whether a tweet is about an infection or simply raising awareness, detecting whether a user is discussing themselves or someone else, and identifying vaccine-related tweets.
Bias can be abundant in machine learning methods developed for a wide variety of natural language processing tasks, including how text is classified or how a system learns about words. For instance, machine learning methods can generate word embeddings, or vector representations for terms: numerical representations of words that a computer can process. But the learned representations may become skewed. In some cases this can lead to gender bias in which the word man is more similar to doctor, while woman is more similar to nurse.
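To make the embedding example concrete, here is a minimal sketch in Python. The three-dimensional vectors are invented for illustration and do not come from the UTSA study or from any trained model; real embeddings typically have hundreds of dimensions. The sketch assumes cosine similarity as the measure of how "similar" two words are, which is a common choice but is not confirmed by the article.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings, hand-picked to mimic a skewed model.
toy_embeddings = {
    "man": [0.9, 0.1, 0.3],
    "woman": [0.1, 0.9, 0.3],
    "doctor": [0.8, 0.2, 0.5],
    "nurse": [0.2, 0.8, 0.5],
}


def similarity(word_a, word_b):
    """Look up both words in the toy table and compare their vectors."""
    return cosine_similarity(toy_embeddings[word_a], toy_embeddings[word_b])


for word in ("man", "woman"):
    print(word,
          "doctor:", round(similarity(word, "doctor"), 3),
          "nurse:", round(similarity(word, "nurse"), 3))
```

In this toy table, "man" sits closer to "doctor" than to "nurse," and the reverse holds for "woman." A skew of this kind in a learned representation is what the gender-bias example describes.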
In a review of fairness, which is related to bias, the researchers explored the integrity of the influenza classifiers built using different machine learning algorithms, including linear models and neural networks. In the analysis, a very specific definition of fairness was applied. Intuitively, a machine learning model is fair if the predictive performance (its accuracy) is the same when it is applied to two different groups of data for the same task.
“Our task involves detecting influenza-related tweets on social media. Our groups are tweets written in either Standard American English or African American Vernacular English. If an unfair model is applied to geographical regions with a large number of AAE speakers, then it may not perform as the model developers expected. Because the number of speakers of SAE is larger than AAE speakers, a model can be both highly accurate and unfair,” said Rios.
“For influenza-related tasks we found that neural networks were more accurate, but simple machine learning methods produced fairer predictions,” Rios added.
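The intuitive fairness definition above, and Rios's point that a model can be both highly accurate and unfair, can be illustrated with a short Python sketch. All of the numbers are invented: 90 hypothetical SAE tweets and 10 hypothetical AAE tweets, with made-up labels and predictions. The accuracy gap used here follows the article's informal definition and may differ from the exact metric the researchers computed.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def accuracy_by_group(y_true, y_pred, groups):
    """Compute accuracy separately for each group label."""
    split = {}
    for t, p, g in zip(y_true, y_pred, groups):
        labels, preds = split.setdefault(g, ([], []))
        labels.append(t)
        preds.append(p)
    return {g: accuracy(labels, preds) for g, (labels, preds) in split.items()}


# Hypothetical data: 1 = influenza-related tweet, 0 = unrelated tweet.
groups = ["SAE"] * 90 + ["AAE"] * 10
y_true = [1, 0] * 45 + [1, 0] * 5

# Start from perfect predictions, then flip some to simulate errors:
# 2 mistakes on the SAE tweets, 5 mistakes on the AAE tweets.
y_pred = list(y_true)
for i in (0, 1):
    y_pred[i] = 1 - y_pred[i]
for i in range(90, 95):
    y_pred[i] = 1 - y_pred[i]

overall = accuracy(y_true, y_pred)
per_group = accuracy_by_group(y_true, y_pred, groups)
gap = abs(per_group["SAE"] - per_group["AAE"])

print("overall accuracy:", overall)
print("per-group accuracy:", per_group)
print("accuracy gap:", gap)
```

Because the larger group dominates the overall score, the headline accuracy stays high even though the model is right only half the time on the smaller group. Reporting accuracy per group, rather than a single number, is what exposes the gap.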
France, South Korea, Australia and Singapore have all deployed COVID-19 applications. Even Apple and Google Android platforms have created built-in software to deploy digital contact tracing among users. However, privacy issues have put governments and technology companies at odds—limiting the information that epidemiologists need to understand the spread of the virus.
“Although there are still privacy and ethical issues in social media use for research, it is potentially a great way to observe health trends, since platforms are agnostic and don’t require people to download anything or check in. Using social media, we can conduct disease surveillance tasks, such as predicting infection rates or estimating infection risk. Moreover, social media can be used to understand the public’s view about potential treatments and vaccinations,” added Rios.
It’s estimated that influenza vaccination rates are lower by 10% among Hispanic and African American communities, resulting in approximately 2,000 preventable deaths per year. Moreover, the timetable for COVID-19 vaccine development is anywhere between six months and two years. It’s for this reason that Rios urges natural language processing data scientists to examine how health-related algorithms are built.
Worldwide, coronavirus has resurged in many countries, while in more than 30 U.S. states cases continue to climb, leaving local governments with a shortage of contact tracers, a key tool used to contain the disease. It’s for this reason that machine learning offers immediate benefits and new technology to help with digital tracing or predicting potential outbreaks.
There are current limitations to the UTSA analysis. Since most NLP bias research does not analyze public health applications, and curating large biomedical data sets is difficult, the findings are based on small samples. This is why the researchers want to bring more attention to the issue of fairness when scientists build biomedical NLP data sets to train machines to code and classify health-related information written by different populations.
Brandon Lwowski, a UTSA doctoral student, is co-lead of the study, which was funded by the National Science Foundation.