Earlier this year, as the global pandemic started shutting down much of the world, medical researchers quickly moved to better understand the virus and work on a vaccine. Researchers from UCI’s Donald Bren School of Information and Computer Sciences (ICS) took a different approach. As Assistant Professor of Computer Science Sameer Singh explains, “work started as an outlet for the students to positively contribute to the fight against COVID-19 the one way we know how — by throwing machine learning at it!”
However, they first had to figure out how best to apply their expertise. Informatics Professor Sean Young, who has a joint appointment between ICS and UCI’s School of Medicine (SOM), had an idea.
Young had been sitting in on nightly calls with the Emergency Department, discussing COVID-19 patients, the need for more personal protective equipment, and COVID-19 risks at UCI and within Orange County. “To try to address this, my team began running a number of COVID studies on online communities and noticed a lot of misinformation that was furthering the problem,” he says. Young reached out to his ICS colleague, knowing that Singh’s expertise in natural language processing (NLP) could help address the issue.
Singh and Young, along with computer science graduate students Tamanna Hossain, Robert Logan IV, and Yoshitomo Matsubara and SOM staff research associate Arjuna Ugarte, spent the next seven months working on the problem. In November, their paper, “COVIDLIES: Detecting COVID-19 Misinformation on Social Media,” won the Best Paper Award at the NLP COVID-19 Workshop at the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Tackling the “Infodemic”
It turns out that Young and his colleagues weren’t the only ones to notice the threat of misinformation at the start of the pandemic. There were reports of a related “infodemic,” and a study suggested that as many as 800 people might have died worldwide between January and March because of misinformation.
As the team of ICS researchers started taking a closer look, they found countless false statements on Twitter, such as “Coronavirus cannot be spread by practicing holy communion,” “boiled ginger can cure coronavirus,” and “COVID-19 is only as deadly as the seasonal flu.”
Recognizing the need for automated misinformation detection systems, the team of researchers started working on an online tool for verifying information about COVID-19. They collaborated with Nicole Woodruff, a psychobiology undergraduate at UCLA; Aileen Guillen, a medical student at UCI; and ICS undergraduates Armin Abaye, Ali Al-Hakeem, Victoria Rong, Venkata Sai Abhishek and Sadhika Yamasani at UCI.
The plan was to develop a plug-in tool for social media platforms that could notify users when viewing or posting misinformation. “Sameer and his students took the lead on the tech behind this,” notes Young, “and my team and I worked on finding sources of misinformation and categorizing it for the AI models.”
Evaluating and Advancing Detection Systems
As the researchers outline in their paper, the tool is a first step in using NLP systems to combat COVID-19-related misinformation. They hope that the COVIDLIES dataset they created will help other researchers better understand NLP systems and their deployment. The dataset contains 6,761 expert-annotated tweets for evaluating how well misinformation detection systems perform on 86 different pieces of COVID-19-related misinformation. The researchers evaluate existing NLP systems on this dataset, providing initial benchmarks and identifying key challenges for future models to improve upon.
“This is a paper on the development of an artificial intelligence-style tool to detect misinformation posted online about COVID-19,” says Young. “If we can detect misinformation, then we can develop interventions to prevent its negative impact.” This type of approach could also be applied to other COVID-19-related areas, including misinformation about testing, masks and vaccines. “This is a great example of needed collaborations between medicine and ICS,” says Young.
The researchers were pleased to see their work receive the top award at the NLP workshop.
“I’m really proud of this work,” says Singh. “It’s great to see it being recognized by a community that’s actually thinking about the important problems in this area.”
— Shani Murray