AI Unmasks Evolving Vaccine Discourse: A Decade of Social Media Shifts Pre- and Post-COVID
By Nikesh Gyawali, Doina Caragea, Cornelia Caragea, Saif M. Mohammad
Published on November 24, 2025 | Vol. 1, Issue No. 1
Content Source
This is a curated briefing. The original article was published on cs.CL updates on arXiv.org.
Summary
This research examines English-language vaccine discourse on X (formerly Twitter) across a decade, covering the seven years before the COVID-19 pandemic (2013-2019) and the three years after its outbreak (2020-2022). Using a novel dataset of 18.7 million curated posts, the study finds complex shifts in users' sentiment and language around vaccines after the onset of COVID-19. Early in the pandemic, negative emotion words decreased, while words associated with surprise, trust, warmth, and competence rose notably. This trend reversed towards the end of the pandemic period, with a marked increase in negative word usage, suggesting growing vaccine hesitancy and skepticism among social media users.
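To make the idea of tracking emotion words over time concrete, here is a minimal Python sketch of lexicon-based counting per year. It is an illustration under stated assumptions, not the study's pipeline: the lexicon `TOY_LEXICON`, the function `category_shares_by_year`, and the sample posts are all hypothetical and invented for this example, and the briefing does not describe which lexicons or tools the authors actually used.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon, invented for illustration only.
# A real analysis would rely on a published, validated word-emotion lexicon.
TOY_LEXICON = {
    "afraid": "negative",
    "dangerous": "negative",
    "harm": "negative",
    "safe": "trust",
    "protect": "trust",
    "reliable": "trust",
    "wow": "surprise",
    "unexpected": "surprise",
}


def category_shares_by_year(posts):
    """Return {year: {category: share of all tokens that year}}.

    `posts` is an iterable of (year, text) pairs.
    """
    counts = defaultdict(Counter)  # year -> category -> matched word count
    totals = Counter()             # year -> total token count
    for year, text in posts:
        # Naive whitespace tokenization with punctuation stripped.
        tokens = [t.strip(".,!?;:\"'()").lower() for t in text.split()]
        tokens = [t for t in tokens if t]
        totals[year] += len(tokens)
        for token in tokens:
            category = TOY_LEXICON.get(token)
            if category is not None:
                counts[year][category] += 1
    return {
        year: {cat: n / totals[year] for cat, n in counts[year].items()}
        for year in totals
        if totals[year] > 0
    }


# Made-up sample posts, not drawn from the study's dataset.
sample_posts = [
    (2019, "Vaccines are dangerous and I am afraid"),
    (2020, "Wow, the new vaccine looks safe"),
    (2020, "Vaccines protect people"),
]
shares = category_shares_by_year(sample_posts)
for year in sorted(shares):
    print(year, shares[year])
```

Normalizing by the total number of tokens per year, rather than reporting raw counts, keeps years with very different posting volumes comparable, which matters when activity spikes as it did around 2020.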
Why It Matters
This study offers useful insights for AI professionals, highlighting both the power and the difficulty of applying Natural Language Processing (NLP) and social media analytics to sensitive, rapidly evolving public discourse. It underscores the effort involved in data curation: reducing 129 million raw posts to a usable dataset of 18.7 million records shows how much foundational, often overlooked work robust AI analysis requires. For AI and machine learning models, the "complex shifts" in sentiment and emotion words (e.g., trust and surprise rising alongside decreased negativity) illustrate how hard it is to interpret nuance, let alone infer causes, from language alone, and they test the limits of current sentiment analysis algorithms. This changing environment also shows the need for AI models that can keep learning and adapting in order to cope with 'concept drift,' where the relationship between language and its meaning shifts over time, so that models trained on pre-pandemic data may fail to accurately interpret post-pandemic narratives.
Furthermore, this research demonstrates the role AI can play in public health intelligence. By tracking evolving sentiments and identifying subtle linguistic shifts, AI can serve as an early warning system for emerging health crises, growing misinformation campaigns, or increasing vaccine hesitancy. Such data can help public health bodies craft targeted, evidence-based communication strategies and counter harmful narratives more effectively. For AI ethics, the study implicitly raises questions about the responsible use of such powerful monitoring capabilities, pointing to the need for transparency, privacy protection, and assurance that these tools are used to inform and protect public health rather than to manipulate or censor. Ultimately, it strengthens the case for AI as a tool not just for data analysis, but for safeguarding public discourse and health in an increasingly interconnected and information-saturated world.