Trends are about more than timelines: Using Natural Language Processing to examine change over time in the most common Google searches related to semaglutide

Authors

  • Jacques Raubenheimer University of Sydney

Abstract

Background: Google Trends (GT) data is ubiquitous in research, but most studies are of low quality and spuriously attempt to correlate GT timelines with external timelines without clear causal justification. Novo Nordisk's semaglutide (e.g., Ozempic) is a glucagon-like peptide-1 receptor agonist for long-term weight management in type 2 diabetes mellitus (T2DM) patients. Social media promotion has driven off-label demand for non-diabetic weight loss.

Aims: Using additional GT data responsibly can help researchers understand (not predict) public behaviour. We aim to demonstrate this using GT Top Search Query data, with semaglutide as an example.

Methods: We demonstrate the use of the GT Extended for Health (GTEH) API to extract top Google searches for a series of time frames, allowing us to examine how these searches have changed over time. We use Natural Language Processing (NLP) to categorise themes within each time period, and then contrast the priority of themes at different time points. We also contrast differences and similarities in themes between countries over time.

Results: We accessed the top quarterly semaglutide-related search queries for 25 different countries for 2021-2024. Thematic analysis reveals differences in key points, such as changes in side-effect concerns (e.g., so-called "Ozempic face"). Weight loss is consistently a more prominent theme than diabetes, indicating the predominance of off-label demand. Searches for accessing semaglutide appear at different times in different countries, and display region-specific insights (e.g., the names of dominant retailers).

Conclusions: GT data cannot predict demand for semaglutide, but can help us understand what is driving demand.
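
Illustration: the Methods step of assigning top queries to themes within each period, and then comparing theme priority across periods, might be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the analysis used in the study: simple keyword matching stands in for the NLP step, the theme lexicon and the names `THEME_KEYWORDS`, `classify`, and `theme_ranks` are hypothetical, and the sample rows are invented placeholders rather than GT or GTEH output. Retrieval from the GTEH API is deliberately left out.

```python
from collections import Counter, defaultdict

# Hypothetical theme lexicon (an assumption for illustration only;
# the abstract does not list the study's actual categories).
THEME_KEYWORDS = {
    "weight loss": ("weight", "lose", "obesity"),
    "diabetes": ("diabetes", "blood sugar", "a1c"),
    "side effects": ("side effect", "face", "nausea"),
    "access": ("buy", "price", "pharmacy", "cost"),
}


def classify(query):
    """Return every theme whose keywords occur in the query, else ['other']."""
    q = query.lower()
    themes = [
        theme
        for theme, keywords in THEME_KEYWORDS.items()
        if any(keyword in q for keyword in keywords)
    ]
    return themes or ["other"]


def theme_ranks(rows):
    """Rank themes within each period by summed query weight.

    rows: iterable of (period, query, weight) tuples.
    Returns {period: [theme, ...]} with themes in descending weight order.
    """
    totals = defaultdict(Counter)
    for period, query, weight in rows:
        for theme in classify(query):
            totals[period][theme] += weight
    return {
        period: [theme for theme, _ in counts.most_common()]
        for period, counts in sorted(totals.items())
    }


# Invented placeholder rows (NOT real search data).
SAMPLE = [
    ("2021Q1", "ozempic weight loss", 100),
    ("2021Q1", "ozempic for diabetes", 80),
    ("2023Q1", "ozempic weight loss", 100),
    ("2023Q1", "ozempic face", 70),
    ("2023Q1", "ozempic for diabetes", 40),
    ("2023Q1", "buy ozempic online", 30),
]

ranks = theme_ranks(SAMPLE)
for period, themes in ranks.items():
    print(period, themes)
```

Comparing the ordered theme lists between periods, and repeating the exercise per country, gives the kind of contrast the Methods describe. A real analysis would replace the keyword lexicon with a proper NLP categorisation.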

Published

2025-09-29

Issue

Section

Oral Presentations