One in Eight OpenAlex Abstracts Has Integrity Issues
2026-05-19 • Digital Libraries
Digital Libraries, Databases
Authors
Seorin Kim, Vincent Holst, Vincent Ginis
Abstract
Scientific abstracts are increasingly used as primary data in computational metascience research, yet the quality of these abstracts in widely used bibliographic databases has not been systematically examined. We assess the integrity of 10,000 randomly sampled English-language journal abstracts from OpenAlex using a two-stage annotation protocol combining human expert review and large language model classification. We identify seven distinct failure modes and find that 12% of abstracts have integrity issues, with insufficient content and misplaced metadata being the most prevalent. We discuss implications for downstream research and describe a forthcoming community portal to support collective annotation efforts.
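For readers who want to screen their own OpenAlex samples, the following is a minimal sketch of a first-pass check, not the authors' annotation protocol. It relies on the fact that OpenAlex serves abstracts as an `abstract_inverted_index` (a mapping from each word to its positions) rather than as plain text. The function names and the 30-word threshold are hypothetical choices made for illustration, and a length check like this could at best flag candidates for the "insufficient content" failure mode. It says nothing about the other failure modes the abstract mentions.

```python
from typing import Dict, List, Optional


def rebuild_abstract(inverted_index: Optional[Dict[str, List[int]]]) -> str:
    """Rebuild plain text from an OpenAlex-style inverted index (word -> positions)."""
    if not inverted_index:
        # Works without an abstract carry a null/empty index.
        return ""
    placed = []
    for word, positions in inverted_index.items():
        for pos in positions:
            placed.append((pos, word))
    placed.sort()  # order words by their position in the original text
    return " ".join(word for _, word in placed)


def looks_insufficient(text: str, min_words: int = 30) -> bool:
    """Hypothetical heuristic: flag abstracts shorter than min_words words.

    The threshold is an assumption for illustration, not a value from the paper.
    """
    return len(text.split()) < min_words


if __name__ == "__main__":
    example_index = {"Scientific": [0], "abstracts": [1, 4], "are": [2], "short": [3]}
    text = rebuild_abstract(example_index)
    print(text)
    print(looks_insufficient(text))
```

Anything flagged this way would still need human or model review, in the spirit of the two-stage protocol described above, since short abstracts are not necessarily defective and long ones may still contain misplaced metadata.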