PaperFlow: Profiling, Recommending, and Adapting Across Daily Paper Streams

2026-06-05 • Information Retrieval

Information Retrieval, Artificial Intelligence

AI summary

The authors present PaperFlow, a system that recommends scientific papers as a researcher's interests change from day to day. The method has three parts: building a profile from initial (cold-start) evidence, ranking each day's new papers by combining several signals, and updating the user's state as feedback arrives and interests shift. They evaluate it on a day-by-day benchmark of simulated research users and define a blind human-evaluation protocol to check that automatic metrics agree with expert judgments. In their experiments, PaperFlow outperforms five scientific recommendation baselines on ranking quality, alignment with simulated reading selections, and blind human-evaluation score.

scientific paper recommendation, user profiling, interest drift, longitudinal evaluation, relevance ranking, multi-signal aggregation, user feedback, benchmark dataset, human evaluation

Authors

Fuqiang Wang, Song Tan, Zheng Guo, Jiaohao Fu, Xinglong Xu, Bihui Yu, Jie Dong, Zheng Sun, Siyuan Li, Jingxuan Wei, Cheng Tan

Abstract

Scientific paper recommendation is typically evaluated as static ranking over a fixed candidate set, yet real scientific reading unfolds as a daily, longitudinal process in which interests shift and feedback accumulates. We introduce PaperFlow, a framework that organizes this process into three coupled stages: Profiling, which constructs and maintains a structured, inspectable scholarly profile from heterogeneous cold-start evidence; Recommending, which ranks each date-specific paper stream through multi-signal aggregation under a fixed display budget; and Adapting, which updates user state from semantically distinct feedback signals and models interest drift across days. We further define a longitudinal user-day benchmark that fixes users, dates, candidate pools, visible inputs, and hidden simulated relevance labels under a shared temporal information boundary. The benchmark contains 24 simulated research users, 50 daily paper streams, 1,200 user-day episodes, 20,727 unique papers, and 497,448 episode-paper records. We additionally specify a blind human-evaluation protocol to validate alignment between automatic metrics and expert judgments. Experiments against five scientific recommendation baselines show that PaperFlow achieves the strongest oracle-based ranking, the highest behavioral alignment with simulated reading selections, and the best blind human-evaluation score.
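
To make the Recommending and Adapting stages concrete, the following is a minimal Python sketch, not the authors' implementation. Everything in it is an assumption introduced for illustration: the `Paper` type, the two signals (topic match and a `novelty` score), the aggregation weights, the feedback kinds in `FEEDBACK_STRENGTH`, and the additive update rule. It shows one plausible way to rank a daily stream by a weighted sum of signals, truncate to a fixed display budget, and shift a topic profile in response to feedback of different strengths.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    paper_id: str
    topics: dict      # hypothetical: topic name -> strength in [0, 1]
    novelty: float    # hypothetical second signal in [0, 1]


def score(profile, paper, w_topic=0.8, w_novelty=0.2):
    """Weighted sum of two illustrative signals (weights are assumptions)."""
    topic_match = sum(profile.get(t, 0.0) * s for t, s in paper.topics.items())
    return w_topic * topic_match + w_novelty * paper.novelty


def recommend(profile, stream, budget):
    """Rank one day's stream and keep only the top `budget` papers."""
    ranked = sorted(stream, key=lambda p: score(profile, p), reverse=True)
    return ranked[:budget]


# Hypothetical feedback kinds; each moves the profile by a different amount.
FEEDBACK_STRENGTH = {"save": 1.0, "click": 0.3, "dismiss": -0.5}


def adapt(profile, paper, feedback, rate=0.2):
    """Return a new profile nudged toward (or away from) the paper's topics."""
    updated = dict(profile)  # copy so the previous day's profile is preserved
    step = rate * FEEDBACK_STRENGTH[feedback]
    for topic, strength in paper.topics.items():
        new_weight = updated.get(topic, 0.0) + step * strength
        updated[topic] = min(1.0, max(0.0, new_weight))  # clamp to [0, 1]
    return updated
```

Calling `recommend` once per date and feeding each day's feedback through `adapt` before the next call gives the daily loop the abstract describes. Returning a new profile rather than mutating the old one keeps a per-day history, which is one simple way to keep interest drift inspectable.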
