PA-User: Simulating Trust and Verification under AI-Generated Content

2026-06-22 | Information Retrieval

AI summary

The authors present PA-User, a user simulator that models how people decide whether online information is trustworthy when some of it has been written, rewritten, or curated by AI. The simulator has a limited verification-effort budget that recovers between sessions, tracks trust separately for each class of source, and decides whether to accept, verify, or discard each result. On the HC3 corpus, PA-User showed lower trust-calibration error and lower high-stakes regret than ablated variants that lack one of these components. This helps in understanding and predicting user behavior when human and AI-generated content are mixed online.

user simulator, trust model, verification effort, information retrieval, provenance, factuality, Beta distribution, human-AI hybrid content, regret analysis, HC3 corpus
Authors
Saber Zerhoudi
Abstract
Most users of online information now assume that some of what they read has been written, edited, or selected by an AI model. Hybrid cases are the hardest to tell apart: human prose rewritten by a language model, AI-curated lists presented as editorial, retrieval-augmented answers composed on the fly from human sources. Users cannot reliably distinguish these cases, and the ongoing cost of checking what is genuine has become part of how they search. Current user simulators in information retrieval do not model this. We propose PA-User, a user simulator with three new components: a detection-effort budget that is spent on verification and recovers between sessions; a trust component that holds a separate Beta belief over the factuality of each source class (domain by provenance) and updates from observed outcomes; and a decision rule that picks accept, verify, or discard for each result, conditional on current trust, current effort, and per-domain stakes. We state two verification-and-validation (V\&V) properties of the framework. The trust posterior converges to the true class factuality (face validity). Each component's contribution to any observable can be isolated by ablation (structural validity). On the HC3 corpus (85,449 paired human and ChatGPT answers in five domains), PA-User reaches a trust-calibration error of $0.162$, against $0.356$ for any configuration without the trust component. PA-User reduces high-stakes regret from $0.171$ to $0.122$ ($29\%$ relative) against an always-accept ablation, and verifies $34.5\%$ of results, half the rate of an ablation with no effort budget. Each single-mechanism ablation isolates one component, which makes the framework individually diagnosable.
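The three components described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The class names (`BetaTrust`, `SimulatedUser`), the uniform Beta(1, 1) prior, the thresholds in `decide`, the unit verification cost, the full budget reset between sessions, and the choice to update trust only after a verification are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class BetaTrust:
    """Beta(alpha, beta) belief over the factuality of one source class."""
    alpha: float = 1.0  # assumed uniform prior, not taken from the paper
    beta: float = 1.0

    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    def update(self, factual: bool) -> None:
        # Standard Beta-Bernoulli update from one observed outcome.
        if factual:
            self.alpha += 1.0
        else:
            self.beta += 1.0


@dataclass
class SimulatedUser:
    session_budget: float = 5.0  # hypothetical effort units per session
    verify_cost: float = 1.0     # hypothetical cost of one verification
    trust: dict = field(default_factory=dict)  # (domain, provenance) -> BetaTrust
    budget: float = field(init=False, default=0.0)

    def __post_init__(self) -> None:
        self.budget = self.session_budget

    def belief(self, source_class) -> BetaTrust:
        return self.trust.setdefault(source_class, BetaTrust())

    def decide(self, source_class, stakes: float) -> str:
        """Pick accept, verify, or discard; thresholds are illustrative only."""
        p = self.belief(source_class).mean()
        accept_threshold = 0.5 + 0.4 * stakes  # higher stakes demand more trust
        discard_threshold = 0.2
        if p >= accept_threshold:
            return "accept"
        if p < discard_threshold:
            return "discard"
        if self.budget >= self.verify_cost:
            return "verify"
        # No effort left: discard when stakes are high, otherwise accept.
        return "discard" if stakes >= 0.5 else "accept"

    def step(self, source_class, stakes: float, is_factual: bool) -> str:
        action = self.decide(source_class, stakes)
        if action == "verify":
            # Assumption: the outcome is observed only when the user verifies.
            self.budget -= self.verify_cost
            self.belief(source_class).update(is_factual)
        return action

    def new_session(self) -> None:
        # Assumption: the budget recovers fully between sessions.
        self.budget = self.session_budget


def posterior_after(n: int, true_factuality: float, seed: int = 0) -> float:
    """Posterior mean after n simulated outcomes for a single source class."""
    rng = random.Random(seed)
    belief = BetaTrust()
    for _ in range(n):
        belief.update(rng.random() < true_factuality)
    return belief.mean()
```

Calling `posterior_after(2000, 0.8)` returns a posterior mean close to 0.8, which is the kind of convergence the face-validity property refers to. Replacing `decide` with a rule that always returns "accept", or setting `session_budget` to a very large value, gives rough analogues of the always-accept and no-effort-budget ablations mentioned above.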