Do Gender Cues Affect LLM Value Trade-offs? Evidence from a Controlled Decision Benchmark
2026-06-01 • Computation and Language
AI summary
The authors built a controlled benchmark to test whether large language models (LLMs) make the same value-based decisions when only the gender cues in a scenario change. They found that changing gender details shifts the models' decisions in a limited but systematic way, even when the models report that gender had no effect. These shifts occur mostly when the choice between values is less clear-cut or the decision is more severe, which suggests that gender cues move decisions near value boundaries rather than overriding the model's overall reasoning. The study concludes that models' self-reported reasons do not always reveal these hidden effects, so controlled behavioral testing is needed alongside explanation-based evaluation.
large language models, value-sensitive decision making, gender bias, behavioral audit, decision invariance, self-attribution, benchmark, value trade-offs, role-gender configuration, model evaluation
Authors
Yangyang Liu, Dong Yu, Pengyuan Liu
Abstract
Large language models are increasingly used in value-sensitive decision settings, where irrelevant demographic cues should not alter judgments. We construct the Realistic Value Decision Benchmark (RVDB), a controlled benchmark that varies only the role-gender configuration while holding the scenario, ordered value pair, roles, candidate decisions, Value Distance, and Decision Severity fixed. Using a position-balanced evaluation across seven models, we test whether models preserve decision invariance under gender perturbations and whether their self-attributions reflect observed behavioral changes. We find that explicit gender cues induce bounded but systematic decision flips, including under an explicit gender-attribution prompt that asks models to report whether gender influenced their choice. Cross-gender role swaps reveal a consistent female-proposed-decision asymmetry, while models often attribute flipped decisions to No Influence or other non-gender factors. Further analysis shows that gender effects concentrate near less determinate value boundaries and under more severe decision contexts, suggesting that gender cues act as local boundary-shifting factors rather than global overrides of value reasoning. Value rankings remain largely stable, but ordered value-pair trade-offs shift unevenly across role-gender configurations. These results show that gender can enter LLM value trade-offs behaviorally while remaining obscured in self-attribution, motivating controlled behavioral audits beyond explanation-based evaluation.
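The decision-invariance check described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record layout, the configuration names, the majority-vote aggregation over option orderings, and the helper names `aggregate_choice` and `flip_rate` are all assumptions made for illustration. Each scenario maps a role-gender configuration to the decisions a model chose under each option ordering, with choices already mapped back to decision identity rather than position.

```python
from collections import Counter


def aggregate_choice(choices):
    """Return the decision chosen most often across option orderings.

    Returns None when there are no choices or the top two are tied,
    i.e. when no stable choice survives position balancing.
    """
    counts = Counter(choices).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def flip_rate(records, reference):
    """Fraction of (scenario, configuration) comparisons whose aggregated
    decision differs from the one made under the reference configuration.

    records:   {scenario_id: {config_name: [decision per option ordering]}}
    reference: hypothetical name of the configuration used as the baseline
    """
    flips = 0
    comparisons = 0
    for configs in records.values():
        ref_choice = aggregate_choice(configs[reference])
        if ref_choice is None:
            continue  # no stable baseline decision for this scenario
        for name, choices in configs.items():
            if name == reference:
                continue
            choice = aggregate_choice(choices)
            if choice is None:
                continue  # skip unstable decisions instead of counting them
            comparisons += 1
            flips += choice != ref_choice
    return flips / comparisons if comparisons else 0.0


# Toy data, invented for illustration only (not results from the paper).
toy_records = {
    "s1": {
        "baseline": ["d1", "d1"],
        "female_proposer": ["d1", "d1"],
        "male_proposer": ["d2", "d2"],
    },
    "s2": {
        "baseline": ["d2", "d2"],
        "female_proposer": ["d2", "d2"],
        "male_proposer": ["d2", "d2"],
    },
}
toy_rate = flip_rate(toy_records, reference="baseline")
```

Aggregating over option orderings before comparing configurations keeps a position preference from being counted as a gender-induced flip. Comparing against a single reference configuration is one possible design; pairwise comparison across all configurations would be an equally reasonable alternative.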