Machine Behavior in Relational Moral Dilemmas: Moral Rightness, Predicted Human Behavior, and Model Decisions

2026-04-23Computation and Language

Computation and Language
AI summary

The authors studied how large language models (LLMs) handle moral decisions involving whistleblowing, considering how close people are and how serious the crime is. They looked at three views: what is morally right, what humans would likely do, and what the model itself decides. They found that while the models understand fairness as the right thing, they expect humans to be more loyal to close people. However, when models make decisions, they stick to fairness rules instead of their predictions about human loyalty. This suggests LLMs may miss important social details in real-world moral choices.

large language modelsmoral judgmentWhistleblower's Dilemmarelational closenesscrime severitymoral rightnesshuman behavior predictiondecision-makingprescriptive normssocial sensitivity
Authors
Jiseon Kim, Jea Kwon, Luiz Felipe Vecchietti, Wenchao Dong, Jaehong Kim, Meeyoung Cha
Abstract
Human moral judgment is context-dependent and modulated by interpersonal relationships. As large language models (LLMs) increasingly function as decision-support systems, determining whether they encode these social nuances is critical. We characterize machine behavior using the Whistleblower's Dilemma by varying two experimental dimensions: crime severity and relational closeness. Our study evaluates three distinct perspectives: (1) moral rightness (prescriptive norms), (2) predicted human behavior (descriptive social expectations), and (3) autonomous model decision-making. By analyzing the reasoning processes, we identify a clear cross-perspective divergence: while moral rightness remains consistently fairness-oriented, predicted human behavior shifts significantly toward loyalty as relational closeness increases. Crucially, model decisions align with moral rightness judgments rather than their own behavioral predictions. This inconsistency suggests that LLM decision-making prioritizes rigid, prescriptive rules over the social sensitivity present in their internal world-modeling, which poses a gap that may lead to significant misalignments in real-world deployments.