Positive Alignment: Artificial Intelligence for Human Flourishing

2026-05-11

Artificial Intelligence, Computers and Society, Human-Computer Interaction
AI summary

The authors explain that most AI safety work focuses on preventing harm, such as avoiding accidents or bad behavior, but argue that this is not enough. They introduce "Positive Alignment," which aims for AI to actively help humans and the environment thrive while remaining safe and cooperative. They suggest this approach can address problems like AI deceiving people or undermining their autonomy, by encouraging good habits and diverse viewpoints. They also discuss challenges and ways to build and test AI so it aligns better with what different users and communities want. Finally, they propose designs that let many groups have a say, instead of a single authority controlling the AI.

Keywords

AI alignment, Positive Alignment, human flourishing, safety, controllability, polycentric governance, epistemic humility, value alignment, large language models, collaborative value collection
Authors
Ruben Laukkonen, Seb Krier, Chloé Bakalar, Shamil Chandaria, Morten Kringelbach, Adam Elwood, Daniel Ford, Fernando Rosas, Maty Bohacek, Matija Franklin, Nenad Tomašev, Stephanie Chan, Verena Rieser, Roma Patel, Michael Levin, Arun Rao
Abstract
Existing alignment research is dominated by concerns about safety and preventing harm: safeguards, controllability, and compliance. This paradigm of alignment parallels early psychology's focus on mental illness: necessary but incomplete. What we call Positive Alignment is the development of AI systems that (i) actively support human and ecological flourishing in a pluralistic, polycentric, context-sensitive, and user-authored way while (ii) remaining safe and cooperative. It is a distinct and necessary agenda within AI alignment research. We argue that several existing failures of alignment (e.g., engagement hacking, loss of human autonomy, failures in truth-seeking, low epistemic humility, weak error correction, lack of diverse viewpoints, and being primarily reactive rather than proactive) may be better addressed through positive alignment, including cultivating virtues and maximizing human flourishing. We highlight a range of challenges, open questions, and technical directions (e.g., data filtering and upsampling, pre- and post-training, evaluations, collaborative value collection) for different phases of the LLM and agent lifecycle. We end with design principles for promoting disagreement and decentralization through contextual grounding, community customization, continual adaptation, and polycentric governance; that is, many legitimate centers of oversight rather than one institutional or moral chokepoint.
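As a rough, non-authoritative illustration of one technical direction the abstract names (data filtering and upsampling), the sketch below re-weights a pretraining corpus so that documents scoring highly on a flourishing-related rubric appear more often. The scorer `score_flourishing`, its keyword cues, and the resampling policy are all hypothetical stand-ins, not the authors' method; in practice the scorer would be a trained classifier.

```python
# Minimal sketch (assumptions, not the paper's implementation):
# filter out low-scoring documents and upsample high-scoring ones
# in proportion to a flourishing-related score.

import random
from typing import Iterable


def score_flourishing(doc: str) -> float:
    """Hypothetical scorer in [0, 1]. A real pipeline would use a
    trained classifier rating, e.g., epistemic humility or
    prosocial framing; keyword cues here are placeholders."""
    cues = ("uncertain", "perspectives", "evidence", "we might be wrong")
    return min(1.0, sum(cue in doc.lower() for cue in cues) / len(cues))


def filter_and_upsample(corpus: Iterable[str],
                        drop_below: float = 0.2,
                        max_copies: int = 4) -> list[str]:
    """Drop documents scoring under `drop_below` (filtering) and
    duplicate the rest up to `max_copies` times, scaled by their
    score (a crude form of upsampling)."""
    resampled: list[str] = []
    for doc in corpus:
        s = score_flourishing(doc)
        if s < drop_below:
            continue  # filtering step
        copies = 1 + round(s * (max_copies - 1))  # upsampling step
        resampled.extend([doc] * copies)
    random.shuffle(resampled)
    return resampled
```

Score-proportional duplication is the simplest resampling policy; alternatives such as temperature-scaled sampling weights would trade off diversity against the strength of the re-weighting.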