The Age of Curiosity Meets the Age of AI: Benchmarking Child Safety in Large Language Models

2026-05-25 • Computation and Language


AI summary

The authors created KIDBench, a test to check how safe and age-appropriate large language models (LLMs) are for children aged 7 to 11. They used realistic child questions and different ways of telling the model it's talking to a child, finding that clearly saying the child's age helps the model give safer answers. They also found that safety can vary depending on language, culture, and how many conversation turns have passed. Additionally, the authors developed tools called KIDGuardLlama and KIDLlama to help evaluate and improve child-safe AI responses using their benchmark.

Large Language Models, Child-facing AI, Safety Evaluation, Developmental Psychology, Benchmarking, Implicit Cues, Explicit Age Instructions, Cross-lingual Evaluation, Multi-turn Simulation, AI Safety

Authors

Samee Arif, Angana Borah, Rada Mihalcea

Abstract

Children increasingly have access to Large Language Models (LLMs), which may expose them to responses that are developmentally inappropriate or require age-sensitive safety, guidance, and boundaries. Existing LLM safety evaluations largely focus on harmful-content avoidance and do not explicitly target child-facing safety. We introduce KIDBench, a benchmark for evaluating child-facing LLM safety for ages 7 to 11 using a developmental-psychology-grounded LLM-as-a-Judge rubric. KIDBench contains realistic child queries across ten categories, with single-turn prompts and multi-turn child-actor simulations. We compare three prompt conditions: no-cues prompts with no child context, implicit-cues prompts that suggest a child speaker, and explicit age instructions. Implicit cues improve scores by 9 to 47% across models, while explicit age instructions add a further 10 to 30% gain. Cross-lingual and cultural evaluations show uneven safety behavior across languages and country contexts. Multi-turn simulations show that child-facing response quality can degrade by 6 to 24% from the first to the worst turn. Beyond evaluation, we introduce KIDGuardLlama, a child-safety evaluator, and KIDLlama, a child-oriented response model, showing how KIDBench supports safer child-facing AI.
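
As a rough illustration of the multi-turn degradation figure, the sketch below computes a relative drop from the first turn's score to the lowest-scoring turn. The abstract does not give the formula, so the definition used here (drop relative to the first-turn score), the function name `first_to_worst_drop`, and the example scores are all assumptions for illustration, not the authors' method or data.

```python
def first_to_worst_drop(turn_scores):
    """Percent drop from the first-turn score to the lowest-scoring turn.

    Assumed definition (not taken from the paper):
        100 * (first - worst) / first
    """
    if not turn_scores:
        raise ValueError("turn_scores must be non-empty")
    first = turn_scores[0]
    if first <= 0:
        raise ValueError("first-turn score must be positive")
    worst = min(turn_scores)
    return 100.0 * (first - worst) / first


# Made-up judge scores for one four-turn conversation (hypothetical values).
example_scores = [5.0, 4.5, 4.0, 4.5]
drop = first_to_worst_drop(example_scores)
print(f"{drop:.1f}% drop from first to worst turn")
```

Under this assumed definition, a conversation whose score never falls below its first-turn value reports a drop of zero.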
