Toward a Benchmark for Controllable Simulation of Imperfect Students with Large Language Models

2026-05-25 · Computation and Language

Computation and Language, Artificial Intelligence
AI summary

The authors explore whether large language models can act like students who have certain math skills while lacking others, helping teachers practice explaining and diagnosing problems. They create a method to control which skills these model 'students' show and which they do not, testing how well the models follow these instructions. Their experiments show it is possible to make models simulate partial skills, though how well it works depends on the specific model used. This work highlights a new challenge in making smarter teaching tools using AI.

large language models, teacher education, simulated students, skill profile, prompting, model control, benchmark, partial mastery, educational simulation
Authors
Alexander Apartsin, Omri Sason, Yehudit Aperstein
Abstract
Teacher education requires deliberate practice with learners who exhibit identifiable strengths, weaknesses, and partial mastery. Large language models could support such practice by simulating students with known skill components, enabling teachers to rehearse explanations, diagnoses, and instructional responses. For this purpose, however, the central requirement is neither to maximize benchmark accuracy nor to suppress isolated facts, but to control model behavior so that it reflects a specified skill profile. This paper investigates whether prompted language models can be steered to retain some skills while suppressing others. We introduce a benchmark-oriented framework in which an explicit skill vector represents a simulated student, prompt-based control specifies retained and missing competencies, and behavior is evaluated using profile-alignment metrics, retained-versus-forgotten comparisons, and cross-skill calibration analyses. The results show that selective partial mastery can be induced and measured in a structured mathematics setting, although the degree of controllability remains model-dependent. These findings position controllable learner simulation as a distinct research problem at the intersection of teacher education, educational simulation, and language-model control.
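
To make the framework described in the abstract more concrete, the following is a minimal Python sketch of how a skill vector, a control prompt, and a retained-versus-forgotten comparison might fit together. Everything in it is an assumption made for illustration: the `SkillProfile` class, the prompt wording, the skill names, and the specific alignment formula are hypothetical and are not taken from the paper, whose actual metric definitions are not given in the abstract.

```python
from dataclasses import dataclass


@dataclass
class SkillProfile:
    # Hypothetical skill vector: skill name -> True (retained) or False (missing).
    skills: dict[str, bool]

    def retained(self) -> list[str]:
        return sorted(name for name, kept in self.skills.items() if kept)

    def missing(self) -> list[str]:
        return sorted(name for name, kept in self.skills.items() if not kept)


def build_control_prompt(profile: SkillProfile) -> str:
    # Illustrative prompt wording only; not the prompt used by the authors.
    return (
        "You are simulating a student.\n"
        f"Skills you have mastered: {', '.join(profile.retained()) or 'none'}.\n"
        f"Skills you have not learned: {', '.join(profile.missing()) or 'none'}.\n"
        "Answer each problem as this student would."
    )


def _accuracy(outcomes: list[bool]) -> float:
    # Fraction of True values; returns 0.0 for an empty list to avoid dividing by zero.
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def evaluate(profile: SkillProfile, results: list[tuple[str, bool]]) -> dict[str, float]:
    # results holds (skill, was_correct) pairs from graded model answers.
    retained = [ok for skill, ok in results if profile.skills[skill]]
    forgotten = [ok for skill, ok in results if not profile.skills[skill]]
    # Assumed alignment rule: an answer matches the profile when it is
    # correct on a retained skill or incorrect on a missing skill.
    aligned = [ok == profile.skills[skill] for skill, ok in results]
    retained_acc = _accuracy(retained)
    forgotten_acc = _accuracy(forgotten)
    return {
        "retained_accuracy": retained_acc,
        "forgotten_accuracy": forgotten_acc,
        "gap": retained_acc - forgotten_acc,
        "profile_alignment": _accuracy(aligned),
    }


# Example with made-up skills and made-up graded outcomes.
profile = SkillProfile({"fractions": True, "linear_equations": False})
results = [
    ("fractions", True),
    ("fractions", True),
    ("linear_equations", False),
    ("linear_equations", True),  # the simulated student "leaked" a missing skill
]
prompt = build_control_prompt(profile)
scores = evaluate(profile, results)
```

In this toy run, one of the four answers contradicts the profile, so `scores["profile_alignment"]` is 0.75 and `scores["gap"]` is 0.5. A larger gap between retained and forgotten accuracy would indicate stronger selective control under these assumed definitions.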