The CRISTAL Method: Neurosymbolic analysis from AI-synthesized world models

2026-06-29 • Artificial Intelligence

AI summary

The authors present the CRISTAL Method, a new approach that combines symbolic reasoning and machine learning to automate complex investment analysis tasks. It handles uncertain and noisy data better than existing language model tools by using a probabilistic program that can learn and update itself with limited data. CRISTAL also focuses on providing explanations and quantifying uncertainty, which helps make more reliable decisions. The authors tested it on a financial classification task where it performed much better than current models, even with less time and data.

neurosymbolic framework, investment analysis, probabilistic programming, Bayesian inference, uncertainty quantification, large language models (LLMs), active learning, code synthesis, financial data analysis

Authors

Rafael Kaufmann, Felix Neubürger, Michael Walters, Thomas Kopinski, Dimitrije Marković

Abstract

This project introduces the CRISTAL Method (Coherent Reliable Intentional Synthesis of Truthful Analysis Logic), a neurosymbolic framework for automating complex analysis workflows, with fundamental investment analysis as a primary use case. This domain poses major challenges: high structural uncertainty, noisy and subjective data, tight attention budgets, and the need for justified, reproducible decisions. Human analysts often struggle in this domain due to cognitive biases and limitations, suggesting significant value in automation. But while LLM-based agents have been proposed as analytical aids, their limitations (poor numerical reasoning, unawareness of uncertainty, and lack of reproducibility) hinder their effectiveness in this context. CRISTAL addresses these gaps through a principled blend of statistical model synthesis, continuous learning, and active learning. Starting from a natural-language prior knowledge curriculum, CRISTAL builds a dynamic, interpretable probabilistic program that enables full Bayesian inference, including uncertainty quantification and budget-aware data acquisition. CRISTAL continually refines its world model during analysis, leveraging LLMs for code synthesis and learning. We validate CRISTAL on a novel benchmark of synthetic equities with rich financial and textual data. On a company classification task, CRISTAL achieves Bayes-optimal accuracy with just 5 examples and a 5-second budget, outperforming state-of-the-art LLMs that plateau around 40% accuracy even with an order of magnitude more input data and compute.
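
To make "Bayesian inference with uncertainty quantification and budget-aware data acquisition" concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a toy setup that does not appear in the abstract: a discrete set of company classes, binary signals with known per-class likelihoods, and a hypothetical cost (for example, seconds) for acquiring each signal. All function names, parameters, and numbers are invented for illustration.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def entropy(probs):
    """Shannon entropy in bits; used here as the uncertainty measure."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def signal_prob(likelihoods, c, s, value):
    """P(signal s == value | class c), with likelihoods[c][s] = P(s == 1 | c)."""
    p1 = likelihoods[c][s]
    return p1 if value == 1 else 1.0 - p1


def posterior(prior, likelihoods, observations):
    """Bayes' rule over classes given observed signals {index: 0 or 1}."""
    weights = []
    for c, p_c in enumerate(prior):
        w = p_c
        for s, value in observations.items():
            w *= signal_prob(likelihoods, c, s, value)
        weights.append(w)
    return normalize(weights)


def expected_information_gain(prior, likelihoods, observations, s):
    """Expected entropy reduction from observing signal s next."""
    current = posterior(prior, likelihoods, observations)
    h_now = entropy(current)
    gain = 0.0
    for value in (0, 1):
        # Predictive probability of this outcome under the current posterior.
        p_value = sum(
            current[c] * signal_prob(likelihoods, c, s, value)
            for c in range(len(prior))
        )
        if p_value == 0:
            continue
        updated = posterior(prior, likelihoods, {**observations, s: value})
        gain += p_value * (h_now - entropy(updated))
    return gain


def choose_next_signal(prior, likelihoods, observations, costs, budget):
    """Pick the affordable, unobserved signal with the best gain per unit cost.

    Returns None when nothing affordable is expected to reduce uncertainty.
    """
    best, best_ratio = None, 0.0
    for s, cost in enumerate(costs):
        if s in observations or cost > budget:
            continue
        ratio = expected_information_gain(prior, likelihoods, observations, s) / cost
        if ratio > best_ratio:
            best, best_ratio = s, ratio
    return best


# Hypothetical example: two classes; signal 0 is informative, signal 1 is not.
prior = [0.5, 0.5]
likelihoods = [[0.9, 0.5], [0.1, 0.5]]
costs = [1.0, 1.0]

next_signal = choose_next_signal(prior, likelihoods, {}, costs, budget=1.0)
updated_belief = posterior(prior, likelihoods, {0: 1})
```

In this toy setup the selection rule prefers the informative signal and declines to spend budget when no affordable signal is expected to reduce uncertainty. A full probabilistic program would also learn the likelihoods from data rather than assume them.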
