KnowsTFM: Knowledge-Informed Fine-Tuning of Small Tabular Foundation Models

2026-06-29 | Machine Learning

Machine Learning, Artificial Intelligence
AI summary

The authors explore how to improve small deep learning models for tabular data in specialized fields where data is scarce and differs from the data the models were pretrained on. They use expert knowledge from sources such as knowledge graphs to guide model fine-tuning. Their method combines structural attention priors with parameter-efficient low-rank updates, which helps these small models perform better in niche areas but has little effect on general-domain tasks. They also find that continued fine-tuning of large frontier models can sometimes cause them to lose pretrained knowledge and mechanisms.

tabular data, foundation models, knowledge graphs, fine-tuning, structural attention, parameter-efficient updates, domain adaptation, pretraining, continual learning
Authors
Boshko Koloski, Xiangjian Jiang, Senja Pollak, Blaž Škrlj, Mateja Jamnik, Nikola Simidjievski
Abstract
Tabular foundation models have advanced deep learning for tabular data by delivering strong default performance across many small and medium tasks. Yet in niche domains, where data is scarce, high-dimensional, and shifted from the pretraining distribution, they may still fail to outperform carefully designed domain-specific methods. Many such domains also provide curated relational knowledge in the form of knowledge graphs and knowledge banks, but how to use this knowledge to improve and steer small specialist tabular foundation models remains unclear. We address this problem through Knowledge-informed fine-tuning of small Tabular Foundation Models (KnowsTFM). Specifically, we study nanoscale TabPFN- and TabICL-style variants, pretrained under controlled synthetic prior families and adapted using two complementary mechanisms: structural attention priors derived from knowledge graphs and parameter-efficient low-rank updates. We show that injecting domain-specific structural knowledge during fine-tuning yields meaningful gains over vanilla variants in specialist settings, whereas gains on general-domain tasks are marginal. We further observe that continual fine-tuning of frontier models can trigger collapse of pretrained knowledge and mechanisms.
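To make the two adaptation mechanisms named in the abstract more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that a structural attention prior can be expressed as an additive bias on feature-to-feature attention scores built from a knowledge-graph adjacency matrix, and that the low-rank update takes the standard LoRA form (a frozen weight plus a scaled product of two small matrices). All function names, shapes, and the `strength` and `alpha` parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def kg_attention_bias(adjacency, strength=1.0):
    """Additive attention bias (assumed form): +strength for feature pairs
    that are linked in the knowledge graph (or identical), 0 otherwise."""
    n = adjacency.shape[0]
    linked = (adjacency + adjacency.T + np.eye(n)) > 0
    return strength * linked.astype(float)


def lora_linear(x, W, A, B, alpha=1.0):
    """Linear map with a frozen weight W (d_out, d_in) and a low-rank
    update B @ A, with A of shape (rank, d_in) and B of shape (d_out, rank)."""
    rank = A.shape[0]
    return x @ (W + (alpha / rank) * (B @ A)).T


def biased_feature_attention(X, Wq, Wk, bias):
    """Attention weights over feature tokens X (n_features, d), with the
    graph-derived bias added to the scaled dot-product scores."""
    Q = X @ Wq.T
    K = X @ Wk.T
    scores = Q @ K.T / np.sqrt(Q.shape[-1]) + bias
    return softmax(scores, axis=-1)


# Toy setup: three feature tokens, with one graph edge between features 0 and 1.
rng = np.random.default_rng(0)
n_features, d, rank = 3, 4, 2
adjacency = np.zeros((n_features, n_features))
adjacency[0, 1] = 1.0
bias = kg_attention_bias(adjacency, strength=2.0)

X = rng.normal(size=(n_features, d))
W = rng.normal(size=(d, d))          # frozen pretrained weight
A = rng.normal(size=(rank, d)) * 0.01  # trainable low-rank factor
B = np.zeros((d, rank))              # zero init: update starts as a no-op

attn = biased_feature_attention(X, W, W, bias)
out = lora_linear(X, W, A, B)
```

In this sketch, initializing `B` to zeros means the adapted layer initially reproduces the frozen layer exactly, so fine-tuning starts from the pretrained behavior, while the bias nudges attention toward feature pairs that the graph marks as related. Whether KnowsTFM uses an additive bias, a mask, or another form of prior is not stated in the abstract.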