Learning Latent Dynamical Causal Processes for Single-Cell Perturbation Prediction

2026-05-25 · Machine Learning

AI summary

The authors address the problem of predicting how individual cells respond to interventions not seen during training. They note that existing methods capture only one side of the response: latent causal approaches recover mechanisms but tend to treat perturbation effects as static, while temporal models track expression changes over time without recovering the underlying causal mechanisms. To combine both, they propose a latent dynamical causal generative model that captures hidden cellular programs and how those programs evolve after a perturbation, and they give an identifiability analysis for the latent variables. The resulting learning framework, CITE-VAE, is validated on Causal-3DIdent and, on real CRISPR-based single-cell perturbation data, generalizes to unseen perturbations better than the baselines it is compared against.

single-cell perturbation, out-of-distribution generalization, latent variables, causal generative model, temporal dynamics, CRISPR, variational autoencoder, gene expression, cellular programs, identifiability
Authors
Wenkang Jiang, Yuhang Liu, Erdun Gao, Ehsan Abbasnejad, Lina Yao, Javen Qinfeng Shi
Abstract
Single-cell perturbation prediction aims to infer how cells respond to unseen interventions and to achieve out-of-distribution (OOD) generalization, providing a computational route to understanding how perturbations reshape cellular programs over time. Existing machine learning methods have made important progress, but typically capture only one side of the response. Latent causal approaches seek mechanisms that support generalization and interpretation, yet often treat perturbation effects as static outcomes. Temporal models describe how gene expression changes across time, but usually do not explicitly recover the latent causal generative mechanisms driving these changes. In practice, perturbation effects are both latent and dynamical: interventions act through unobserved cellular programs, whose states evolve over time and give rise to observed expression profiles. Motivated by this view, we propose a latent dynamical causal generative model for single-cell perturbation data that jointly captures latent cellular programs, perturbation-conditioned mechanisms, and temporal evolution. We further provide an identifiability analysis showing that, under suitable conditions, the latent causal variables are recoverable up to standard equivalence classes. Guided by this analysis, we develop CITE-VAE, a learning framework for recovering latent cellular programs and their perturbation-driven dynamics from single-cell sequencing data. Experiments on Causal-3DIdent validate the theoretical results and the effectiveness of the proposed method in controlled settings. Additional experiments on real-world CRISPR-based single-cell perturbation data show improved generalization to unseen perturbations compared with state-of-the-art baselines, highlighting the practical robustness of our approach.
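
To make the "latent and dynamical" view concrete, the following is a minimal toy simulation, not the CITE-VAE model itself. It assumes, purely for illustration, a linear time-lagged dependency among three latent programs, a perturbation that shifts the mechanism of one targeted program, and a tanh mixing from programs to genes. The function name `simulate_cell`, all weights, and the shift size are hypothetical choices that do not come from the paper.

```python
import math
import random


def simulate_cell(perturbed_program, n_steps=5, n_programs=3, n_genes=8, seed=0):
    """Toy latent dynamical process: z_{t-1} -> z_t, then x_t = g(z_t) + noise.

    perturbed_program: index of the latent program whose mechanism is shifted,
    or None for an unperturbed (control) cell. All constants are illustrative.
    """
    rng = random.Random(seed)

    # Assumed time-lagged causal weights: program k depends on its own
    # previous state and on the previous state of program k-1.
    lagged = [[0.0] * n_programs for _ in range(n_programs)]
    for k in range(n_programs):
        lagged[k][k] = 0.8
        if k > 0:
            lagged[k][k - 1] = 0.3

    # Assumed fixed mixing weights from latent programs to observed genes.
    mixing = [[rng.gauss(0.0, 1.0) for _ in range(n_programs)]
              for _ in range(n_genes)]

    z = [rng.gauss(0.0, 1.0) for _ in range(n_programs)]  # initial latent state
    latents, observations = [], []
    for _ in range(n_steps):
        z_next = []
        for k in range(n_programs):
            mean = sum(lagged[k][j] * z[j] for j in range(n_programs))
            if k == perturbed_program:
                mean += 2.0  # the intervention shifts this program's mechanism
            z_next.append(mean + rng.gauss(0.0, 0.1))
        z = z_next
        # Observed expression is a noisy nonlinear function of the latent state.
        x = [math.tanh(sum(mixing[g][k] * z[k] for k in range(n_programs)))
             + rng.gauss(0.0, 0.05) for g in range(n_genes)]
        latents.append(z)
        observations.append(x)
    return latents, observations


# Same seed, so the control and treated cells share all noise draws and
# differ only through the intervention on program 0.
control_latents, control_expr = simulate_cell(perturbed_program=None)
treated_latents, treated_expr = simulate_cell(perturbed_program=0)
```

Because both runs share the same noise, subtracting `control_latents` from `treated_latents` isolates the effect of the intervention in this toy setting: it appears in program 0 immediately and reaches downstream programs only at later time steps, which is the kind of time-dependent, program-mediated effect that a static model would collapse into a single outcome.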