AI summary
The authors focus on improving how we predict the different shapes a protein can take at the atomic level, which is important for understanding its function. They point out that ensemble prediction models are traditionally trained in two stages: cryo-EM density maps are first converted into atomic models, and those models are then used for training, yet building atomic models into the maps remains a challenge. Instead, they propose an approach called CryoSampler that fine-tunes a pre-trained structure prediction model (Boltz-2) directly on raw cryo-EM maps, skipping the conversion step. The method achieves higher model-building accuracy than prior work and shows preliminary evidence of sampling diverse conformations for unseen sequences within the same protein family without requiring cryo-EM data. Their work suggests a promising way to train ensemble prediction models directly on raw cryo-EM measurements.
protein conformational ensemble, atomic model building, cryo-electron microscopy (cryo-EM), heterogeneous reconstruction, Boltz-2, ensemble prediction models, fine-tuning, structural biology, protein structure prediction
Authors
Jay Shenoy, Miro Astore, Axel Levy, Frédéric Poitevin, Sonya M. Hanson, Gordon Wetzstein
Abstract
Knowledge of a protein's atomic conformational ensemble is critical to determining its function, yet state-of-the-art ensemble prediction models are limited by a lack of high-quality conformational data from simulation or experiment. Recent advances in heterogeneous reconstruction for cryo-electron microscopy (cryo-EM) have enabled scientists to visualize ensembles of density maps for larger proteins and complexes not typically accessible through simulation, but building atomic models into these maps remains a challenge. Traditionally, ensemble prediction models are trained via a two-stage process: experimental density maps are converted into atomic structural ensembles through model building, after which these structures are used to train sequence-to-atomic ensemble predictors. In this work, we propose a new principle for fine-tuning pre-trained static structure prediction models such as Boltz-2 directly on raw cryo-EM maps, bypassing the two-stage process. We apply this technique to the problem of atomic model building by fine-tuning Boltz-2 to generate atomic conformations from an input ensemble of cryo-EM maps, achieving superior model building accuracy compared to prior work. Beyond overfitting to individual map ensembles, our method, CryoSampler, also shows preliminary evidence of in-domain generalization after fine-tuning, sampling diverse atomic conformations for unseen sequences within the same protein family without requiring cryo-EM data. These capabilities indicate that CryoSampler holds the potential to train next-generation atomic ensemble prediction models directly on raw cryo-EM measurements.