Chimera: Protocol-Aware Recovery for Confidential BFT Consensus
2026-06-08 • Distributed, Parallel, and Cluster Computing
AI summary
The authors study rollback attacks on Trusted Execution Environments (TEEs), in which a compromised host forces a recovering enclave back to a stale persistent state, threatening the safety and availability of consensus systems. They group existing defenses into four categories and find that each either adds overhead to normal consensus operation or prolongs recovery. To address this, they design CHIMERA, which separates persistent state into metadata and logs and applies a distinct recovery mechanism to each. They model CHIMERA in Maude and verify its safety and liveness, implement it on Braft and ZooKeeper using Intel TDX, and report higher throughput, lower recovery latency, and better availability than state-of-the-art rollback-resilient baselines.
Trusted Execution Environment, Byzantine Fault Tolerance, rollback attack, state continuity, consensus protocols, recovery latency, Intel TDX, Braft, ZooKeeper, Maude modeling
Authors
Tong Liu, Xiaoqing Wen, Ziwei Zhou, Si Liu, Jianyu Niu, Cong Wang, Yinqian Zhang
Abstract
Trusted Execution Environments (TEEs) have enabled Byzantine Fault-Tolerant (BFT) consensus systems that offer confidentiality and improved scalability. However, TEEs do not provide state continuity: during recovery, a compromised host can roll back a crashed enclave to a stale persistent state, threatening both safety and availability. Existing defenses face a fundamental tradeoff: they either impose substantial overhead on critical consensus paths, reducing throughput and increasing latency, or incur prolonged recovery delays, hurting availability. We present the first systematic taxonomy of rollback-resilient recovery for confidential BFT consensus, distilling prior approaches into four categories and exposing their inherent limitations. Guided by this analysis, we design CHIMERA, a protocol-aware recovery framework that breaks this tradeoff. Our key insight is that rollback protection in consensus systems should not be uniform: different types of persistent state differ fundamentally in their distribution, update behavior, and representation form. CHIMERA separates persistent state into metadata and logs according to these protocol-level properties and applies a distinct recovery mechanism to each type. We formally model CHIMERA in Maude and verify its safety and liveness properties. We implement it on Braft and ZooKeeper using Intel TDX, and evaluate it in both LAN and WAN settings. Results show that CHIMERA achieves higher throughput, lower recovery latency, and better availability than state-of-the-art rollback-resilient baselines.
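To make the rollback problem concrete, the following is a minimal Python sketch of a generic freshness check: a recovering node rejects persistent state that is older than a trusted monotonic counter. This is an illustration of the threat described in the abstract, not CHIMERA's mechanism. Every name here (`Snapshot`, `TrustedCounter`, `persist`, `recover`, the `term` and `voted_for` fields) is hypothetical, and the counter is an in-memory stand-in for whatever rollback-proof storage a real system would assume.

```python
from dataclasses import dataclass


class RollbackDetected(Exception):
    """Raised when the host supplies state older than the trusted counter."""


@dataclass(frozen=True)
class Snapshot:
    version: int    # incremented on every persisted update
    term: int       # example consensus metadata (hypothetical field)
    voted_for: str  # example consensus metadata (hypothetical field)


class TrustedCounter:
    """In-memory stand-in for a rollback-proof monotonic counter (assumption)."""

    def __init__(self) -> None:
        self._value = 0

    def advance(self, to: int) -> None:
        # The counter may only move forward.
        if to < self._value:
            raise ValueError("counter may only move forward")
        self._value = to

    def read(self) -> int:
        return self._value


def persist(disk: list, counter: TrustedCounter, snap: Snapshot) -> None:
    """Write a snapshot to untrusted storage, then record its version."""
    disk.append(snap)
    counter.advance(snap.version)


def recover(offered: Snapshot, counter: TrustedCounter) -> Snapshot:
    """Accept the host-supplied snapshot only if it is not stale."""
    if offered.version < counter.read():
        raise RollbackDetected(
            f"offered version {offered.version} < trusted {counter.read()}"
        )
    return offered


# The untrusted host controls `disk` and can offer any old entry on recovery.
disk = []
counter = TrustedCounter()
persist(disk, counter, Snapshot(1, term=3, voted_for="n1"))
persist(disk, counter, Snapshot(2, term=4, voted_for="n2"))
```

Calling `recover(disk[-1], counter)` returns the latest snapshot, while `recover(disk[0], counter)` raises `RollbackDetected`. Note that this naive scheme touches the trusted counter on every persisted write, which may be one example of the critical-path overhead the abstract attributes to uniform rollback protection.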