Preserving Data Privacy in Learning Causal Structure with Fully Homomorphic Encryption
2026-06-03 • Cryptography and Security
Cryptography and Security, Machine Learning
AI summary
The authors address the problem of protecting data privacy when learning cause-and-effect relationships from distributed data. They use a technique called fully homomorphic encryption (FHE), which allows computations to be done on encrypted data without decrypting it, keeping data secure during processing. To overcome FHE's limitations with complex math operations and high computation costs, they develop new methods such as simplifying circuits, approximating difficult functions, and speeding up calculations with batching. Their experiments show their approach produces results similar to working on normal data, and it runs efficiently in reasonable time even with privacy protection.
Data privacy, Distributed causal structure learning, Fully homomorphic encryption (FHE), Circuit simplification, Newton-Raphson method, Taylor expansion, SIMD acceleration, Differential privacy, Data mining
Authors
Jian Yang, Yuan Tong, Qinbin Li, Zeyi Wen, Xiaofang Zhou
Abstract
Preserving data privacy is an important topic in structural data management and data mining. However, privacy leakage remains a persistent challenge in distributed causal structure learning, especially when data must be transmitted and computed on. In this paper, we propose a method based on fully homomorphic encryption (FHE) that performs calculations on ciphertexts, keeping data encrypted both in transit and during computation. Nevertheless, applying FHE to causal structure learning is challenging because of its high computation cost and limited support for division and logarithm operations. To tackle this challenge, we propose a series of novel techniques including (i) circuit simplification for better efficiency, (ii) approximation of division and logarithm through the Newton-Raphson reciprocal method and Taylor expansion, and (iii) a batching technique with SIMD acceleration to speed up the whole learning process. Additionally, our method can be easily extended beyond FHE, which we demonstrate by porting it to support differential privacy. Empirical results show that our method achieves high consistency with the plaintext version and learns comparable causal structures on the datasets tested. Finally, our method is efficient and practical: it completes causal structure learning in tens of minutes even under the privacy protection of FHE.
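
To illustrate technique (ii), the following is a minimal plaintext sketch of how division and logarithm can be approximated using only addition, subtraction, and multiplication, which are the operations FHE schemes support natively. It is not the authors' implementation: the function names `nr_reciprocal` and `taylor_log`, the initial guess, the expansion point (x = 1), and the iteration and term counts are all assumptions chosen for illustration, and ordinary Python floats stand in for ciphertexts.

```python
import math

def nr_reciprocal(x, y0, iters=6):
    """Approximate 1/x with the Newton-Raphson update y <- y * (2 - x*y).

    Uses only subtraction and multiplication. Converges when the
    initial guess satisfies 0 < y0 < 2/x (assumed known from a public
    bound on x; how the paper picks y0 is not stated in the abstract).
    """
    y = y0
    for _ in range(iters):
        y = y * (2.0 - x * y)
    return y

def taylor_log(x, terms=12):
    """Approximate ln(x) for x close to 1 with the Taylor series
    ln(1 + u) = u - u^2/2 + u^3/3 - ..., where u = x - 1.

    The coefficients (+/-)1/k are plaintext constants, so evaluating
    the series needs only additions and multiplications on x.
    """
    u = x - 1.0
    total = 0.0
    power = 1.0
    for k in range(1, terms + 1):
        power *= u                                  # u^k
        coeff = (1.0 if k % 2 == 1 else -1.0) / k   # precomputed constant
        total += coeff * power
    return total

if __name__ == "__main__":
    print(nr_reciprocal(4.0, y0=0.1), 1 / 4.0)
    print(taylor_log(1.2), math.log(1.2))
```

In an actual FHE setting, each multiplication consumes part of the ciphertext's multiplicative depth budget, so the iteration count and number of series terms trade accuracy against cost; the values above are illustrative only.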