Don't Trust Us: A Privacy-by-Design Android Malware Detection Pipeline
2026-06-02 • Cryptography and Security
AI summary
The authors explain that detecting harmful Android apps usually requires collecting private user data, which puts privacy at risk. They propose a pipeline that avoids accessing sensitive information during detection: it first analyzes the app code without any user data (static analysis) and sends only the unclear cases to a secure, isolated environment (dynamic analysis). On their test set, the static stage reaches an F1 score of 0.87 and defers only 6.7% of apps to the deeper analysis. This shows it is possible to find malicious apps effectively without compromising user privacy.
Android malware detection, privacy-by-design, static analysis, dynamic analysis, sandboxing, SVM, Drebin dataset, vectorization, dual-reject threshold, F1 score
Authors
Emmanuele Massidda, Diego Soi, Giorgio Giacinto
Abstract
Android malware detection increasingly relies on collecting and processing sensitive user data, including device identifiers, network artifacts, and runtime traces, while privacy is too often treated as a secondary concern. Existing privacy-aware approaches typically enforce privacy after data collection, for example through anonymization, encryption, or federated learning, yet they still require access to user information and therefore demand a high level of user trust in systems that already operate with privileged access to device activity. We argue that this requirement should be removed rather than managed. Android malware detection should be privacy-aware by design, so that effective analysis does not depend on sensitive data being accessed in the first place. To this end, we formalize a set of design requirements for privacy-by-design detection and then implement each requirement in a comprehensive pipeline. First, static analysis extracts the relevant features from each APK following the Drebin representation; these features are vectorized and submitted to an SVM. The model is equipped with a dual-reject threshold rule that either commits to a confident decision or defers uncertain samples to a dynamic analysis stage within a sandboxed environment, so that genuine user information never enters the analysis loop. On a temporally split dataset spanning 2024 to 2025, the pipeline achieves an F1 score of 0.87 in the first, static analysis stage while deferring only 6.7% of test samples to the secondary dynamic analysis. Additionally, dynamic sandboxing helps recognize malicious applications with high confidence without extracting any sensitive data. These results demonstrate that strong detection performance is achievable without sacrificing user privacy.
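The dual-reject threshold rule described in the abstract can be illustrated with a minimal sketch. The abstract does not state the score scale, the threshold values, or how they are tuned, so everything named here (`dual_reject`, `deferral_rate`, `t_benign`, `t_malicious`, and the default values of -0.5 and 0.5) is a hypothetical choice for illustration, assuming the SVM outputs a signed decision score where larger values indicate malware.

```python
from typing import Iterable, List


def dual_reject(score: float, t_benign: float = -0.5, t_malicious: float = 0.5) -> str:
    """Map a signed SVM decision score to a three-way outcome.

    Assumption: scores at or above t_malicious are confidently malicious,
    scores at or below t_benign are confidently benign, and anything in
    between is deferred to the sandboxed dynamic analysis stage.
    """
    if t_benign > t_malicious:
        raise ValueError("t_benign must not exceed t_malicious")
    if score >= t_malicious:
        return "malicious"
    if score <= t_benign:
        return "benign"
    return "defer"


def deferral_rate(scores: Iterable[float], t_benign: float = -0.5,
                  t_malicious: float = 0.5) -> float:
    """Fraction of samples whose score falls inside the reject band."""
    decisions: List[str] = [dual_reject(s, t_benign, t_malicious) for s in scores]
    if not decisions:
        raise ValueError("scores must not be empty")
    return decisions.count("defer") / len(decisions)


if __name__ == "__main__":
    # Made-up scores, not results from the paper.
    example_scores = [1.2, -0.9, 0.1, -2.0]
    for s in example_scores:
        print(s, dual_reject(s))
    print("deferral rate:", deferral_rate(example_scores))
```

Widening the band between the two thresholds sends more samples to the sandbox and leaves the static stage with only its most confident decisions; narrowing it does the opposite. The 6.7% deferral figure reported in the abstract corresponds to one such operating point.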