The EVerest Dataset for Secure Software Engineering

2026-06-22 | Software Engineering

Software Engineering, Cryptography and Security
AI summary

The authors created the EVerest dataset to support research on software security by linking requirements, architecture, and code with fine-grained security labels. It is based on EVerest, an open-source software stack for electric vehicle charging stations, and contains manually annotated security information across these artifact types. While building the dataset, the authors found a real security weakness, reported it to the project maintainers, and saw it fixed. The dataset supports tasks such as classifying security requirements, recognizing security-relevant entities, and recovering trace links to the architecture. It is publicly available.

security requirements, software architecture, security verification, electric vehicle charging, CWE-1295, named entity recognition, traceability, software dataset, code-level security, design-time verification
Authors
Sophie Corallo, Debora Grupp, Dominik Fuchß, Jan Keim, Frederik Reiche, Tobias Hey, Anne Koziolek
Abstract
End-to-end security verification, from requirements through architecture to code, requires datasets that span all three artifact types with fine-grained security labels. No existing dataset provides this combination. We present the EVerest dataset, a multi-artifact resource based on EVerest, an industry-driven open-source software stack for electric vehicle charging stations. The dataset includes 84 manually elicited security requirements annotated with security objectives, 1,445 fine-grained security elements (components, entities, data, data flows, states, etc.), acceptance windows, coreferences, and architectural trace links, as well as the EVerest software architecture model, source code, and natural language documentation. It enables research on security requirements classification, named entity recognition, architectural trace linking, and design-time or code-level security verification. During dataset creation, a real security weakness (CWE-1295) was identified, disclosed to the project maintainers, and subsequently fixed. The dataset is publicly available. A short video is available at https://youtu.be/pnn1uqpomvQ.