SPAIML 2025

First International Workshop on Security and Privacy-Preserving AI/ML

October 26, 2025

Co-located with ECAI 2025 in Bologna, Italy

ABOUT

The workshop focuses on the transformative potential of AI/ML technologies in addressing key challenges in security and privacy across diverse domains. In an era of increasing digitalization and interconnectedness, organizations face evolving threats, from sophisticated cyberattacks to complex data privacy concerns. Traditional methods often struggle to adapt to the dynamic nature of these challenges, particularly in scenarios requiring real-time analysis, anomaly detection, and large-scale data management. AI/ML presents a paradigm shift by enabling intelligent, scalable, and proactive approaches to security and privacy. For instance, machine learning models can detect patterns in network traffic indicative of cyberattacks, while AI-driven solutions can enable privacy-preserving data processing through federated learning or differential privacy techniques. By focusing on how AI/ML can be harnessed to safeguard sensitive data and systems across various domains, this workshop aims to advance the state of the art in security and privacy.
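For illustration, one of the techniques mentioned above, differential privacy, can be as simple as adding calibrated noise to an aggregate query. The minimal Python sketch below shows the Laplace mechanism for a count query; the epsilon and sensitivity values are illustrative assumptions, not recommendations.

    # Minimal sketch of the Laplace mechanism from differential privacy:
    # release a count with calibrated noise. Epsilon and sensitivity values
    # are illustrative assumptions.
    import numpy as np

    def laplace_count(true_count: int, epsilon: float = 1.0, sensitivity: float = 1.0) -> float:
        """Release a count with epsilon-differential privacy."""
        noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
        return true_count + noise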

The accepted papers will be published in the Proceedings of the 1st International Workshop on Security and Privacy-Preserving AI/ML on CEUR.

CALL FOR PAPERS

Papers submitted to this workshop should highlight approaches, ideas, and concepts that apply AI/ML to security and privacy. The main outcome of the workshop is to share current progress and new ideas from research and industry, and to establish a community dedicated to highlighting existing AI/ML approaches and developing new ones that improve security and privacy.

The main topics of interests include (but are not limited to):

  • Intrusion Detection
  • Threat Analysis, Intelligence, and Visualization
  • Anomaly Detection and Behavior Analysis
  • Access Control and Authentication Mechanisms
  • Risk Assessment and Automated Incident Response
  • Vulnerability Management and Patch Prioritization
  • Regulatory and Ethical Considerations, e.g., AI Act
  • Compliance with Legislations, e.g., GDPR
  • Trustworthiness
  • Negative Results¹

We invite researchers, practitioners, and industry experts to submit original contributions addressing these topics or related areas. Join us in exploring innovative solutions and insights, and fostering discussions to shape the future of improving security and privacy with the help of AI/ML.

¹ Not all research leads to fruitful results: trying new ways or methods may surpass the state of the art, but sometimes the hypothesis cannot be confirmed or the improvement is insignificant. Failure to succeed, however, is not failure to progress, and we also want to provide a platform for sharing insights, experiences, and lessons learned from conducting research in the field of security and privacy. We originally started this movement with the PerFail Workshop.

Important Dates*

Paper Submission: June 20, 2025 (extended from June 13)
Author Notification: July 11, 2025
Camera-ready Submission: July 25, 2025
Workshop Date: October 26, 2025

* All dates are Anywhere on Earth (AoE).

TECHNICAL PROGRAM

14:00 - 14:10
Opening Remarks
14:10 - 15:30
Paper Presentations
Alignment and Adversarial Robustness: Are More Human-Like Models More Secure?
Authors: Blaine Hoak, Kunyang Li and Patrick McDaniel
A small but growing body of work has shown that machine learning models which better align with human vision have also exhibited higher robustness to adversarial examples, raising the question: can human-like perception make models more secure? If true generally, such mechanisms would offer new avenues toward robustness. In this work, we conduct a large-scale empirical analysis to systematically investigate the relationship between representational alignment and adversarial robustness. We evaluate 144 models spanning diverse architectures and training paradigms, measuring their neural and behavioral alignment and engineering task performance across 105 benchmarks as well as their adversarial robustness via AutoAttack. Our findings reveal that while average alignment and robustness exhibit a weak overall correlation, specific alignment benchmarks serve as strong predictors of adversarial robustness, particularly those that measure selectivity toward texture or shape. These results suggest that different forms of alignment play distinct roles in model robustness, motivating further investigation into how alignment-driven approaches can be leveraged to build more secure and perceptually-grounded vision models.
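For readers unfamiliar with the evaluation setup, the sketch below shows how adversarial robustness is commonly measured with the public AutoAttack implementation; the model, data batch, and epsilon are placeholders, and the authors' exact protocol may differ.

    # Hedged sketch: measuring L-inf adversarial robustness of a pretrained
    # PyTorch classifier with the public AutoAttack package
    # (https://github.com/fra31/auto-attack). Model, batch, and eps are
    # placeholders, not the paper's exact setup.
    import torch
    from autoattack import AutoAttack

    def robust_accuracy(model, x_test, y_test, eps=8 / 255):
        """Return clean and AutoAttack (L-inf) accuracy for one test batch."""
        model.eval()
        adversary = AutoAttack(model, norm='Linf', eps=eps, version='standard')
        x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=128)
        with torch.no_grad():
            clean = (model(x_test).argmax(1) == y_test).float().mean().item()
            robust = (model(x_adv).argmax(1) == y_test).float().mean().item()
        return clean, robust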
Collaborative Reinforcement Learning for Cyber Defense: Analysis of Strategies, and Policies
Authors: Davide Rigoni, Rafael F. Cunha, Frank Fransen, Puck de Haan, Amir Javadpour and Fatih Turkmen
As cybersecurity threats grow in scale and sophistication, traditional defenses increasingly struggle to detect and counter such attacks. Recent advancements leverage reinforcement learning to develop adaptive defensive agents, yet challenges remain, particularly around how agents learn, in what environments, and what strategies they acquire. These complexities intensify in multi-agent scenarios, where coordinating collaborative defenses becomes especially demanding. This paper provides an empirical analysis of (collaborative) reinforcement learning for cybersecurity defense, examining key components such as environment models, RL methods, and agent policies. More specifically, a range of multi-agent reinforcement learning algorithms are compared in the context of CAGE Challenge 4 when determining optimal defense configurations. Moreover, the study evaluates the policies learned by the agents to assess their applicability to real-world scenarios, highlighting areas where agent strategies may diverge from effective, practical defense.
Experimental Evaluation of Non-Natural Language Prompt Injection Attacks on LLMs
Authors: Huynh Phuong Thanh Nguyen, Shivang Kumar, Katsutoshi Yotsuyanagi and Razvan Beuran
Prompt injection attacks insert malicious instructions into large language model (LLM) input prompts to bypass their safety measures and produce harmful output. While various defense techniques, such as data filtering and prompt injection detection, have been proposed to protect LLMs, they primarily address natural language attacks. When faced with unusual, unstructured, or non-natural language (Non-NL) prompt injection, these defenses become ineffective, leaving LLMs vulnerable. In this paper, we present a methodology for evaluating LLMs' ability to handle Non-NL prompt injections, and also propose defense strategies against these attacks. To demonstrate the usability of our methodology, we tested 14 common LLMs to evaluate their existing safety capabilities. Our results showed a high attack success rate across all LLMs when faced with Non-NL prompt injection, ranging from 0.38 to 0.52, which emphasizes the need for stronger defense measures.
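As a point of reference for the reported numbers, the attack success rate in such evaluations is simply the fraction of injection payloads that elicit harmful output. The sketch below assumes hypothetical helpers query_llm (a model API call) and is_harmful (an output judgement); it is not the authors' evaluation code.

    # Hedged sketch: attack success rate over a set of prompt-injection
    # payloads. `query_llm` and `is_harmful` are hypothetical stand-ins for a
    # model API call and an output judgement.
    from typing import Callable, Iterable

    def attack_success_rate(prompts: Iterable[str],
                            query_llm: Callable[[str], str],
                            is_harmful: Callable[[str], bool]) -> float:
        """Fraction of injection prompts that elicit a harmful response."""
        results = [is_harmful(query_llm(p)) for p in prompts]
        return sum(results) / len(results) if results else 0.0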
Preliminary Investigation into Uncertainty-Aware Attack Stage Classification
Authors: Alessandro Gaudenzi, Lorenzo Nodari, Lance Kaplan, Alessandra Russo, Murat Sensoy and Federico Cerutti
Advanced Persistent Threats (APTs) represent a significant challenge in cybersecurity due to their prolonged, multi-stage nature and the sophistication of their operators. Traditional detection systems typically focus on identifying malicious activity in binary terms - benign or malicious - without accounting for the progression of an attack. However, effective response strategies depend on accurate inference of the attack’s current stage, as countermeasures must be tailored to whether an adversary is in the early reconnaissance phase or actively conducting exploitation or exfiltration. This work addresses the problem of attack stage inference under uncertainty, with a focus on robustness to out-of-distribution (OOD) inputs. We propose a classification approach based on Evidential Deep Learning (EDL), which models predictive uncertainty by outputting parameters of a Dirichlet distribution over possible stages. This allows the system not only to predict the most likely stage of an attack but also to indicate when it is uncertain or the input lies outside the training distribution. Preliminary experiments in a simulated environment demonstrate that the proposed model can accurately infer the stage of an attack with calibrated confidence while effectively detecting OOD inputs, which may indicate changes in the attackers' tactics. These results support the feasibility of deploying uncertainty-aware models for staged threat detection in dynamic and adversarial environments.
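For context, the evidential output described above is typically realized by predicting non-negative evidence that parameterizes a Dirichlet distribution; the expected probabilities give the stage prediction, and the remaining mass signals uncertainty. The PyTorch sketch below follows this common EDL formulation; the layer sizes and number of stages are illustrative assumptions, not the authors' architecture.

    # Minimal sketch of an evidential classification head (common EDL
    # formulation: evidence -> Dirichlet parameters). Sizes are illustrative.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class EvidentialStageClassifier(nn.Module):
        def __init__(self, in_features: int, num_stages: int):
            super().__init__()
            self.num_stages = num_stages
            self.head = nn.Linear(in_features, num_stages)

        def forward(self, x):
            evidence = F.softplus(self.head(x))        # non-negative evidence
            alpha = evidence + 1.0                     # Dirichlet parameters
            strength = alpha.sum(dim=-1, keepdim=True)
            probs = alpha / strength                   # expected stage probabilities
            uncertainty = self.num_stages / strength   # grows for unfamiliar (OOD-like) inputs
            return probs, uncertainty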
15:30 - 16:00
Coffee Break
16:00 - 16:40
Keynote: TBD

16:40 - 17:40
Paper Presentations
Language Models in Cybersecurity: A Comparative Approach to Task-Driven Model Assessment
Authors: Holger Schmidt, Klaus Kaiser and Daniel Spiekermann
Large language models (LLMs) have demonstrated impressive general-purpose capabilities across a wide range of computational tasks. However, their substantial resource demands and integration constraints raise critical concerns for deployment in security-sensitive scenarios. In response, small language models (SLMs) and tiny language models (TLMs) have gained attention as lightweight, adaptable alternatives - especially when operational context and task requirements are well understood. This paper provides a task-driven approach to language model (LM) assessment, emphasizing that the largest LM is not always the optimal choice. We conduct a systematic analysis of representative tasks from cybersecurity, i.e., especially from the fields of secure software development and digital forensics, and extract key technical and operational characteristics. By mapping these characteristic profiles to properties of different LM classes, we identify practical scenarios where SLMs or TLMs are not only sufficient but preferable.
Seed Scheduling in Fuzz Testing as a Markov Decision Process
Authors: Rafael Fernandes Cunha, Luca Müller, Thomas Rooijakkers, Puck de Haan and Fatih Turkmen
Coverage-guided Greybox Fuzzing (CGF) is an effective method for discovering software vulnerabilities. Traditional fuzzers, such as AFL, rely on heuristics for critical tasks like seed scheduling, which often lack adaptability and may not optimally balance exploration with exploitation. This paper presents a novel approach to enhance seed scheduling in CGF by formalizing it as a Markov Decision Process (MDP). We detail the design of this MDP, including the state representation derived from fuzzer and coverage data, the action space encompassing seed selection and power assignment, and a reward function geared towards maximizing coverage and bug discovery. A Proximal Policy Optimization (PPO) agent is then trained to learn a scheduling policy from this MDP within the AFL++ fuzzer. Our investigation into this Deep Reinforcement Learning (DRL) based approach reveals that while the MDP formulation provides a structured framework, practical application faces significant challenges, including high computational demands for training and intensive hyperparameter tuning. The key contributions of this work are: (1) a concrete MDP formulation for the complex task of fuzzer seed scheduling, (2) an analysis of the inherent difficulties and trade-offs in applying DRL to this specific domain, and (3) insights gained from the agent's learning process (or lack thereof), which inform the discussion on the suitability of DRL for this type of optimization problem in fuzzing. This research provides a foundational exploration of DRL for seed scheduling and highlights critical considerations for future advancements in intelligent fuzzing agents.
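To make the formulation above concrete, the sketch below illustrates the kind of state, action decoding, and coverage-driven reward such an MDP uses; the fields and weights are illustrative assumptions, not the authors' implementation.

    # Hedged sketch of MDP ingredients for fuzzer seed scheduling: a state from
    # fuzzer/coverage statistics, a flat action decoded into (seed, power), and
    # a reward for new coverage and crashes. Fields and weights are illustrative.
    from dataclasses import dataclass

    @dataclass
    class FuzzerState:
        edges_covered: int
        queue_size: int
        execs_per_sec: float
        crashes: int

    def decode_action(action_id: int, num_power_levels: int):
        """Map a flat action index to (seed index, power level)."""
        return action_id // num_power_levels, action_id % num_power_levels

    def reward(prev: FuzzerState, curr: FuzzerState,
               coverage_weight: float = 1.0, crash_bonus: float = 10.0) -> float:
        """Reward new edge coverage and newly found crashes."""
        return (coverage_weight * (curr.edges_covered - prev.edges_covered)
                + crash_bonus * (curr.crashes - prev.crashes))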
TAIBOM: Bringing Trustworthiness to AI-Enabled Systems
Authors: Vadim Safronov, Anthony McCaigue, Nick Allott and Andrew Martin
The growing integration of open-source software and AI-driven technologies has introduced new layers of complexity into the software supply chain, challenging existing methods for dependency management and system assurance. While Software Bills of Materials (SBOMs) have become critical for enhancing transparency and traceability, current frameworks fall short in capturing the unique characteristics of AI systems - namely, their dynamic, data-driven nature and the loosely coupled dependencies across datasets, models, and software components. These challenges are compounded by fragmented governance structures and the lack of robust tools for ensuring integrity, trust, and compliance in AI-enabled environments. In this paper, we introduce Trusted AI Bill of Materials (TAIBOM) - a novel framework extending SBOM principles to the AI domain. TAIBOM provides (i) a structured dependency model tailored for AI components, (ii) mechanisms for propagating integrity statements across heterogeneous AI pipelines, and (iii) a trust attestation process for verifying component provenance. We demonstrate how TAIBOM supports assurance, security, and compliance across AI workflows, highlighting its advantages over existing standards such as SPDX and CycloneDX. This work lays the foundation for trustworthy and verifiable AI systems through structured software transparency.
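As a rough illustration of the structured dependency model described above, the sketch below records an AI component together with an integrity digest and its dependencies; the field names are assumptions chosen for illustration, not the TAIBOM schema.

    # Illustrative sketch only: a toy dependency record with an integrity
    # digest, in the spirit of a structured AI bill of materials. Field names
    # are assumptions, not the TAIBOM schema.
    import hashlib
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class AIComponent:
        name: str
        kind: str                    # e.g. "dataset", "model", "software"
        version: str
        sha256: str                  # integrity statement for the artifact
        depends_on: List[str] = field(default_factory=list)

    def sha256_digest(path: str) -> str:
        """SHA-256 of an artifact file, usable as a simple integrity statement."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(8192), b""):
                h.update(chunk)
        return h.hexdigest()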
17:40 - 17:50
Closing Remarks & Best Paper Award

REGISTRATION

At least one author of each accepted paper must register for the pre-conference program (“Only weekend” option). Accepted papers without a registered author will be considered no-shows and will consequently be excluded from the proceedings. Registration must be completed via the main ECAI website.

Registration link: here

COMMITTEES

Organizing Committee

Jens Leicht
University of Duisburg-Essen

Malte Josten
University of Duisburg-Essen

Technical Program Committee

Holger Schmidt
Dortmund University of Applied Sciences and Arts

Kimberly Cornell
University at Albany

Lorenz Schwittmann
Independent Researcher

Maritta Heisel
University of Duisburg-Essen

Meiko Jensen
Karlstad University

Nicolas Diaz-Ferreyra
Hamburg University of Technology

Oliver Hahm
Frankfurt University of Applied Sciences

Razvan Beuran
Japan Advanced Institute of Science and Technology

Simone Fischer-Hübner
Karlstad University

Steffen Bondorf
Ruhr University Bochum

Stephan Sigg
Aalto University

Torben Weis
University of Duisburg-Essen

Vadim Safronov
University of Oxford

Zoltán Mann
University of Halle