Hardening Tree Ensembles: Real-Time and Effective Evasion Defences Beyond Adversarial Re-Training
Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0001-6405-4794
2025 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Tree ensembles like random forests and gradient boosting machines are widely used machine learning (ML) models, often outperforming advanced techniques like deep neural networks on structured tabular data tasks. These models also have interpretable (human-understandable) structures that enable stakeholders to trace the decision-making process, making them particularly suitable for use in safety- and security-critical applications where trust in the model’s behaviour is paramount. Despite these advantages, recent work has shown that they are highly vulnerable to adversarial examples: carefully perturbed inputs that elicit misclassifications.

These vulnerabilities are especially concerning as ML continues to permeate domains that are critical to societal functioning. Their seriousness is underscored by legislation such as the recently passed European Union Artificial Intelligence (AI) Act. This act mandates resilience against AI-specific vulnerabilities like evasion attacks caused by adversarial examples targeting ML models at inference time. Measures intended to improve resilience against such evasions, often referred to as hardening, generally involve two strategies: proactive defences, which aim to make models robust (e.g., adversarial re-training), and reactive defences, which focus on detecting and mitigating evasions at inference time. This thesis examines both strategies; it shows that proactive methods like model re-training are ineffective for tree ensembles and consequently advances the state-of-the-art in reactive defences.

In the context of re-training, doubling the training set through targeted data augmentation steps left accuracy largely unchanged. However, robustness, when quantified using formal verification techniques, dropped by 28–82% across two case studies. This indicates that model re-training alone is ineffective for tree ensembles. To address this, we leveraged formal methods to develop Iceman, a prototype system that uses counterexample regions which violate the robustness property to detect evasion attempts. Iceman can detect evasion attacks regardless of the attack generation process without modifying the underlying tree ensemble. It outperforms the current state-of-the-art methods in evasion detection, OC-Score and GROOT. Across four case studies, it improves Matthews Correlation Coefficient scores by 0.20–0.91 and achieves detection speeds 5–115x faster than OC-Score. In addition, it provides alert filtering and prioritisation capabilities with over 98% accuracy to address alert fatigue in intrusion detection systems. However, Iceman’s applicability is limited to scenarios with fixed attacker perturbation budgets, characterised by pre-defined constraints on the input manipulations that an attacker can apply.

To expand this applicability to unconstrained attacker perturbation budgets, we developed an additional system, called Maverick, designed to complement Iceman for a better defensive strategy. Just like Iceman, Maverick does not modify the underlying tree ensemble and can detect evasion attacks regardless of the attack generation process. We prove that Maverick’s core detection mechanism is mathematically equivalent to OC-Score, and present enhancements that achieve 85–563x speedups over OC-Score while maintaining identical detection performance and supporting evasion attack diagnostics with over 93% accuracy.
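
The Matthews Correlation Coefficient (MCC) figures quoted in the abstract lie on a scale from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect detection). A minimal sketch of the standard definition follows; the function name `mcc` and the zero-denominator convention are assumptions made for illustration and are not taken from the thesis.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any marginal count is zero.
    if denominator == 0:
        return 0.0
    return numerator / denominator


# A detector that flags every evasion and raises no false alarms.
perfect = mcc(tp=50, tn=50, fp=0, fn=0)
# A detector whose alerts are unrelated to the ground truth.
chance = mcc(tp=25, tn=25, fp=25, fn=25)
```

Because MCC uses all four confusion-matrix cells, it stays informative on the imbalanced data that is typical of intrusion detection.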

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2025, p. 31
Series
Linköping Studies in Science and Technology. Licentiate Thesis, ISSN 0280-7971 ; 2023
National Category
Computer Sciences; Artificial Intelligence
Identifiers
URN: urn:nbn:se:liu:diva-219415
DOI: 10.3384/9789181183269
ISBN: 9789181183252 (print)
ISBN: 9789181183269 (electronic)
OAI: oai:DiVA.org:liu-219415
DiVA, id: diva2:2013682
Presentation
2025-12-16, Ada Lovelace, B-building, Campus Valla, Linköping, 13:15 (English)
Note

Funding Agencies: This work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.

Available from: 2025-11-13 Created: 2025-11-13 Last updated: 2025-11-13 Bibliographically approved
List of papers
1. Formal Verification of Tree Ensembles against Real-World Composite Geometric Perturbations
2023 (English). In: Proceedings of the Workshop on Artificial Intelligence Safety 2023 (SafeAI 2023) co-located with the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI 2023) / [ed] Pedroza G., Huang X., Chen X.C., Theodorou A., Hernandez-Orallo J., Castillo-Effen M., Mallah R., McDermid J., CEUR-WS, 2023, Vol. 3381, article id 38. Conference paper, Published paper (Refereed)
Abstract [en]

Since machine learning components are now being considered for integration in safety-critical systems, safety stakeholders should be able to provide convincing arguments that the systems are safe for use in realistic deployment settings. In the case of vision-based systems, the use of tree ensembles calls for formal stability verification against a host of composite geometric perturbations that the system may encounter. Such perturbations are a combination of an affine transformation like rotation, scaling, or translation and a pixel-wise transformation like changes in lighting. However, existing verification approaches mostly target small norm-based perturbations, and do not account for composite geometric perturbations. In this work, we present a novel method to precisely define the desired stability regions for these types of perturbations. We propose a feature space modelling process that generates abstract intervals which can be passed to VoTE, an efficient formal verification engine that is specialised for tree ensembles. Our method is implemented as an extension to VoTE by defining a new property checker. The applicability of the method is demonstrated by verifying classifier stability and computing metrics associated with stability and correctness, i.e., robustness, fragility, vulnerability, and breakage, in two case studies. In both case studies, targeted data augmentation pre-processing steps were applied for robust model training. Our results show that even models trained with augmented data are unable to handle these types of perturbations, thereby emphasising the need for certified robust training for tree ensembles.
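
The idea of turning a perturbation into per-feature intervals and checking stability over them can be illustrated with a toy example. The sketch below is not VoTE and does not reproduce the paper's feature space modelling: the `Leaf`/`Node` types, the brightness-only perturbation, and the sign-based decision rule are all assumptions chosen for brevity. The analysis is a sound over-approximation, since it treats pixels and trees independently, so a `False` result means "not proven stable" rather than "unstable".

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple, Union

Interval = Tuple[float, float]


@dataclass
class Leaf:
    value: float  # score contribution; a positive total favours class 1


@dataclass
class Node:
    feature: int
    threshold: float
    left: "Tree"   # taken when x[feature] <= threshold
    right: "Tree"  # taken when x[feature] > threshold


Tree = Union[Leaf, Node]


def brightness_box(pixels: Sequence[float], eps: float) -> List[Interval]:
    """Per-pixel interval under a brightness shift in [-eps, eps], clipped to [0, 1]."""
    return [(max(0.0, p - eps), min(1.0, p + eps)) for p in pixels]


def reachable_values(tree: Tree, box: List[Interval]) -> List[float]:
    """Leaf values reachable by at least one point of the box (over-approximate)."""
    if isinstance(tree, Leaf):
        return [tree.value]
    lo, hi = box[tree.feature]
    values: List[float] = []
    if lo <= tree.threshold:
        values += reachable_values(tree.left, box)
    if hi > tree.threshold:
        values += reachable_values(tree.right, box)
    return values


def score_bounds(ensemble: Sequence[Tree], box: List[Interval]) -> Interval:
    """Lower and upper bound on the summed ensemble score over the box."""
    low = high = 0.0
    for tree in ensemble:
        values = reachable_values(tree, box)
        low += min(values)
        high += max(values)
    return low, high


def is_stable(ensemble: Sequence[Tree], box: List[Interval]) -> bool:
    """True if the sign of the score, and hence the class, cannot change in the box."""
    low, high = score_bounds(ensemble, box)
    return low > 0.0 or high < 0.0


# Hypothetical two-tree ensemble over a two-pixel "image".
ensemble = [
    Node(0, 0.5, Leaf(-1.0), Leaf(1.0)),
    Node(1, 0.3, Leaf(-0.5), Leaf(0.5)),
]
small = is_stable(ensemble, brightness_box([0.8, 0.6], eps=0.1))
large = is_stable(ensemble, brightness_box([0.8, 0.6], eps=0.4))
```

With the small brightness budget every tree reaches a single leaf, so the prediction is provably stable; with the larger budget both branches of each tree become reachable and stability can no longer be shown.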

Place, publisher, year, edition, pages
CEUR-WS, 2023
Series
CEUR Workshop Proceedings, ISSN 1613-0073 ; 3381
Keywords
Machine Learning, Formal Verification, Tree Ensembles, Composite Perturbations, Geometric Perturbations, Random Forests, Gradient Boosting Machines, Semantic Perturbations, Stability, Robustness, Trustworthy AI, Trustworthy Computing
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-195996 (URN)
2-s2.0-85159287306 (Scopus ID)
Conference
The AAAI-23 Workshop on Artificial Intelligence Safety (SafeAI 2023), Washington DC, USA, February 13-14, 2023
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2023-06-30 Created: 2023-06-30 Last updated: 2025-11-13 Bibliographically approved
2. Fast Evasion Detection & Alert Management in Tree-Ensemble-Based Intrusion Detection Systems
2024 (English). In: 2024 IEEE 36th International Conference on Tools with Artificial Intelligence (ICTAI), Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 404-412. Conference paper, Published paper (Refereed)
Abstract [en]

Intrusion Detection Systems (IDSs) can help bolster cyber resilience in high-risk systems by promptly detecting anomalies and thwarting security threats which could have catastrophic consequences. While Machine Learning (ML) techniques like Tree Ensembles are well suited for tasks like detecting anomalies, the widespread adoption of these techniques in IDSs faces barriers due to the threat of evasion attacks. Moreover, ML-based IDSs are susceptible to producing a high rate of false positive alerts during detection, causing alert fatigue. To alleviate these problems, we present a method that uses counterexample regions to detect evasion attacks in tree-ensemble-based IDSs. We generate these counterexample regions by defining a modified mapping checker in VoTE, a fast & scalable formal verification tool specialized for tree ensembles. Our method also provides quaternary annotations, empowering security managers with nuanced insights to better handle alerts in the triage queue. Our approach does not require training a separate model and displays good detection performance (≥98%) in both adversarial & non-adversarial scenarios in four real-world case studies when compared to several approaches in the literature. Iceman, the prototype system that implements our method, has a very low prediction latency, making it 5-115x faster than the current state-of-the-art in evasion detection for tree ensembles. Finally, empirical evaluations show that Iceman can correctly re-annotate the samples in the presence of evasion attacks for alert management purposes with an accuracy of more than 98%.
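
Once counterexample regions are available, the inference-time check can be as simple as a point-in-region lookup, which helps explain the low latency reported above. The sketch below assumes the regions have already been computed offline and are stored as axis-aligned boxes with closed bounds; the names `in_box` and `matching_region` and the box representation are illustrative assumptions, and Iceman's actual data structures and quaternary annotations are not reproduced here.

```python
from typing import List, Optional, Sequence, Tuple

# One (low, high) pair per feature; bounds are treated as closed (an assumption).
Box = List[Tuple[float, float]]


def in_box(x: Sequence[float], box: Box) -> bool:
    """True if every feature of x lies within the corresponding interval."""
    return all(low <= value <= high for value, (low, high) in zip(x, box))


def matching_region(x: Sequence[float], regions: Sequence[Box]) -> Optional[int]:
    """Index of the first counterexample region containing x, or None."""
    for index, box in enumerate(regions):
        if in_box(x, box):
            return index
    return None


# Hypothetical regions over two features where the robustness property is violated.
regions = [
    [(0.40, 0.60), (0.00, 0.10)],
    [(0.90, 1.00), (0.50, 0.70)],
]
suspicious = matching_region([0.95, 0.60], regions)
benign = matching_region([0.10, 0.90], regions)
```

A linear scan is enough for a handful of regions; a spatial index would be a natural replacement when the verifier returns many of them.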

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Series
Proceedings-International Conference on Tools With Artificial Intelligence, ISSN 1082-3409, E-ISSN 2375-0197
Keywords
Evasion Attacks; Adversarial Defences; Intrusion Detection Systems; Tree Ensembles; Formal Methods
National Category
Computer Sciences; Computer Systems
Identifiers
urn:nbn:se:liu:diva-211768 (URN)
10.1109/ICTAI62512.2024.00065 (DOI)
001447778900056 ()
2-s2.0-85217421895 (Scopus ID)
9798331527242 (ISBN)
9798331527235 (ISBN)
Conference
2024 IEEE 36th International Conference on Tools with Artificial Intelligence (ICTAI), Herndon, VA, OCT 28-30, 2024
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

Funding Agencies: Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation

Available from: 2025-02-20 Created: 2025-02-20 Last updated: 2025-11-13
3. Real-Time Evasion Detection in Tree Ensemble Automotive Intrusion Detection Systems
2025 (English). In: 16th IEEE Vehicular Networking Conference (VNC), IEEE, 2025. Conference paper, Published paper (Refereed)
Abstract [en]

Safety-critical functions in modern vehicles rely on electronic control units that communicate using the controller area network (CAN) protocol, which lacks vital security features. In this context, machine learning (ML) based intrusion detection systems (IDSs) were proposed as a solution to improve cyber resilience through real-time attack detection. However, these ML-IDSs must also withstand evasion attacks that could compromise vehicular safety. To this end, this paper addresses such attacks in misuse-based tree ensemble IDSs and proposes a method that detects evasion attempts. It uses the ordered set of reached leaf nodes activated by correctly classified training samples as a normality baseline. An autoencoder-based detector then identifies deviations as likely evasion attempts. Our approach does not modify the protected tree ensemble IDS, assumes no knowledge of the process for generating adversarial examples (ensuring generalisability), and works with any additive tree ensemble. We also prove that it is mathematically equivalent to the state-of-the-art, which we advance in terms of detection speed by replacing its Hamming distance-based deviation search with an autoencoder-based model of typical predictive behavior trained using our custom loss function. This enhancement results in a detection process that is orders of magnitude faster. Additionally, our method offers nuanced insights regarding the pre-evasion attack signature prior to the adversarial perturbation, thereby enriching the security analysis of the features targeted during evasion attempts. The prototype system we present, called Maverick, has a very low prediction latency, making it 85-563x faster than the current state-of-the-art while maintaining identical detection accuracy. Finally, Maverick predicts the pre-evasion attack signatures of the evasion samples with an accuracy of more than 93% and has an average prediction time well below the message transmission rate for CAN 2.0 and CAN FD, thereby satisfying the criteria for an evasion-hardened & real-time automotive IDS.
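
The normality baseline described above, together with the Hamming distance-based deviation search that Maverick replaces, can be sketched in a few lines. This illustrates only the slower search, not Maverick's autoencoder or its custom loss function. Trees are modelled as plain functions returning a leaf id, and the majority-vote `predict` rule is a hypothetical stand-in for a real ensemble; a practical detector would also compare the resulting distance against a tuned threshold.

```python
from typing import Callable, List, Sequence, Tuple

Signature = Tuple[int, ...]
# Assumption: each tree is a function mapping an input to the id of the leaf it reaches.
LeafFn = Callable[[Sequence[float]], int]


def leaf_signature(trees: Sequence[LeafFn], x: Sequence[float]) -> Signature:
    """Ordered tuple of the leaf reached in every tree of the ensemble."""
    return tuple(tree(x) for tree in trees)


def hamming(a: Signature, b: Signature) -> int:
    """Number of trees in which two signatures reach different leaves."""
    return sum(1 for u, v in zip(a, b) if u != v)


def build_baseline(trees, samples, labels, predict) -> List[Signature]:
    """Signatures of training samples that the ensemble classifies correctly."""
    kept = {leaf_signature(trees, x) for x, y in zip(samples, labels) if predict(x) == y}
    return sorted(kept)


def deviation(trees, baseline: Sequence[Signature], x) -> int:
    """Distance from x's signature to its nearest baseline signature."""
    signature = leaf_signature(trees, x)
    return min(hamming(signature, reference) for reference in baseline)


# Hypothetical ensemble of three decision stumps; leaf ids double as class votes.
trees = [
    lambda x: 0 if x[0] <= 0.5 else 1,
    lambda x: 0 if x[1] <= 0.5 else 1,
    lambda x: 0 if x[0] <= 0.2 else 1,
]


def predict(x):
    return 1 if sum(leaf_signature(trees, x)) >= 2 else 0


samples = [(0.1, 0.1), (0.9, 0.9), (0.3, 0.1)]
labels = [0, 1, 1]  # the third sample is misclassified and therefore excluded
baseline = build_baseline(trees, samples, labels, predict)

familiar = deviation(trees, baseline, (0.1, 0.1))
unusual = deviation(trees, baseline, (0.9, 0.1))
```

The search cost grows with the number of stored signatures, which is the bottleneck that a learned model of typical leaf activations avoids. Note that `deviation` raises `ValueError` on an empty baseline.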

Place, publisher, year, edition, pages
IEEE, 2025
Series
IEEE Vehicular Networking Conference, ISSN 2157-9857, E-ISSN 2157-9865
Keywords
Tree Ensembles, Autoencoders, Intrusion Detection Systems, Real-time Systems, Safety, Security, Controller Area Networks, Adversarial Examples
National Category
Computer Systems
Identifiers
urn:nbn:se:liu:diva-216350 (URN)
10.1109/VNC64509.2025.11054177 (DOI)
001540461700039 ()
2-s2.0-105010777746 (Scopus ID)
9798331524371 (ISBN)
9798331524388 (ISBN)
Conference
2025 IEEE Vehicular Networking Conference (VNC), Porto, Portugal, JUN 02-04, 2025
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

Funding Agencies: Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation

Available from: 2025-08-14 Created: 2025-08-14 Last updated: 2025-11-13

Authority records

Oscar Colaco, Valency