Publications (6 of 6)
Törnblom, J. (2025). Efficient Formal Reasoning about the Trustworthiness of Tree Ensembles. (Doctoral dissertation). Linköping: Linköping University Electronic Press
2025 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Recent advances in machine learning (ML) have demonstrated that statistical models can be trained to recognize complicated patterns and make sophisticated decisions, often outperforming human capabilities. However, current ML-based systems sometimes exhibit faulty behavior, which prevents their use in safety-critical applications. Furthermore, the increased autonomy and complexity of ML-based systems raise additional trustworthiness concerns such as explainability. To mitigate risks associated with these concerns, the European Union has agreed on a legal framework called the AI Act that governs ML-based products that are placed on the European market.

To meet the goals set out by the AI Act with diligence, manufacturers of critical systems that leverage machine learning need new rigorous and scalable approaches to ensure trustworthiness. One natural pathway towards this aim is to leverage formal methods, a class of reasoning approaches that has proved useful in the past when arguing for safety. Unfortunately, these methods often suffer from computational scalability issues, a problem that becomes even more challenging when applied to complex ML-based systems.

This dissertation seeks to assess and improve the trustworthiness of a class of machine learning models called tree ensembles. To this end, a reasoning engine based on abstract interpretation is developed from the ground up, exploiting unique characteristics of tree ensembles for improved runtime performance. The engine is designed to support deductive and abductive reasoning with soundness and completeness guarantees, hence enabling formal verification and the ability to accurately explain why a tree ensemble arrives at a certain prediction.

Through extensive experimentation, we demonstrate speedup improvements of several orders of magnitude compared to current state-of-the-art approaches. More importantly, we show that many classifiers based on tree ensembles are extremely sensitive to additive noise, despite demonstrating high accuracy. For example, in one case study involving the classification of images of handwritten digits, we find that changing the light intensity of a single pixel in an image by as little as 0.1% causes some tree ensembles to misclassify the depicted digit. However, we also show that it is possible to compute provably correct explanations without superfluous information, called minimal explanations, for predictions made by complex tree ensembles. These types of explanations can help to pinpoint exactly which inputs are relevant in a particular classification. Moreover, we explore approaches for computing explanations that are also globally optimal with respect to a cost function, called minimum explanations. These types of explanations can be tailored for specific target audiences, e.g., for engineers trying to improve robustness, for system operators making critical decisions, or for incident investigators seeking the root cause of a hazard.
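The single-pixel fragility described above can be reproduced in miniature (a purely illustrative sketch; the stump ensemble, thresholds, and values below are invented, not taken from the dissertation): when an input value sits exactly on a decision threshold of a dominant tree, a 0.1% perturbation pushes it across the threshold and flips the vote.

```python
# Toy tree ensemble: each tree is a (feature_index, threshold, left_score,
# right_score) stump. A tiny perturbation of one feature can flip the
# ensemble's class when that feature sits near a decision threshold.

def stump_predict(stump, x):
    feature, threshold, left, right = stump
    return left if x[feature] <= threshold else right

def ensemble_score(ensemble, x):
    return sum(stump_predict(s, x) for s in ensemble)

def classify(ensemble, x):
    return 1 if ensemble_score(ensemble, x) > 0 else 0

# Hypothetical ensemble: feature 0 carries a large, threshold-sensitive vote.
ensemble = [
    (0, 0.5, -2.0, +2.0),   # very sensitive around x[0] == 0.5
    (1, 0.3, +0.4, -0.4),
    (1, 0.7, +0.4, -0.4),
]

x = [0.5, 0.5]              # nominal input, right at the threshold
eps = 0.001                 # a 0.1% perturbation of feature 0

before = classify(ensemble, x)
after = classify(ensemble, [x[0] + eps, x[1]])
print(before, after)        # the tiny change flips the predicted class
```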

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2025. p. 45
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2454
National Category
Artificial Intelligence
Identifiers
urn:nbn:se:liu:diva-213383 (URN), 10.3384/9789181181296 (DOI), 9789181181289 (ISBN), 9789181181296 (ISBN)
Public defence
2025-08-25, TEMCAS, TEMA Building, Campus Valla, Linköping, 13:15 (English)
Note

Funding agencies: This work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation. Some computing resources were provided by the Swedish National Infrastructure for Computing (SNIC) and the Swedish National Supercomputer Centre (NSC).

Available from: 2025-04-30 Created: 2025-04-30 Last updated: 2025-05-05. Bibliographically approved
Törnblom, J. & Nadjm-Tehrani, S. (2020). Formal Verification of Input-Output Mappings of Tree Ensembles. Science of Computer Programming, 194
2020 (English)In: Science of Computer Programming, ISSN 0167-6423, E-ISSN 1872-7964, Vol. 194. Article in journal (Refereed) Published
Abstract [en]

Recent advances in machine learning and artificial intelligence are now being considered in safety-critical autonomous systems where software defects may cause severe harm to humans and the environment. Design organizations in these domains are currently unable to provide convincing arguments that their systems are safe to operate when machine learning algorithms are used to implement their software.

In this paper, we present an efficient method to extract equivalence classes from decision trees and tree ensembles, and to formally verify that their input-output mappings comply with requirements. The idea is that, given that safety requirements can be traced to desirable properties on system input-output patterns, we can use positive verification outcomes in safety arguments.

This paper presents the implementation of the method in the tool VoTE (Verifier of Tree Ensembles), and evaluates its scalability on two case studies presented in current literature. We demonstrate that our method is practical for tree ensembles trained on low-dimensional data with up to 25 decision trees and tree depths of up to 20. Our work also studies the limitations of the method with high-dimensional data and preliminarily investigates the trade-off between the number of trees and the time taken for verification.
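The equivalence-class idea can be sketched for a single decision tree (a minimal illustration of the general approach, not VoTE's implementation; the tree and the output-range requirement below are invented): every leaf is one equivalence class of inputs, so checking each leaf's output against the requirement verifies the entire input domain.

```python
# A decision tree as nested tuples: ('leaf', output) or
# ('node', feature, threshold, left_subtree, right_subtree).
# All inputs reaching the same leaf produce the same output, so each leaf
# is one equivalence class; verifying the requirement on every leaf
# verifies it for the whole (infinite) input domain.

def leaves(tree):
    """Yield the output value of every leaf (one per equivalence class)."""
    if tree[0] == 'leaf':
        yield tree[1]
    else:
        _, _, _, left, right = tree
        yield from leaves(left)
        yield from leaves(right)

def verify_output_range(tree, lo, hi):
    """Check the requirement lo <= output <= hi for every equivalence class."""
    return all(lo <= v <= hi for v in leaves(tree))

# Hypothetical regression tree over two features.
tree = ('node', 0, 0.5,
        ('leaf', 0.2),
        ('node', 1, 0.8,
         ('leaf', 0.9),
         ('leaf', 1.4)))

print(verify_output_range(tree, 0.0, 1.0))  # False: the 1.4 leaf violates it
```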

Keywords
Formal verification, Decision tree, Tree ensemble, Random forest, Gradient boosting machine
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-164563 (URN), 10.1016/j.scico.2020.102450 (DOI), 000528192400002 ()
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

Funding agencies: Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation

Available from: 2020-03-24 Created: 2020-03-24 Last updated: 2025-04-30
Törnblom, J. (2020). Formal Verification of Tree Ensembles in Safety-Critical Applications. (Licentiate dissertation). Linköping: Linköping University Electronic Press
2020 (English)Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

In the presence of data and computational resources, machine learning can be used to synthesize software automatically. For example, machines are now capable of learning complicated pattern recognition tasks and sophisticated decision policies, two key capabilities in autonomous cyber-physical systems. Unfortunately, humans find software synthesized by machine learning algorithms difficult to interpret, which currently limits their use in safety-critical applications such as medical diagnosis and avionic systems. In particular, successful deployments of safety-critical systems mandate the execution of rigorous verification activities, which often rely on human insights, e.g., to identify scenarios in which the system shall be tested.

A natural pathway towards a viable verification strategy for such systems is to leverage formal verification techniques, which, in the presence of a formal specification, can provide definitive guarantees with little human intervention. However, formal verification suffers from scalability issues with respect to system complexity. In this thesis, we investigate the limits of current formal verification techniques when applied to a class of machine learning models called tree ensembles, and identify model-specific characteristics that can be exploited to improve the performance of verification algorithms when applied specifically to tree ensembles.

To this end, we develop two formal verification techniques specifically for tree ensembles, one fast and conservative technique, and one exact but more computationally demanding. We then combine these two techniques into an abstraction-refinement approach, that we implement in a tool called VoTE (Verifier of Tree Ensembles).

Using a couple of case studies, we recognize that sets of inputs that lead to the same system behavior can be captured precisely as hyperrectangles, which enables tractable enumeration of input-output mappings when the input dimension is low. Tree ensembles with a high-dimensional input domain, however, seem generally difficult to verify. In some cases though, conservative approximations of input-output mappings can greatly improve performance. This is demonstrated in a digit recognition case study, where we assess the robustness of classifiers when confronted with additive noise.
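The hyperrectangle observation can be made concrete (an illustrative sketch with invented one-feature trees; a real tool prunes empty intersections far more aggressively): each root-to-leaf path constrains a few features to intervals, i.e., a box, and the ensemble behaves uniformly on every non-empty intersection of one leaf box per tree; that intersection is again a hyperrectangle.

```python
from itertools import product

# Each tree is summarized by its leaves: a list of (box, output) pairs, where
# a box maps feature index -> (low, high) interval collected along the path.
# The ensemble output is constant on each non-empty intersection of one leaf
# box per tree, and such an intersection is again a hyperrectangle.

def intersect(box_a, box_b):
    """Intersect two boxes; return None when the intersection is empty."""
    out = dict(box_a)
    for f, (lo, hi) in box_b.items():
        lo2, hi2 = out.get(f, (float('-inf'), float('inf')))
        lo, hi = max(lo, lo2), min(hi, hi2)
        if lo > hi:
            return None
        out[f] = (lo, hi)
    return out

def equivalence_classes(trees):
    """Enumerate (hyperrectangle, summed output) pairs for the ensemble."""
    for combo in product(*trees):
        box, total = {}, 0.0
        for leaf_box, value in combo:
            box = intersect(box, leaf_box)
            if box is None:
                break
            total += value
        if box is not None:
            yield box, total

# Two invented one-feature trees, each with two leaves.
tree_a = [({0: (float('-inf'), 0.5)}, 1.0), ({0: (0.5, float('inf'))}, 2.0)]
tree_b = [({0: (float('-inf'), 0.3)}, 10.0), ({0: (0.3, float('inf'))}, 20.0)]

# Of the 2 x 2 leaf combinations, one intersection is empty, leaving three
# equivalence classes that together cover the input domain.
classes = list(equivalence_classes([tree_a, tree_b]))
for box, total in classes:
    print(box, total)
```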

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2020. p. 22
Series
Linköping Studies in Science and Technology. Licentiate Thesis, ISSN 0280-7971 ; 1892
Keywords
Formal verification, Machine learning, Tree ensembles
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-170862 (URN), 10.3384/lic.diva-170862 (DOI), 9789179297480 (ISBN)
Presentation
2020-12-18, Alan Turing, E Building, Campus Valla, Linköping, 09:15 (English)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2020-10-28 Created: 2020-10-27 Last updated: 2021-04-12. Bibliographically approved
Törnblom, J. & Nadjm-Tehrani, S. (2019). An Abstraction-Refinement Approach to Formal Verification of Tree Ensembles. In: Computer Safety, Reliability, and Security: SAFECOMP 2019 Workshops, ASSURE, DECSoS, SASSUR, STRIVE, and WAISE, Turku, Finland, September 10, 2019, Proceedings. Paper presented at ​Second International Workshop on Artificial Intelligence Safety Engineering (pp. 301-313). Springer
2019 (English)In: Computer Safety, Reliability, and Security: SAFECOMP 2019 Workshops, ASSURE, DECSoS, SASSUR, STRIVE, and WAISE, Turku, Finland, September 10, 2019, Proceedings, Springer, 2019, p. 301-313. Conference paper, Published paper (Refereed)
Abstract [en]

Recent advances in machine learning are now being considered for integration in safety-critical systems such as vehicles, medical equipment and critical infrastructure. However, organizations in these domains are currently unable to provide convincing arguments that systems integrating machine learning technologies are safe to operate in their intended environments.

In this paper, we present a formal verification method for tree ensembles that leverages an abstraction-refinement approach to counteract combinatorial explosion. We implemented the method as an extension to a tool named VoTE, and demonstrate its applicability in two case studies by verifying the robustness of random forests and gradient boosting machines against perturbations. Our abstraction-refinement based extension to VoTE improves the performance by several orders of magnitude, scaling to tree ensembles with up to 50 trees of depth 10, trained on high-dimensional data.
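The abstraction-refinement idea can be sketched as follows (a simplified illustration, not the algorithm implemented in VoTE; the stump ensemble, the half-open box convention, and the bound "output <= limit" are all invented for the example): bound each tree's output over an input box conservatively from the reachable leaves, and split the box only when the summed bounds are too loose to decide the property.

```python
# Abstraction-refinement sketch for a stump ensemble: first bound the
# ensemble output over an input box conservatively (abstraction); if the
# bound cannot decide the property, split the box and retry on the halves
# (refinement). Stumps are (feature, threshold, left_value, right_value),
# with x < threshold taking the left branch; boxes are half-open [lo, hi).

def stump_bounds(stump, box):
    """Interval of outputs the stump can produce for inputs in the box."""
    feature, threshold, left, right = stump
    lo, hi = box[feature]
    if hi <= threshold:                  # every x in the box goes left
        return left, left
    if lo >= threshold:                  # every x in the box goes right
        return right, right
    return min(left, right), max(left, right)   # both branches reachable

def ensemble_bounds(ensemble, box):
    lows, highs = zip(*(stump_bounds(s, box) for s in ensemble))
    return sum(lows), sum(highs)

def verify(ensemble, box, limit, depth=8):
    """Prove 'ensemble output <= limit' on the box, with up to `depth` splits."""
    lo, hi = ensemble_bounds(ensemble, box)
    if hi <= limit:
        return True                      # abstraction is precise enough
    if depth == 0:
        return False                     # give up (property may still hold)
    f = max(box, key=lambda f: box[f][1] - box[f][0])  # widest dimension
    a, b = box[f]
    mid = (a + b) / 2
    left = dict(box); left[f] = (a, mid)
    right = dict(box); right[f] = (mid, b)
    return verify(ensemble, left, limit, depth - 1) and \
           verify(ensemble, right, limit, depth - 1)

# Invented ensemble whose two stumps cancel: the true output is always 1.0.
ensemble = [(0, 0.5, 0.0, 1.0), (0, 0.5, 1.0, 0.0)]
print(verify(ensemble, {0: (0.0, 1.0)}, 1.0))   # True
```

Here the naive per-tree interval bound over the full box is [0, 2], too loose to prove the limit of 1.0; a single split at the shared threshold makes both halves precise, which is the kind of gain refinement buys.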

Place, publisher, year, edition, pages
Springer, 2019
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 11699
Keywords
Formal verification, Decision trees, Tree ensembles
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-160869 (URN), 10.1007/978-3-030-26250-1_24 (DOI), 000561031400027 (), 978-3-030-26249-5 (ISBN), 978-3-030-26250-1 (ISBN)
Conference
Second International Workshop on Artificial Intelligence Safety Engineering
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

Funding agencies: Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation

Available from: 2019-10-11 Created: 2019-10-11 Last updated: 2025-04-30
Törnblom, J. & Nadjm-Tehrani, S. (2019). Formal Verification of Random Forests in Safety-Critical Applications. In: Formal Techniques for Safety-Critical Systems: . Paper presented at Sixth International Workshop on Formal Techniques for Safety-Critical Systems (FTSCS 2018), Gold Coast, Australia, 16 November, 2018 (pp. 55-71). Springer
2019 (English)In: Formal Techniques for Safety-Critical Systems, Springer, 2019, p. 55-71. Conference paper, Published paper (Refereed)
Abstract [en]

Recent advances in machine learning and artificial intelligence are now being applied in safety-critical autonomous systems where software defects may cause severe harm to humans and the environment. Design organizations in these domains are currently unable to provide convincing arguments that systems whose complex software is implemented using machine learning algorithms are safe and correct.

In this paper, we present an efficient method to extract equivalence classes from decision trees and random forests, and to formally verify that their input/output mappings comply with requirements. We implement the method in our tool VoRF (Verifier of Random Forests), and evaluate its scalability on two case studies found in the literature. We demonstrate that our method is practical for random forests trained on low-dimensional data with up to 25 decision trees, each with a tree depth of 20. Our work also demonstrates the limitations of the method with high-dimensional data and touches upon the trade-off between the number of trees and the time taken for verification.

Place, publisher, year, edition, pages
Springer, 2019
Series
Communications in Computer and Information Science, ISSN 1865-0929 ; 008
Keywords
Machine learning, Formal verification, Random forest, Decision tree
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-154368 (URN), 10.1007/978-3-030-12988-0_4 (DOI), 978-3-030-12987-3 (ISBN), 978-3-030-12988-0 (ISBN)
Conference
Sixth International Workshop on Formal Techniques for Safety-Critical Systems (FTSCS 2018), Gold Coast, Australia, 16 November, 2018
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2019-02-07 Created: 2019-02-07 Last updated: 2021-04-12. Bibliographically approved
Törnblom, J., Karlsson, E. & Nadjm-Tehrani, S. Finding Minimum-Cost Explanations for Predictions made by Tree Ensembles.
(English)Manuscript (preprint) (Other academic)
Abstract [en]

The ability to reliably explain why a machine learning model arrives at a particular prediction is crucial when used as decision support by human operators of critical systems. The provided explanations must be provably correct, and preferably free of redundant information; such explanations are called minimal explanations. In this paper, we aim at finding explanations for predictions made by tree ensembles that are not only minimal, but also minimum with respect to a cost function.

To this end, we first present a highly efficient oracle that can determine the correctness of explanations, surpassing the runtime performance of current state-of-the-art alternatives by several orders of magnitude. 

Second, we adapt an algorithm called MARCO from the optimization field (calling the adaptation m-MARCO) to compute a single minimum explanation per prediction. We demonstrate an overall speedup factor of 2.7 compared to a state-of-the-art algorithm based on minimum hitting sets, and a speedup factor of 27 compared to the standard MARCO algorithm, which enumerates all minimal explanations.
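The notion of a minimal explanation can be illustrated with the textbook deletion-based scan (this is not m-MARCO, and the brute-force oracle below merely stands in for the paper's abstract-interpretation oracle; the stump ensemble, value grid, and feature roles are all invented): fix every feature to its observed value, then try to free one feature at a time, keeping it free only when the oracle confirms the prediction cannot change.

```python
from itertools import product

# Deletion-based computation of a minimal explanation: a subset of features
# whose observed values alone force the prediction, and from which no single
# feature can be dropped. The oracle is a brute-force stand-in that checks
# all completions of the free features over a small discrete grid.

DOMAIN = [0.0, 0.25, 0.5, 0.75, 1.0]   # hypothetical per-feature value grid

def classify(ensemble, x):
    score = sum(l if x[f] <= t else r for f, t, l, r in ensemble)
    return 1 if score > 0 else 0

def prediction_fixed(ensemble, x, fixed):
    """Oracle: does fixing only `fixed` features force the prediction on x?"""
    target = classify(ensemble, x)
    free = [i for i in range(len(x)) if i not in fixed]
    for values in product(DOMAIN, repeat=len(free)):
        y = list(x)
        for i, v in zip(free, values):
            y[i] = v
        if classify(ensemble, y) != target:
            return False
    return True

def minimal_explanation(ensemble, x):
    fixed = set(range(len(x)))
    for i in range(len(x)):             # try to free each feature once
        if prediction_fixed(ensemble, x, fixed - {i}):
            fixed.remove(i)
    return sorted(fixed)

# Invented stump ensemble: feature 0 dominates the vote, feature 1 is noise.
ensemble = [(0, 0.5, -3.0, 3.0), (1, 0.5, -1.0, 1.0)]
print(minimal_explanation(ensemble, [0.75, 0.75]))   # → [0]
```

A minimum-cost variant would instead search over all such subsets for one minimizing a cost function, which is where the hitting-set and MARCO-style algorithms compared in the paper come in.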

Keywords
explainable AI, abstract interpretation, tree ensembles, optimization
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-213142 (URN), 10.48550/arXiv.2303.09271 (DOI)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

This is a preprint, also published at arXiv, which has not been subject to peer review.

Available from: 2025-04-22 Created: 2025-04-22 Last updated: 2025-04-30
Identifiers
ORCID iD: orcid.org/0000-0002-4073-0417
