liu.seSearch for publications in DiVA
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Formal Verification of Input-Output Mappings of Tree Ensembles
Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering. (Real-Time Systems Laboratory)ORCID iD: 0000-0002-4073-0417
Linköping University, Department of Computer and Information Science, Software and Systems. Linköping University, Faculty of Science & Engineering.ORCID iD: 0000-0002-1485-0802
2020 (English)In: Science of Computer Programming, ISSN 0167-6423, E-ISSN 1872-7964, Vol. 194Article in journal (Refereed) Published
Abstract [en]

Recent advances in machine learning and artificial intelligence are now beingconsidered in safety-critical autonomous systems where software defects maycause severe harm to humans and the environment. Design organizations in thesedomains are currently unable to provide convincing arguments that their systemsare safe to operate when machine learning algorithms are used to implement theirsoftware.

In this paper, we present an efficient method to extract equivalence classes from decision trees and tree ensembles, and to formally verify that their input-output mappings comply with requirements. The idea is that, given that safety requirements can be traced to desirable properties on system input-output patterns, we can use positive verification outcomes in safety arguments.

This paper presents the implementation of the method in the tool VoTE (Verifier of Tree Ensembles), and evaluates its scalability on two case studies presented in current literature. We demonstrate that our method is practical for tree ensembles trained on low-dimensional data with up to 25 decision trees and tree depths of up to 20.Our work also studies the limitations of the method with high-dimensionaldata and preliminarily investigates the trade-off between large number of trees and time taken for verification.

Place, publisher, year, edition, pages
2020. Vol. 194
Keywords [en]
Formal verification, Decision tree, Tree ensemble, Random forest, Gradient boosting machine
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:liu:diva-164563DOI: 10.1016/j.scico.2020.102450ISI: 000528192400002OAI: oai:DiVA.org:liu-164563DiVA, id: diva2:1416647
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

Funding agencies: Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation

Available from: 2020-03-24 Created: 2020-03-24 Last updated: 2022-03-19
In thesis
1. Formal Verification of Tree Ensembles in Safety-Critical Applications
Open this publication in new window or tab >>Formal Verification of Tree Ensembles in Safety-Critical Applications
2020 (English)Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

In the presence of data and computational resources, machine learning can be used to synthesize software automatically. For example, machines are now capable of learning complicated pattern recognition tasks and sophisticated decision policies, two key capabilities in autonomous cyber-physical systems. Unfortunately, humans find software synthesized by machine learning algorithms difficult to interpret, which currently limits their use in safety-critical applications such as medical diagnosis and avionic systems. In particular, successful deployments of safety-critical systems mandate the execution of rigorous verification activities, which often rely on human insights, e.g., to identify scenarios in which the system shall be tested.

A natural pathway towards a viable verification strategy for such systems is to leverage formal verification techniques, which, in the presence of a formal specification, can provide definitive guarantees with little human intervention. However, formal verification suffers from scalability issues with respect to system complexity. In this thesis, we investigate the limits of current formal verification techniques when applied to a class of machine learning models called tree ensembles, and identify model-specific characteristics that can be exploited to improve the performance of verification algorithms when applied specifically to tree ensembles.

To this end, we develop two formal verification techniques specifically for tree ensembles, one fast and conservative technique, and one exact but more computationally demanding. We then combine these two techniques into an abstraction-refinement approach, that we implement in a tool called VoTE (Verifier of Tree Ensembles).

Using a couple of case studies, we recognize that sets of inputs that lead to the same system behavior can be captured precisely as hyperrectangles, which enables tractable enumeration of input-output mappings when the input dimension is low. Tree ensembles with a high-dimensional input domain, however, seems generally difficult to verify. In some cases though, conservative approximations of input-output mappings can greatly improve performance. This is demonstrated in a digit recognition case study, where we assess the robustness of classifiers when confronted with additive noise.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2020. p. 22
Series
Linköping Studies in Science and Technology. Licentiate Thesis, ISSN 0280-7971 ; 1892
Keywords
Formal verification, Machine learning, Tree ensembles
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-170862 (URN)10.3384/lic.diva-170862 (DOI)9789179297480 (ISBN)
Presentation
2020-12-18, Alan Turing, E Building, Campus Valla, Linköping, 09:15 (English)
Opponent
Supervisors
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2020-10-28 Created: 2020-10-27 Last updated: 2021-04-12Bibliographically approved

Open Access in DiVA

fulltext(641 kB)160 downloads
File information
File name FULLTEXT01.pdfFile size 641 kBChecksum SHA-512
285f6982f73a99fa899ec32dbbc2c53384f5d7d175368ce8028ec5e2ac8520e0446a15bbf0f00b7c3c6ed44df9793f146100f69a066894fa6c4288ded1c15c36
Type fulltextMimetype application/pdf

Other links

Publisher's full text

Authority records

Törnblom, JohnNadjm-Tehrani, Simin

Search in DiVA

By author/editor
Törnblom, JohnNadjm-Tehrani, Simin
By organisation
Software and SystemsFaculty of Science & Engineering
In the same journal
Science of Computer Programming
Computer Sciences

Search outside of DiVA

GoogleGoogle Scholar
Total: 160 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 479 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf