Automatic Generation of Fuzzing Benchmark Suites: Generated Based on Genetic Algorithms
2026 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Automatisk generering av testsviter för fuzzers (Swedish)
Abstract [en]
Security testing aims to find security bugs, which are harder to find than functional bugs because they cannot be tied to functional requirements. To make security testing easier, a technique known as fuzzing generates random or semi-valid data to test a program's ability to handle malformed or malicious input. Catching security bugs before a program is released reduces the number of potential exploits available to an attacker. To improve the efficiency of fuzzers, benchmark suites have been created to evaluate fuzzer performance on different types of security bugs. These suites evaluate fuzzers in different ways, and their results are usually not comparable. Some benchmark suites generate synthetic bugs, while others manually re-introduce bugs or consist of intentionally buggy programs. The most difficult metric to evaluate fuzzers on is the number of unique bugs a fuzzer finds. A crash triggered by a fuzzer might be caused by a single bug or by a set of bugs in the program, and it is very hard to know which bug caused the crash without manual inspection. The aim of this thesis was to develop a tool that automatically generates benchmark suites for fuzzers in which the bug that caused a crash can be determined automatically. This was done by re-introducing fixed, known vulnerabilities into the latest stable version of a program. For each vulnerability, a diff of the code between the version before the fix and the latest version was taken and then reduced using a genetic algorithm. Across 25 diffs from five different projects, the genetic algorithm reduced 250,000 lines of code changes to 974 lines, a reduction of 99.6%, in an average time of 4 hours per project. Of the 25 reduced diffs, only 19 could be used in the generated test suites, an average yield of 76%, with on average 4 of the selected bugs included per project.
However, more research is needed to investigate how the yield changes when more vulnerabilities are added and how well these re-introduced bugs are found by different fuzzers.
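The diff-reduction step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the thesis's actual encoding, fitness function, or genetic operators, so everything below is an assumption: a diff is modelled as a list of hunks, a candidate is a boolean mask over those hunks, and a hypothetical oracle `reproduces` reports whether applying the selected hunks still re-introduces the bug. In practice such an oracle would apply the hunks, build the program, and run a known crashing input; here it is replaced by a toy stand-in.

```python
import random


def reduce_diff(num_hunks, reproduces, pop_size=30, generations=60,
                mutation_rate=0.05, seed=0):
    """Search for a small subset of diff hunks that still re-introduces a bug.

    `reproduces(mask)` is a hypothetical oracle: it returns True if the bug
    is still triggered when only the hunks with mask[i] == True are applied.
    """
    rng = random.Random(seed)

    def fitness(mask):
        # Lower is better. Subsets that lose the bug score worse than any
        # valid subset; among valid subsets, fewer hunks is better.
        if not reproduces(mask):
            return num_hunks + 1
        return sum(mask)

    # Seed the population with the full diff (assumed to reproduce the bug)
    # plus random, mostly dense subsets.
    population = [tuple([True] * num_hunks)]
    while len(population) < pop_size:
        population.append(tuple(rng.random() < 0.9 for _ in range(num_hunks)))

    for _ in range(generations):
        population.sort(key=fitness)
        survivors = population[: pop_size // 2]  # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, num_hunks)  # single-point crossover
            child = list(a[:cut] + b[cut:])
            for i in range(num_hunks):  # bit-flip mutation
                if rng.random() < mutation_rate:
                    child[i] = not child[i]
            children.append(tuple(child))
        population = survivors + children

    best = min(population, key=fitness)
    return [i for i, keep in enumerate(best) if keep]


# Toy stand-in for the oracle: the bug needs exactly these hunks (assumption).
REQUIRED = {3, 7, 12}


def toy_reproduces(mask):
    return all(mask[i] for i in REQUIRED)


kept = reduce_diff(20, toy_reproduces)
print(f"kept {len(kept)} of 20 hunks: {kept}")
```

Because the full diff is part of the initial population and the best half always survives, the returned subset is guaranteed to still reproduce the bug under the oracle; how close it gets to the minimal subset depends on the parameters chosen.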
Place, publisher, year, edition, pages
2026, p. 60
Keywords [en]
Security testing, fuzzing, fuzzing evaluation, vulnerabilities, genetic algorithms
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:liu:diva-224417
ISRN: LIU-IDA/LITH-EX-A--26/012--SE
OAI: oai:DiVA.org:liu-224417
DiVA, id: diva2:2065020
Subject / course
Computer science
Presentation
2026-05-04, John von Neumann, Linköpings universitet, Linköping, 10:15 (English)
Supervisors
Examiners
2026-06-03, 2026-06-02, 2026-06-03. Bibliographically approved