liu.seSök publikationer i DiVA
Ändra sökning
RefereraExporteraLänk till posten
Permanent länk

Direktlänk
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf
Turning programs against each other: high coverage fuzz-testing using binary-code mutation and dynamic slicing
Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
2015 (Engelska)Ingår i: 2015 10TH JOINT MEETING OF THE EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND THE ACM SIGSOFT SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE 2015) PROCEEDINGS, New York, NY, USA: Association for Computing Machinery (ACM), 2015, s. 782-792Konferensbidrag, Publicerat paper (Refereegranskat)
Abstract [en]

Mutation-based fuzzing is a popular and widely employed black-box testing technique for finding security and robustness bugs in software. It owes much of its success to its simplicity; a well-formed seed input is mutated, e.g. through random bit-flipping, to produce test inputs. While reducing the need for human effort, and enabling security testing even of closed-source programs with undocumented input formats, the simplicity of mutation-based fuzzing comes at the cost of poor code coverage. Often millions of iterations are needed, and the results are highly dependent on configuration parameters and the choice of seed inputs. In this paper we propose a novel method for automated generation of high-coverage test cases for robustness testing. Our method is based on the observation that, even for closed-source programs with proprietary input formats, an implementation that can generate well-formed inputs to the program is typically available. By systematically mutating the program code of such generating programs, we leverage information about the input format encoded in the generating program to produce high-coverage test inputs, capable of reaching deep states in the program under test. Our method works entirely at the machine-code level, enabling use-cases similar to traditional black-box fuzzing. We have implemented the method in our tool MutaGen, and evaluated it on 7 popular Linux programs. We found that, for most programs, our method improves code coverage by one order of magnitude or more, compared to two well-known mutation-based fuzzers. We also found a total of 8 unique bugs.

Ort, förlag, år, upplaga, sidor
New York, NY, USA: Association for Computing Machinery (ACM), 2015. s. 782-792
Nyckelord [en]
Fuzz testing, fuzzing, black-box, dynamic slicing, program mutation
Nationell ämneskategori
Datavetenskap (datalogi)
Identifikatorer
URN: urn:nbn:se:liu:diva-128810DOI: 10.1145/2786805.2786844ISI: 000382568700067ISBN: 978-1-4503-3675-8 (tryckt)OAI: oai:DiVA.org:liu-128810DiVA, id: diva2:932169
Konferens
10th Joint Meeting on Foundations of Software Engineering
Tillgänglig från: 2016-05-31 Skapad: 2016-05-31 Senast uppdaterad: 2019-08-16
Ingår i avhandling
1. Scalable Dynamic Analysis of Binary Code
Öppna denna publikation i ny flik eller fönster >>Scalable Dynamic Analysis of Binary Code
2019 (Engelska)Doktorsavhandling, sammanläggning (Övrigt vetenskapligt)
Abstract [en]

In recent years, binary code analysis, i.e., applying program analysis directly at the machine code level, has become an increasingly important topic of study. This is driven to a large extent by the information security community, where security auditing of closed-source software and analysis of malware are important applications. Since most of the high-level semantics of the original source code are lost upon compilation to executable code, static analysis is intractable for, e.g., fine-grained information flow analysis of binary code. Dynamic analysis, however, does not suffer in the same way from reduced accuracy in the absence of high-level semantics, and is therefore also more readily applicable to binary code. Since fine-grained dynamic analysis often requires recording detailed information about every instruction execution, scalability can become a significant challenge. In this thesis, we address the scalability challenges of two powerful dynamic analysis methods whose widespread use has, so far, been impeded by their lack of scalability: dynamic slicing and instruction trace alignment. Dynamic slicing provides fine-grained information about dependencies between individual instructions, and can be used both as a powerful debugging aid and as a foundation for other dynamic analysis techniques. Instruction trace alignment provides a means for comparing executions of two similar programs and has important applications in, e.g., malware analysis, security auditing, and plagiarism detection. We also apply our work on scalable dynamic analysis in two novel approaches to improve fuzzing — a popular random testing technique that is widely used in industry to discover security vulnerabilities.

To use dynamic slicing, detailed information about a program execution must first be recorded. Since the amount of information is often too large to fit in main memory, existing dynamic slicing methods apply various time-versus-space trade-offs to reduce memory requirements. However, these trade-offs result in very high time overheads, limiting the usefulness of dynamic slicing in practice. In this thesis, we show that the speed of dynamic slicing can be greatly improved by carefully designing data structures and algorithms to exploit temporal locality of programs. This allows avoidance of the expensive trade-offs used in earlier methods by accessing recorded runtime information directly from secondary storage without significant random-access overhead. In addition to being a standalone contribution, scalable dynamic slicing also forms integral parts of our contributions to fuzzing. Our first contribution uses dynamic slicing and binary code mutation to automatically turn an existing executable into a test generator. In our experiments, this new approach to fuzzing achieved about an order of magnitude better code coverage than traditional mutational fuzzing and found several bugs in popular Linux software. The second work on fuzzing presented in this thesis uses dynamic slicing to accelerate the state-of-the-art fuzzer AFL by focusing the fuzzing effort on previously unexplored parts of the input space.

For the second dynamic analysis technique whose scalability we sought to improve — instruction trace alignment — we employed techniques used in speech recognition and information retrieval to design what is, to the best of our knowledge, the first general approach to aligning realistically long program traces. We show in our experiments that this method is capable of producing meaningful alignments even in the presence of significant syntactic differences stemming from, for example, the use of different compilers or optimization levels.

Ort, förlag, år, upplaga, sidor
Linköping: Linköping University Electronic Press, 2019. s. 73
Serie
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 1993
Nationell ämneskategori
Datavetenskap (datalogi)
Identifikatorer
urn:nbn:se:liu:diva-157626 (URN)10.3384/diss.diva-157626 (DOI)9789176850497 (ISBN)
Disputation
2019-09-25, Planck, Hus F, Campus Valla, Linköping, 13:15 (Engelska)
Opponent
Handledare
Forskningsfinansiär
CUGS (National Graduate School in Computer Science)ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications
Tillgänglig från: 2019-08-22 Skapad: 2019-08-16 Senast uppdaterad: 2019-08-22Bibliografiskt granskad

Open Access i DiVA

Fulltext saknas i DiVA

Övriga länkar

Förlagets fulltext

Person

Kargén, UlfShahmehri, Nahid

Sök vidare i DiVA

Av författaren/redaktören
Kargén, UlfShahmehri, Nahid
Av organisationen
Databas och informationsteknikTekniska fakulteten
Datavetenskap (datalogi)

Sök vidare utanför DiVA

GoogleGoogle Scholar

doi
isbn
urn-nbn

Altmetricpoäng

doi
isbn
urn-nbn
Totalt: 379 träffar
RefereraExporteraLänk till posten
Permanent länk

Direktlänk
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • oxford
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf