liu.seSearch for publications in DiVA
Endre søk
Link to record
Permanent link

Direct link
Kargén, Ulf, Doktor
Publikasjoner (10 av 10) Visa alla publikasjoner
Kargén, U. & Varro, D. (2024). Towards Automated Test Scenario Generation for Assuring COLREGs Compliance of Autonomous Surface Vehicles. In: Proceedings of the ACM/IEEE 27th International Conference on Model Driven Engineering Languages and Systems: . Paper presented at ACM/IEEE 27th International Conference on Model Driven Engineering Languages and Systems, Linz, Austria, September 22-27 2024 (pp. 249-256). New York, NY, USA: Association for Computing Machinery (ACM)
Åpne denne publikasjonen i ny fane eller vindu >>Towards Automated Test Scenario Generation for Assuring COLREGs Compliance of Autonomous Surface Vehicles
2024 (engelsk)Inngår i: Proceedings of the ACM/IEEE 27th International Conference on Model Driven Engineering Languages and Systems, New York, NY, USA: Association for Computing Machinery (ACM), 2024, s. 249-256Konferansepaper, Publicerat paper (Fagfellevurdert)
Abstract [en]

International maritime traffic is controlled by collision-avoidance regulations (COLREGs) with 41 standardized rules describing how a vessel should navigate in the proximity of other vessels. Since some rules can be overridden by human judgement when resolving critical encounters of vessels, justifying COLREGs compliance has become a significant challenge in the increasing presence of autonomous surface vehicles (ASVs) operated without (or with only remote) human control. This paper provides a high-level framework and long-term research agenda towards the automated synthesis of test scenarios to assure COLREGs compliance for ASVs by exploiting various model-driven engineering techniques. By adapting ideas from testing self-driving cars, we envisage a multi-layered test scenario generation approach involving functional, logical and concrete scenarios. In the current paper, we demonstrate how functional scenarios of COLREGs situations between given vessels can be precisely formalized by using metamodels, domain-specific graph models and first-order logic graph constraints. By using automated model generation techniques, we derive a complete set of functional-level test scenarios, which includes all possible COLREGs situations that may arise between given vessels. As initial result, we provide several dangerous situations involving only three vessels where a potential collision may occur even when all vessels follow the COLREGs, which showcases that some COLREGs rules need further clarification for the safe regulation of ASVs.

sted, utgiver, år, opplag, sider
New York, NY, USA: Association for Computing Machinery (ACM), 2024
Emneord
COLREGs, autonomous surface vehicles, consistent model generation, qualitative abstraction, test scenario generation
HSV kategori
Identifikatorer
urn:nbn:se:liu:diva-208761 (URN)10.1145/3640310.3674098 (DOI)001322650200020 ()2-s2.0-85206384904 (Scopus ID)9798400705045 (ISBN)
Konferanse
ACM/IEEE 27th International Conference on Model Driven Engineering Languages and Systems, Linz, Austria, September 22-27 2024
Forskningsfinansiär
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Merknad

Funding Agencies|Office of Naval Research [N62909-24-1-2006]; Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation

Tilgjengelig fra: 2024-10-23 Laget: 2024-10-23 Sist oppdatert: 2024-12-19bibliografisk kontrollert
Kargén, U., Härnqvist, I., Wilson, J., Eriksson, G., Holmgren, E. & Shahmehri, N. (2022). desync-cc: An Automatic Disassembly-Desynchronization Obfuscator. In: 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering: . Paper presented at 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering, Virtual Conference, March 15-18, 2022 (pp. 464-468). IEEE Computer Society
Åpne denne publikasjonen i ny fane eller vindu >>desync-cc: An Automatic Disassembly-Desynchronization Obfuscator
Vise andre…
2022 (engelsk)Inngår i: 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering, IEEE Computer Society, 2022, s. 464-468Konferansepaper, Publicerat paper (Fagfellevurdert)
Abstract [en]

Code obfuscation is an important topic, both in terms of defense, when trying to prevent intellectual property theft, and from the offensive point of view, when trying to break obfuscation used by malware authors to hide their malicious intents. Consequently, several works in recent years have discussed techniques that aim to prevent or delay reverse-engineering of binaries. While most works focus on methods that obscure the program logic from potential attackers, the complimentary approach of disassembly desynchronization has received relatively little attention. This technique puts another hurdle in the way of attackers by targeting the most fundamental step of the reverse-engineering process: recovering assembly code from a program binary. The technique works by tricking a disassembler into decoding the instruction stream at an invalid offset. On CPU architectures with variable-length instructions, this often yields valid albeit meaningless assembly code, while hiding a part of the original code.

In the interest of furthering research into disassembly desynchronization, both from a defensive and offensive point of view, we have created desync-cc, a tool for automatic application of disassembly-desynchronization obfuscation. The tool is designed as a drop-in replacement for gcc, and works by intercepting and modifying intermediate assembly code during compilation. By applying obfuscation after the code generation phase, our tool allows a much more granular control over where obfuscation is applied, compared to a source-code level obfuscator. In this paper, we describe the design and implementation of desync-cc, and present a preliminary evaluation of its effectiveness and efficiency on a number of real-world Linux programs.

sted, utgiver, år, opplag, sider
IEEE Computer Society, 2022
Emneord
Disassembly desynchronization, Code obfuscation, Reverse engineering, x86 architecture
HSV kategori
Identifikatorer
urn:nbn:se:liu:diva-188915 (URN)10.1109/SANER53432.2022.00063 (DOI)000855050800051 ()9781665437868 (ISBN)
Konferanse
2022 IEEE International Conference on Software Analysis, Evolution and Reengineering, Virtual Conference, March 15-18, 2022
Forskningsfinansiär
CUGS (National Graduate School in Computer Science)ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications
Tilgjengelig fra: 2022-09-30 Laget: 2022-09-30 Sist oppdatert: 2022-11-10bibliografisk kontrollert
Toczé, K., Schmitt, N., Kargén, U., Aral, A. & Brandic, I. (2022). Edge Workload Trace Gathering and Analysis for Benchmarking. In: Lena Mashayekhy, Stefan Schulte, Valeria Cardellini, Burak Kantarci, Yogesh Simmhan, Blesson Varghese (Ed.), 2022 IEEE 6th International Conference on Fog and Edge Computing (ICFEC): . Paper presented at 6th IEEE International Conference on Fog and Edge Computing (ICFEC), Taormina, ITALY, may 18-19, 2022 (pp. 34-41). IEEE
Åpne denne publikasjonen i ny fane eller vindu >>Edge Workload Trace Gathering and Analysis for Benchmarking
Vise andre…
2022 (engelsk)Inngår i: 2022 IEEE 6th International Conference on Fog and Edge Computing (ICFEC) / [ed] Lena Mashayekhy, Stefan Schulte, Valeria Cardellini, Burak Kantarci, Yogesh Simmhan, Blesson Varghese, IEEE , 2022, s. 34-41Konferansepaper, Publicerat paper (Fagfellevurdert)
Abstract [en]

The emerging field of edge computing is suffering from a lack of representative data to evaluate rapidly introduced new algorithms or techniques. That is a critical issue as this complex paradigm has numerous different use cases which translate into a highly diverse set of workload types.

In this work, within the context of the edge computing activity of SPEC RG Cloud, we continue working towards an edge benchmark by defining high-level workload classes as well as collecting and analyzing traces for three real-world edge applications, which, according to the existing literature, are the representatives of those classes. Moreover, we propose a practical and generic methodology for workload definition and gathering. The traces and gathering tool are provided open-source.

In the analysis of the collected workloads, we detect discrepancies between the literature and the traces obtained, thus highlighting the need for a continuing effort into gathering and providing data from real applications, which can be done using the proposed trace gathering methodology. Additionally, we discuss various insights and future directions that rise to the surface through our analysis.

sted, utgiver, år, opplag, sider
IEEE, 2022
Serie
Göta kanal – forskning från Linköpings universitet
Serie
IEEE International Conference on Fog and Edge Computing (ICFEC)
Emneord
Edge/fog workloads, Trace gathering, Edge/fog benchmarking, Open-source
HSV kategori
Identifikatorer
urn:nbn:se:liu:diva-187673 (URN)10.1109/ICFEC54809.2022.00012 (DOI)000850260200005 ()2-s2.0-85134067709 (Scopus ID)9781665495240 (ISBN)9781665495257 (ISBN)
Konferanse
6th IEEE International Conference on Fog and Edge Computing (ICFEC), Taormina, ITALY, may 18-19, 2022
Forskningsfinansiär
CUGS (National Graduate School in Computer Science)
Merknad

Funding: Swedish national graduate school in computer science (CUGS); CHIST-ERA grant [CHIST-ERA-19-CES-005]; Austrian Science Fund (FWF) [Y904-N31, I5201-N]

Tilgjengelig fra: 2022-08-19 Laget: 2022-08-19 Sist oppdatert: 2024-09-14
Mauthe, N., Kargén, U. & Shahmehri, N. (2021). A Large-Scale Empirical Study of Android App Decompilation. In: Cristina Ceballos (Ed.), 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering: . Paper presented at 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering, Honolulu, HI, USA, 9-12 March, 2021 (pp. 400-410). Institute of Electrical and Electronics Engineers (IEEE)
Åpne denne publikasjonen i ny fane eller vindu >>A Large-Scale Empirical Study of Android App Decompilation
2021 (engelsk)Inngår i: 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering / [ed] Cristina Ceballos, Institute of Electrical and Electronics Engineers (IEEE), 2021, s. 400-410Konferansepaper, Publicerat paper (Fagfellevurdert)
Abstract [en]

Decompilers are indispensable tools in Android malware analysis and app security auditing. Numerous academic works also employ an Android decompiler as the first step in a program analysis pipeline. In such settings, decompilation is frequently regarded as a "solved" problem, in that it is simply expected that source code can be accurately recovered from an app. While a large proportion of methods in an app can typically be decompiled successfully, it is common that at least some methods fail to decompile. In order to better understand the practical applicability of techniques in which decompilation is used as part of an automated analysis, it is important to know the actual expected failure rate of Android decompilation. To this end, we have performed what is, to the best of our knowledge, the first large-scale study of Android decompilation failure rates. We have used three sets of apps, consisting of, respectively, 3,018 open-source apps, 13,601 apps from a recent crawl of Google Play, and a collection of 24,553 malware samples. In addition to the state-of-the-art Dalvik bytecode decompiler jadx, we used three popular Java decompilers. While jadx achieves an impressively low failure rate of only 0.02% failed methods per app on average, we found that it manages to recover source code for all methods in only 21% of the Google Play apps.We have also sought to better understand the degree to which in-the-wild obfuscation techniques can prevent decompilation. Our empirical evaluation, complemented with an indepth manual analysis of a number of apps, indicate that code obfuscation is quite rarely encountered, even in malicious apps. Moreover, decompilation failures mostly appear to be caused by technical limitations in decompilers, rather than by deliberate attempts to thwart source-code recovery by obfuscation. This is an encouraging finding, as it indicates that near-perfect Android decompilation is, at least in theory, achievable, with implementation-level improvements to decompilation tools.

sted, utgiver, år, opplag, sider
Institute of Electrical and Electronics Engineers (IEEE), 2021
Serie
2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), ISSN 1534-5351
Emneord
Android, mobile apps, decompilation, obfuscation, reverse engineering, malware
HSV kategori
Identifikatorer
urn:nbn:se:liu:diva-179270 (URN)10.1109/SANER50967.2021.00044 (DOI)000675825200035 ()2-s2.0-85106642902 (Scopus ID)9781728196305 (ISBN)9781728196312 (ISBN)
Konferanse
2021 IEEE International Conference on Software Analysis, Evolution and Reengineering, Honolulu, HI, USA, 9-12 March, 2021
Merknad

Best paper award

Tilgjengelig fra: 2021-09-15 Laget: 2021-09-15 Sist oppdatert: 2021-11-12bibliografisk kontrollert
Mohammadinodooshan, A., Kargén, U. & Shahmehri, N. (2020). Comment on "AndrODet: An adaptive Android obfuscation detector".
Åpne denne publikasjonen i ny fane eller vindu >>Comment on "AndrODet: An adaptive Android obfuscation detector"
2020 (engelsk)Annet (Annet vitenskapelig)
Abstract [en]

We have identified a methodological problem in the empirical evaluation of the string encryption detection capabilities of the AndrODet system described by Mirzaei et al. in the recent paper "AndrODet: An adaptive Android obfuscation detector". The accuracy of string encryption detection is evaluated using samples from the AMD and PraGuard malware datasets. However, the authors failed to account for the fact that many of the AMD samples are highly similar due to the fact that they come from the same malware family. This introduces a risk that a machine learning system trained on these samples could fail to learn a generalizable model for string encryption detection, and might instead learn to classify samples based on characteristics of each malware family. Our own evaluation strongly indicates that the reported high accuracy of AndrODet's string encryption detection is indeed due to this phenomenon. When we evaluated AndrODet, we found that when we ensured that samples from the same family never appeared in both training and testing data, the accuracy dropped to around 50%. Moreover, the PraGuard dataset is not suitable for evaluating a static string encryption detector such as AndrODet, since the particular obfuscation tool used to produce the dataset effectively makes it impossible to extract meaningful features of static strings in Android apps.

Serie
arXiv.org ; 1910.06192v2
Emneord
AndrODet, Android, Malware, Obfuscation, String encryption, Machine learning
HSV kategori
Identifikatorer
urn:nbn:se:liu:diva-167212 (URN)
Prosjekter
Automated android malware analysis using machine learning
Tilgjengelig fra: 2020-06-29 Laget: 2020-06-29 Sist oppdatert: 2020-06-29
Mohammadinodooshan, A., Kargén, U. & Shahmehri, N. (2019). Robust Detection of Obfuscated Strings in Android Apps. In: Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security: . Paper presented at AISec'19: Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security co-located with CCS'19: 2019 ACM SIGSAC Conference on Computer and Communications Security, London, United Kingdom, November 2019 (pp. 25-35). New York, NY, USA: Association for Computing Machinery (ACM), Article ID 42.
Åpne denne publikasjonen i ny fane eller vindu >>Robust Detection of Obfuscated Strings in Android Apps
2019 (engelsk)Inngår i: Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security, New York, NY, USA: Association for Computing Machinery (ACM), 2019, s. 25-35, artikkel-id 42Konferansepaper, Publicerat paper (Fagfellevurdert)
Abstract [en]

While string obfuscation is a common technique used by mobile developers to prevent reverse engineering of their apps, malware authors also often employ it to, for example, avoid detection by signature-based antivirus products. For this reason, robust techniques for detecting obfuscated strings in apps are an important step towards more effective means of combating obfuscated malware. In this paper, we discuss and empirically characterize four significant limitations of existing machine-learning approaches to string obfuscation detection, and propose a novel method to address these limitations. The key insight of our method is that discriminative classification methods, which try to fit a decision boundary based on a set of positive and negative samples, are inherently bound to generalize poorly when used for string obfuscation detection. Since many different string obfuscation techniques exist, both in the form of commercial tools and as custom implementations, it is close to impossible to construct a training set that is representative of all possible obfuscations. We instead propose a generative approach based on the Naive Bayes method. We first model the distribution of natural-language strings, using a large corpus of strings from 235 languages, and then base our classification on a measure of the confidence with which a language can be assigned to a string. Crucially, this allows us to completely eliminate the need for obfuscated training samples. In our experiments, this new method significantly outperformed both an n-gram based random forest classifier and an entropy-based classifier, in terms of accuracy and generalizability.

sted, utgiver, år, opplag, sider
New York, NY, USA: Association for Computing Machinery (ACM), 2019
Serie
Proceedings of the ACM Conference on Computer and Communications Security, ISSN 1543-7221
Emneord
Android, string obfuscation detection, string encryption, machine learning, generative models, malware
HSV kategori
Identifikatorer
urn:nbn:se:liu:diva-166612 (URN)10.1145/3338501.3357373 (DOI)2-s2.0-85075860536 (Scopus ID)978-1-4503-6833-9 (ISBN)
Konferanse
AISec'19: Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security co-located with CCS'19: 2019 ACM SIGSAC Conference on Computer and Communications Security, London, United Kingdom, November 2019
Prosjekter
Automated android malware analysis using machine learning
Tilgjengelig fra: 2020-06-17 Laget: 2020-06-17 Sist oppdatert: 2021-07-15
Kargén, U. (2019). Scalable Dynamic Analysis of Binary Code. (Doctoral dissertation). Linköping: Linköping University Electronic Press
Åpne denne publikasjonen i ny fane eller vindu >>Scalable Dynamic Analysis of Binary Code
2019 (engelsk)Doktoravhandling, med artikler (Annet vitenskapelig)
Abstract [en]

In recent years, binary code analysis, i.e., applying program analysis directly at the machine code level, has become an increasingly important topic of study. This is driven to a large extent by the information security community, where security auditing of closed-source software and analysis of malware are important applications. Since most of the high-level semantics of the original source code are lost upon compilation to executable code, static analysis is intractable for, e.g., fine-grained information flow analysis of binary code. Dynamic analysis, however, does not suffer in the same way from reduced accuracy in the absence of high-level semantics, and is therefore also more readily applicable to binary code. Since fine-grained dynamic analysis often requires recording detailed information about every instruction execution, scalability can become a significant challenge. In this thesis, we address the scalability challenges of two powerful dynamic analysis methods whose widespread use has, so far, been impeded by their lack of scalability: dynamic slicing and instruction trace alignment. Dynamic slicing provides fine-grained information about dependencies between individual instructions, and can be used both as a powerful debugging aid and as a foundation for other dynamic analysis techniques. Instruction trace alignment provides a means for comparing executions of two similar programs and has important applications in, e.g., malware analysis, security auditing, and plagiarism detection. We also apply our work on scalable dynamic analysis in two novel approaches to improve fuzzing — a popular random testing technique that is widely used in industry to discover security vulnerabilities.

To use dynamic slicing, detailed information about a program execution must first be recorded. Since the amount of information is often too large to fit in main memory, existing dynamic slicing methods apply various time-versus-space trade-offs to reduce memory requirements. However, these trade-offs result in very high time overheads, limiting the usefulness of dynamic slicing in practice. In this thesis, we show that the speed of dynamic slicing can be greatly improved by carefully designing data structures and algorithms to exploit temporal locality of programs. This allows avoidance of the expensive trade-offs used in earlier methods by accessing recorded runtime information directly from secondary storage without significant random-access overhead. In addition to being a standalone contribution, scalable dynamic slicing also forms integral parts of our contributions to fuzzing. Our first contribution uses dynamic slicing and binary code mutation to automatically turn an existing executable into a test generator. In our experiments, this new approach to fuzzing achieved about an order of magnitude better code coverage than traditional mutational fuzzing and found several bugs in popular Linux software. The second work on fuzzing presented in this thesis uses dynamic slicing to accelerate the state-of-the-art fuzzer AFL by focusing the fuzzing effort on previously unexplored parts of the input space.

For the second dynamic analysis technique whose scalability we sought to improve — instruction trace alignment — we employed techniques used in speech recognition and information retrieval to design what is, to the best of our knowledge, the first general approach to aligning realistically long program traces. We show in our experiments that this method is capable of producing meaningful alignments even in the presence of significant syntactic differences stemming from, for example, the use of different compilers or optimization levels.

sted, utgiver, år, opplag, sider
Linköping: Linköping University Electronic Press, 2019. s. 73
Serie
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 1993
HSV kategori
Identifikatorer
urn:nbn:se:liu:diva-157626 (URN)10.3384/diss.diva-157626 (DOI)9789176850497 (ISBN)
Disputas
2019-09-25, Planck, Hus F, Campus Valla, Linköping, 13:15 (engelsk)
Opponent
Veileder
Forskningsfinansiär
CUGS (National Graduate School in Computer Science)ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications
Tilgjengelig fra: 2019-08-22 Laget: 2019-08-16 Sist oppdatert: 2019-08-22bibliografisk kontrollert
Kargén, U. & Shahmehri, N. (2015). Turning programs against each other: high coverage fuzz-testing using binary-code mutation and dynamic slicing. In: 2015 10TH JOINT MEETING OF THE EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND THE ACM SIGSOFT SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE 2015) PROCEEDINGS: . Paper presented at 10th Joint Meeting on Foundations of Software Engineering (pp. 782-792). New York, NY, USA: Association for Computing Machinery (ACM)
Åpne denne publikasjonen i ny fane eller vindu >>Turning programs against each other: high coverage fuzz-testing using binary-code mutation and dynamic slicing
2015 (engelsk)Inngår i: 2015 10TH JOINT MEETING OF THE EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND THE ACM SIGSOFT SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE 2015) PROCEEDINGS, New York, NY, USA: Association for Computing Machinery (ACM), 2015, s. 782-792Konferansepaper, Publicerat paper (Fagfellevurdert)
Abstract [en]

Mutation-based fuzzing is a popular and widely employed black-box testing technique for finding security and robustness bugs in software. It owes much of its success to its simplicity; a well-formed seed input is mutated, e.g. through random bit-flipping, to produce test inputs. While reducing the need for human effort, and enabling security testing even of closed-source programs with undocumented input formats, the simplicity of mutation-based fuzzing comes at the cost of poor code coverage. Often millions of iterations are needed, and the results are highly dependent on configuration parameters and the choice of seed inputs. In this paper we propose a novel method for automated generation of high-coverage test cases for robustness testing. Our method is based on the observation that, even for closed-source programs with proprietary input formats, an implementation that can generate well-formed inputs to the program is typically available. By systematically mutating the program code of such generating programs, we leverage information about the input format encoded in the generating program to produce high-coverage test inputs, capable of reaching deep states in the program under test. Our method works entirely at the machine-code level, enabling use-cases similar to traditional black-box fuzzing. We have implemented the method in our tool MutaGen, and evaluated it on 7 popular Linux programs. We found that, for most programs, our method improves code coverage by one order of magnitude or more, compared to two well-known mutation-based fuzzers. We also found a total of 8 unique bugs.

sted, utgiver, år, opplag, sider
New York, NY, USA: Association for Computing Machinery (ACM), 2015
Emneord
Fuzz testing, fuzzing, black-box, dynamic slicing, program mutation
HSV kategori
Identifikatorer
urn:nbn:se:liu:diva-128810 (URN)10.1145/2786805.2786844 (DOI)000382568700067 ()978-1-4503-3675-8 (ISBN)
Konferanse
10th Joint Meeting on Foundations of Software Engineering
Tilgjengelig fra: 2016-05-31 Laget: 2016-05-31 Sist oppdatert: 2019-08-16
Kargén, U. & Shahmehri, N. (2014). Efficient Utilization of Secondary Storage for Scalable Dynamic Slicing. In: Randall Bilof (Ed.), Proceedings of the 2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation: . Paper presented at 14th IEEE International Working Conference on Source Code Analysis and Manipulation, Victoria, British Columbia, Canada, September 28-29, 2014 (pp. 155-164). IEEE
Åpne denne publikasjonen i ny fane eller vindu >>Efficient Utilization of Secondary Storage for Scalable Dynamic Slicing
2014 (engelsk)Inngår i: Proceedings of the 2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation / [ed] Randall Bilof, IEEE , 2014, s. 155-164Konferansepaper, Publicerat paper (Fagfellevurdert)
Abstract [en]

Dynamic program slicing is widely recognized as a powerful aid for e.g. Program comprehension during debugging. However, its widespread use has been impeded in part by scalability issues that occur when constructing the dynamic dependence graph necessary to compute dynamic slices. A few seconds of execution time on a modern CPU can easily yield dynamic dependence graphs on the order of tens of gigabytes in size. Existing methods either produce imprecise slices, incur large time overheads during slice computation, or run out of memory for long program executions. By carefully designing our method to take advantage of locality, we are able to efficiently use secondary storage for dynamic dependence graphs, thus allowing our method to scale to long program executions. Our prototype implementation runs directly on x86 executables, eliminating problems with e.g. Binary-only libraries. We show in our experiments that graphs can be constructed for program runs with billions of executed instructions, at slowdowns ranging from 62x to 173x. Our optimized format also allows graphs to be traversed at speeds of several million dependence edges per second.

sted, utgiver, år, opplag, sider
IEEE, 2014
Emneord
binary analysis, debugging, dynamic dependence graph, dynamic slicing, x86
HSV kategori
Identifikatorer
urn:nbn:se:liu:diva-117289 (URN)10.1109/SCAM.2014.24 (DOI)000358876700020 ()978-0-7695-5304-7 (ISBN)
Konferanse
14th IEEE International Working Conference on Source Code Analysis and Manipulation, Victoria, British Columbia, Canada, September 28-29, 2014
Tilgjengelig fra: 2015-04-22 Laget: 2015-04-22 Sist oppdatert: 2019-08-16
Kargén, U. & Shahmehri, N. (2012). InputTracer: A Data-flow Analysis Tool for Manual Program Comprehension of x86 Binaries. In: Juan E. Guerrero (Ed.), Proceedings of the 2012 IEEE 12th International Working Conference on Source Code Analysis and Manipulation: . Paper presented at 12th IEEE International Working Conference on Source Code Analysis and Manipulation, Riva del Garda, Trento, Italy, September 23-24, 2012 (pp. 138-143). IEEE
Åpne denne publikasjonen i ny fane eller vindu >>InputTracer: A Data-flow Analysis Tool for Manual Program Comprehension of x86 Binaries
2012 (engelsk)Inngår i: Proceedings of the 2012 IEEE 12th International Working Conference on Source Code Analysis and Manipulation / [ed] Juan E. Guerrero, IEEE , 2012, s. 138-143Konferansepaper, Publicerat paper (Fagfellevurdert)
Abstract [en]

Third-party security analysis of closed-source programs has become an important part of a defense-in-depth approach to software security for many companies. In the absence of efficient tools, the analysis has generally been performed through manual reverse engineering of the machine code. As reverse engineering is an extremely time-consuming and costly task, much research has been performed to develop more powerful methods for analysis of program binaries. One such popular method is dynamic taint analysis (DTA), which is a type of runtime data-flow analysis, where certain input data is marked as tainted. By tracking the flow of tainted data, DTA can, for instance, be used to determine which computations in a program are affected by a certain part of the input. In this paper we present InputTracer, a tool that utilizes DTA for aiding in manual program comprehension and analysis of unmodified x86 executables running in Linux. A brief overview of dynamic taint analysis is given, followed by a description of the tool and its implementation. We also demonstrate the tool’s ability to provide exact information on the origin of tainted data through a detailed use case, where the tool is used to find the root cause of a memory corruption bug.

sted, utgiver, år, opplag, sider
IEEE, 2012
Emneord
dynamic taint analysis, binary analysis, x86, program comprehension, Valgrind
HSV kategori
Identifikatorer
urn:nbn:se:liu:diva-80317 (URN)10.1109/SCAM.2012.16 (DOI)978-1-4673-2398-7 (ISBN)
Konferanse
12th IEEE International Working Conference on Source Code Analysis and Manipulation, Riva del Garda, Trento, Italy, September 23-24, 2012
Tilgjengelig fra: 2012-08-23 Laget: 2012-08-23 Sist oppdatert: 2018-01-12
Organisasjoner