Shahmehri, Nahid, Professor
Publications (10 of 139)
Kargén, U., Härnqvist, I., Wilson, J., Eriksson, G., Holmgren, E. & Shahmehri, N. (2022). desync-cc: An Automatic Disassembly-Desynchronization Obfuscator. In: 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering. Paper presented at 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering, Virtual Conference, March 15-18, 2022 (pp. 464-468). IEEE Computer Society
2022 (English). In: 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering, IEEE Computer Society, 2022, p. 464-468. Conference paper, Published paper (Refereed)
Abstract [en]

Code obfuscation is an important topic, both in terms of defense, when trying to prevent intellectual property theft, and from the offensive point of view, when trying to break obfuscation used by malware authors to hide their malicious intents. Consequently, several works in recent years have discussed techniques that aim to prevent or delay reverse-engineering of binaries. While most works focus on methods that obscure the program logic from potential attackers, the complementary approach of disassembly desynchronization has received relatively little attention. This technique puts another hurdle in the way of attackers by targeting the most fundamental step of the reverse-engineering process: recovering assembly code from a program binary. The technique works by tricking a disassembler into decoding the instruction stream at an invalid offset. On CPU architectures with variable-length instructions, this often yields valid albeit meaningless assembly code, while hiding a part of the original code.

In the interest of furthering research into disassembly desynchronization, both from a defensive and offensive point of view, we have created desync-cc, a tool for automatic application of disassembly-desynchronization obfuscation. The tool is designed as a drop-in replacement for gcc, and works by intercepting and modifying intermediate assembly code during compilation. By applying obfuscation after the code generation phase, our tool allows a much more granular control over where obfuscation is applied, compared to a source-code level obfuscator. In this paper, we describe the design and implementation of desync-cc, and present a preliminary evaluation of its effectiveness and efficiency on a number of real-world Linux programs.
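
The desynchronization idea described in this abstract can be illustrated with a toy example. The sketch below is not desync-cc and does not use x86: it assumes a made-up four-opcode instruction set (the `ISA` table, the mnemonics, and the byte values are all hypothetical) and compares a naive linear-sweep decoder with a decoder that follows the actual control flow. A single junk byte placed after an unconditional jump is enough to make the linear sweep decode at the wrong offset and hide the real instructions.

```python
# Hypothetical toy ISA: opcode -> (mnemonic, total instruction length in bytes).
ISA = {0x01: ("NOP", 1), 0x02: ("JMP", 2), 0x03: ("MOV", 3), 0x04: ("RET", 1)}


def decode(code, pc):
    """Decode one instruction at pc; unknown bytes become 1-byte 'DB' entries."""
    return ISA.get(code[pc], ("DB", 1))


def linear_sweep(code):
    """Decode instructions back to back, ignoring control flow."""
    out, pc = [], 0
    while pc < len(code):
        name, length = decode(code, pc)
        out.append((pc, name))
        pc += length
    return out


def follow_execution(code):
    """Decode along the executed path (forward JMP offsets only, stop at RET)."""
    out, pc = [], 0
    while pc < len(code):
        name, length = decode(code, pc)
        out.append((pc, name))
        if name == "RET":
            break
        if name == "JMP":
            pc += length + code[pc + 1]  # offset is relative to the next instruction
        else:
            pc += length
    return out


PROGRAM = bytes([
    0x02, 0x01,  # JMP +1: jumps over the junk byte
    0x03,        # junk byte that looks like the start of a 3-byte MOV
    0x01,        # NOP (real code, swallowed by the fake MOV in a linear sweep)
    0x04,        # RET (real code, also swallowed)
    0x01,        # padding
])

print(linear_sweep(PROGRAM))      # the fake MOV at offset 2 hides offsets 3 and 4
print(follow_execution(PROGRAM))  # the executed path reaches NOP and RET
```

On real variable-length architectures the same effect depends on the disassembler's decoding strategy; this toy only shows why decoding from an invalid offset can yield plausible but wrong output.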

Place, publisher, year, edition, pages
IEEE Computer Society, 2022
Keywords
Disassembly desynchronization, Code obfuscation, Reverse engineering, x86 architecture
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-188915 (URN); 10.1109/SANER53432.2022.00063 (DOI); 000855050800051 (); 9781665437868 (ISBN)
Conference
2022 IEEE International Conference on Software Analysis, Evolution and Reengineering, Virtual Conference, March 15-18, 2022
Funder
CUGS (National Graduate School in Computer Science); ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications
Available from: 2022-09-30 Created: 2022-09-30 Last updated: 2022-11-10 Bibliographically approved
Mauthe, N., Kargén, U. & Shahmehri, N. (2021). A Large-Scale Empirical Study of Android App Decompilation. In: Cristina Ceballos (Ed.), 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering. Paper presented at 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering, Honolulu, HI, USA, 9-12 March, 2021 (pp. 400-410). Institute of Electrical and Electronics Engineers (IEEE)
2021 (English). In: 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering / [ed] Cristina Ceballos, Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 400-410. Conference paper, Published paper (Refereed)
Abstract [en]

Decompilers are indispensable tools in Android malware analysis and app security auditing. Numerous academic works also employ an Android decompiler as the first step in a program analysis pipeline. In such settings, decompilation is frequently regarded as a "solved" problem, in that it is simply expected that source code can be accurately recovered from an app. While a large proportion of methods in an app can typically be decompiled successfully, it is common that at least some methods fail to decompile. In order to better understand the practical applicability of techniques in which decompilation is used as part of an automated analysis, it is important to know the actual expected failure rate of Android decompilation. To this end, we have performed what is, to the best of our knowledge, the first large-scale study of Android decompilation failure rates. We have used three sets of apps, consisting of, respectively, 3,018 open-source apps, 13,601 apps from a recent crawl of Google Play, and a collection of 24,553 malware samples. In addition to the state-of-the-art Dalvik bytecode decompiler jadx, we used three popular Java decompilers. While jadx achieves an impressively low failure rate of only 0.02% failed methods per app on average, we found that it manages to recover source code for all methods in only 21% of the Google Play apps. We have also sought to better understand the degree to which in-the-wild obfuscation techniques can prevent decompilation. Our empirical evaluation, complemented with an in-depth manual analysis of a number of apps, indicates that code obfuscation is quite rarely encountered, even in malicious apps. Moreover, decompilation failures mostly appear to be caused by technical limitations in decompilers, rather than by deliberate attempts to thwart source-code recovery by obfuscation. This is an encouraging finding, as it indicates that near-perfect Android decompilation is, at least in theory, achievable, with implementation-level improvements to decompilation tools.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021
Series
2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), ISSN 1534-5351
Keywords
Android, mobile apps, decompilation, obfuscation, reverse engineering, malware
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-179270 (URN); 10.1109/SANER50967.2021.00044 (DOI); 000675825200035 (); 2-s2.0-85106642902 (Scopus ID); 9781728196305 (ISBN); 9781728196312 (ISBN)
Conference
2021 IEEE International Conference on Software Analysis, Evolution and Reengineering, Honolulu, HI, USA, 9-12 March, 2021
Note

Best paper award

Available from: 2021-09-15 Created: 2021-09-15 Last updated: 2021-11-12 Bibliographically approved
Mohammadinodooshan, A., Kargén, U. & Shahmehri, N. (2020). Comment on "AndrODet: An adaptive Android obfuscation detector".
2020 (English). Other (Other academic)
Abstract [en]

We have identified a methodological problem in the empirical evaluation of the string encryption detection capabilities of the AndrODet system described by Mirzaei et al. in the recent paper "AndrODet: An adaptive Android obfuscation detector". The accuracy of string encryption detection is evaluated using samples from the AMD and PraGuard malware datasets. However, the authors failed to account for the fact that many of the AMD samples are highly similar because they come from the same malware family. This introduces a risk that a machine learning system trained on these samples could fail to learn a generalizable model for string encryption detection, and might instead learn to classify samples based on characteristics of each malware family. Our own evaluation strongly indicates that the reported high accuracy of AndrODet's string encryption detection is indeed due to this phenomenon. When we ensured that samples from the same family never appeared in both training and testing data, the accuracy of AndrODet dropped to around 50%. Moreover, the PraGuard dataset is not suitable for evaluating a static string encryption detector such as AndrODet, since the particular obfuscation tool used to produce the dataset effectively makes it impossible to extract meaningful features of static strings in Android apps.
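
The evaluation constraint described above, that samples from one malware family never appear in both training and testing data, can be sketched as a family-aware split. This is a minimal illustration only, not the evaluation code used in the comment; the function name `family_aware_split`, the `(sample_id, family)` input format, and the default parameters are assumptions made for the example.

```python
import random
from collections import defaultdict


def family_aware_split(samples, test_fraction=0.3, seed=0):
    """Split (sample_id, family) pairs so that no family spans both sets.

    test_fraction is applied to the number of families, not the number of
    samples, so the resulting sample ratio depends on family sizes.
    """
    by_family = defaultdict(list)
    for sample_id, family in samples:
        by_family[family].append(sample_id)

    families = sorted(by_family)
    random.Random(seed).shuffle(families)  # reproducible family order
    n_test = max(1, round(len(families) * test_fraction))
    test_families = set(families[:n_test])

    train = [s for f in families if f not in test_families for s in by_family[f]]
    test = [s for f in families if f in test_families for s in by_family[f]]
    return train, test, test_families


# Hypothetical sample identifiers and family labels.
samples = [(f"s{i}", f"fam{i % 5}") for i in range(50)]
train, test, test_families = family_aware_split(samples)
print(len(train), len(test), sorted(test_families))
```

A plain random split over samples would, by contrast, usually place members of the same family on both sides, which is the leakage the comment warns about.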

Series
arXiv.org ; 1910.06192v2
Keywords
AndrODet, Android, Malware, Obfuscation, String encryption, Machine learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-167212 (URN)
Projects
Automated android malware analysis using machine learning
Available from: 2020-06-29 Created: 2020-06-29 Last updated: 2020-06-29
Mohammadinodooshan, A., Kargén, U. & Shahmehri, N. (2019). Robust Detection of Obfuscated Strings in Android Apps. In: Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security. Paper presented at AISec'19: Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security co-located with CCS'19: 2019 ACM SIGSAC Conference on Computer and Communications Security, London, United Kingdom, November 2019 (pp. 25-35). New York, NY, USA: Association for Computing Machinery (ACM), Article ID 42.
2019 (English). In: Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security, New York, NY, USA: Association for Computing Machinery (ACM), 2019, p. 25-35, article id 42. Conference paper, Published paper (Refereed)
Abstract [en]

While string obfuscation is a common technique used by mobile developers to prevent reverse engineering of their apps, malware authors also often employ it to, for example, avoid detection by signature-based antivirus products. For this reason, robust techniques for detecting obfuscated strings in apps are an important step towards more effective means of combating obfuscated malware. In this paper, we discuss and empirically characterize four significant limitations of existing machine-learning approaches to string obfuscation detection, and propose a novel method to address these limitations. The key insight of our method is that discriminative classification methods, which try to fit a decision boundary based on a set of positive and negative samples, are inherently bound to generalize poorly when used for string obfuscation detection. Since many different string obfuscation techniques exist, both in the form of commercial tools and as custom implementations, it is close to impossible to construct a training set that is representative of all possible obfuscations. We instead propose a generative approach based on the Naive Bayes method. We first model the distribution of natural-language strings, using a large corpus of strings from 235 languages, and then base our classification on a measure of the confidence with which a language can be assigned to a string. Crucially, this allows us to completely eliminate the need for obfuscated training samples. In our experiments, this new method significantly outperformed both an n-gram based random forest classifier and an entropy-based classifier, in terms of accuracy and generalizability.
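
The generative idea in this abstract, modeling natural-language strings and scoring how confidently a language can be assigned, can be sketched with character bigram models. This is a simplified stand-in and not the method from the paper: the abstract does not specify the features or the confidence measure, so the bigram features, the add-one smoothing, the `vocab_size` constant, and the two tiny training sentences are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def train_bigram_model(corpus):
    """Count character bigrams and their leading characters in a corpus."""
    text = corpus.lower()
    return Counter(zip(text, text[1:])), Counter(text[:-1])


def avg_log_likelihood(s, model, vocab_size=256):
    """Average per-bigram log-probability of s under add-one smoothing."""
    bigrams, firsts = model
    s = s.lower()
    pairs = list(zip(s, s[1:]))
    if not pairs:
        return float("-inf")
    total = sum(
        math.log((bigrams[p] + 1) / (firsts[p[0]] + vocab_size)) for p in pairs
    )
    return total / len(pairs)


def language_confidence(s, models):
    """Return (best score, language) over all per-language models."""
    return max((avg_log_likelihood(s, m), lang) for lang, m in models.items())


# Toy corpora (hypothetical); a real model would need a large multilingual corpus.
models = {
    "en": train_bigram_model(
        "please enter your password to log in and open the settings menu"
    ),
    "sv": train_bigram_model(
        "ange ditt lösenord för att logga in och öppna inställningar"
    ),
}

natural = language_confidence("enter your password", models)
obfuscated = language_confidence("xQ9#zK2@vL7!", models)
print(natural, obfuscated)
```

Only natural-language text is used for training here, which mirrors the point made in the abstract that no obfuscated training samples are needed; a string is suspicious when no language model explains it well.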

Place, publisher, year, edition, pages
New York, NY, USA: Association for Computing Machinery (ACM), 2019
Series
Proceedings of the ACM Conference on Computer and Communications Security, ISSN 1543-7221
Keywords
Android, string obfuscation detection, string encryption, machine learning, generative models, malware
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-166612 (URN); 10.1145/3338501.3357373 (DOI); 2-s2.0-85075860536 (Scopus ID); 978-1-4503-6833-9 (ISBN)
Conference
AISec'19: Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security co-located with CCS'19: 2019 ACM SIGSAC Conference on Computer and Communications Security, London, United Kingdom, November 2019
Projects
Automated android malware analysis using machine learning
Available from: 2020-06-17 Created: 2020-06-17 Last updated: 2021-07-15
Vapen, A., Carlsson, N., Mahanti, A. & Shahmehri, N. (2016). A Look at the Third-Party Identity Management Landscape. IEEE Internet Computing, 20(2), 18-25
2016 (English). In: IEEE Internet Computing, ISSN 1089-7801, E-ISSN 1941-0131, Vol. 20, no 2, p. 18-25. Article in journal (Refereed) Published
Abstract [en]

Many websites act as relying parties (RPs) by allowing access to their services via third-party identity providers (IDPs), such as Facebook and Google. Using IDPs simplifies account creation, login activity, and information sharing across websites. However, different websites' use of IDPs can have significant security and privacy implications for users. Here, the authors provide an overview of third-party identity management's current landscape. Using datasets collected through manual identification and large-scale crawling, they answer questions related to which sites act as RPs, which sites are the most successful IDPs, and how different classes of RPs select their IDPs.

Place, publisher, year, edition, pages
IEEE COMPUTER SOC, 2016
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:liu:diva-127053 (URN); 10.1109/MIC.2016.38 (DOI); 000372015500003 ()
Available from: 2016-04-13 Created: 2016-04-13 Last updated: 2021-04-26
Hiran, R., Carlsson, N. & Shahmehri, N. (2016). Does Scale, Size, and Locality Matter?: Evaluation of Collaborative BGP Security Mechanisms. In: 2016 IFIP NETWORKING CONFERENCE (IFIP NETWORKING) AND WORKSHOPS. Paper presented at IFIP Networking Conference (IFIP Networking) and Workshops, Vienna, Austria, May 2016 (pp. 261-269). IEEE
2016 (English). In: 2016 IFIP NETWORKING CONFERENCE (IFIP NETWORKING) AND WORKSHOPS, IEEE, 2016, p. 261-269. Conference paper, Published paper (Refereed)
Abstract [en]

The Border Gateway Protocol (BGP) was not designed with security in mind and is vulnerable to many attacks, including prefix/subprefix hijacks, interception attacks, and imposture attacks. Despite many protocols having been proposed to detect or prevent such attacks, no solution has been widely deployed. Yet, the effectiveness of most proposals relies on large-scale adoption and cooperation between many large Autonomous Systems (ASes). In this paper we use measurement data to evaluate some promising, previously proposed techniques in cases where they are implemented by different subsets of ASes, and answer questions regarding which ASes need to collaborate, the importance of the locality and size of the participating ASes, and how many ASes are needed to achieve good efficiency when different subsets of ASes collaborate. For our evaluation we use topologies and routing information derived from real measurement data. We consider collaborative detection and prevention techniques that use (i) prefix origin information, (ii) route path updates, or (iii) passively collected round-trip time (RTT) information. Our results and answers to the above questions help determine the effectiveness of potential incremental rollouts, incentivized or required by regional legislation, for example. While there are differences between the techniques and two of the three classes see the biggest benefits when detection/prevention is performed close to the source of an attack, the results show that significant gains can be achieved even with only regional collaboration.

Place, publisher, year, edition, pages
IEEE, 2016
National Category
Computer Sciences; Communication Systems
Identifiers
urn:nbn:se:liu:diva-129430 (URN); 10.1109/IFIPNetworking.2016.7497237 (DOI); 000383224900030 (); 978-3-9018-8283-8 (ISBN)
Conference
IFIP Networking Conference (IFIP Networking) and Workshops, Vienna, Austria, May 2016
Available from: 2016-06-19 Created: 2016-06-19 Last updated: 2021-04-26
Vapen, A., Carlsson, N. & Shahmehri, N. (2016). Longitudinal Analysis of the Third-party Authentication Landscape. Paper presented at NDSS Workshop on Understanding and Enhancing Online Privacy Workshop (UEOP@NDSS), 21-24 February 2016, Catamaran Resort Hotel & Spa, San Diego, California. Internet Society
2016 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Many modern websites offer single sign-on (SSO) services, which allow the user to use an existing account with a third-party website such as Facebook to authenticate. When using SSO the user must approve an app-rights agreement that specifies what data related to the user can be shared between the two websites and any actions (e.g., posting comments) that the origin website is allowed to perform on behalf of the user on the third-party provider (e.g., Facebook). Both cross-site data sharing and actions performed on behalf of the user can have significant privacy implications. In this paper we present a longitudinal study of the third-party authentication landscape, its structure, and the protocol usage, data sharing, and actions associated with individual third-party relationships. The study captures the current state, changes in the structure, protocol usage, and information leakage risks.

Place, publisher, year, edition, pages
Internet Society, 2016
National Category
Computer Systems
Identifiers
urn:nbn:se:liu:diva-127301 (URN)1-891562-44-4 (ISBN)
Conference
NDSS Workshop on Understanding and Enhancing Online Privacy Workshop (UEOP@NDSS), 21-24 February 2016, Catamaran Resort Hotel & Spa, San Diego, California
Note

DOI does not work: 10.14722/ueop.2016.23008

Available from: 2016-04-19 Created: 2016-04-19 Last updated: 2021-04-26 Bibliographically approved
Krishnamoorthi, V., Carlsson, N., Eager, D., Mahanti, A. & Shahmehri, N. (2015). Bandwidth-aware Prefetching for Proactive Multi-video Preloading and Improved HAS Performance. In: Proceedings of the ACM International Conference on Multimedia (ACM Multimedia). Paper presented at ACM Multimedia 2015 (pp. 551-560). New York, USA: Association for Computing Machinery (ACM)
2015 (English). In: Proceedings of the ACM International Conference on Multimedia (ACM Multimedia), New York, USA: Association for Computing Machinery (ACM), 2015, p. 551-560. Conference paper, Published paper (Refereed)
Abstract [en]

This paper considers the problem of providing users playing one streaming video the option of instantaneous and seamless playback of alternative videos. Recommendation systems can easily provide a list of alternative videos, but there is little research on how to best eliminate the startup time for these alternative videos. The problem is motivated by services that want to retain increasingly impatient users, who frequently watch the beginning of multiple videos, before viewing a video to the end. We present the design, implementation, and evaluation of an HTTP-based Adaptive Streaming (HAS) solution that provides careful prefetching and buffer management. We also present the design and evaluation of three fundamental policy classes that provide different tradeoffs between how aggressively new alternative videos are prefetched versus the importance of ensuring high playback quality. We show that our solution allows us to reduce the startup times of alternative videos by an order of magnitude and effectively adapt the quality such as to ensure the highest possible playback quality of the video being viewed. By improving the channel utilization we also address the discrimination problem that HAS clients often suffer from, allowing us to in some cases simultaneously improve the playback quality of the video being viewed and provide the value-added service of allowing instantaneous playback of the prefetched alternative videos.

Place, publisher, year, edition, pages
New York, USA: Association for Computing Machinery (ACM), 2015
Keywords
HTTP-based adaptive streaming (HAS); Bandwidth-aware prefetching; Multi-video preloading; Seamless playback
National Category
Computer Systems
Identifiers
urn:nbn:se:liu:diva-128168 (URN); 10.1145/2733373.2806270 (DOI); 000387861300064 (); 978-1-4503-3459-4 (ISBN)
Conference
ACM Multimedia 2015
Available from: 2016-05-20 Created: 2016-05-19 Last updated: 2021-04-26
Hiran, R., Carlsson, N. & Shahmehri, N. (2015). Crowd-based Detection of Routing Anomalies on the Internet. In: Proc. IEEE Conference on Communications and Network Security (IEEE CNS), Florence, Italy, Sept. 2015. Paper presented at Proc. IEEE Conference on Communications and Network Security (IEEE CNS), Florence, Italy, Sept. 2015 (pp. 388-396). IEEE Computer Society Digital Library
2015 (English). In: Proc. IEEE Conference on Communications and Network Security (IEEE CNS), Florence, Italy, Sept. 2015, IEEE Computer Society Digital Library, 2015, p. 388-396. Conference paper, Published paper (Refereed)
Abstract [en]

The Internet is highly susceptible to routing attacks and there is no universally deployed solution that ensures that traffic is not hijacked by third parties. Individuals or organizations wanting to protect themselves from sustained attacks must therefore typically rely on measurements and traffic monitoring to detect attacks. Motivated by the high overhead costs of continuous active measurements, we argue that passive monitoring combined with collaborative information sharing and statistics can be used to provide alerts about traffic anomalies that may require further investigation. In this paper we present and evaluate a user-centric crowd-based approach in which users passively monitor their network traffic, share information about potential anomalies, and apply combined collaborative statistics to identify potential routing anomalies. The approach uses only passively collected round-trip time (RTT) measurements, is shown to have low overhead regardless of whether a central or distributed architecture is used, and provides an attractive tradeoff between attack detection rates (when there is an attack) and false alert rates (needing further investigation) under normal conditions. Our data-driven analysis using longitudinal and distributed RTT measurements also provides insights into detector selection and the relative weight that should be given to candidate detectors at different distances from the potential victim node.
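
The general shape of the crowd-based approach, local RTT anomaly flags that are combined across users, can be sketched as follows. This is an illustration under stated assumptions and not the detector evaluated in the paper: the median/MAD test, the threshold `k`, the voting rule, and `min_fraction` are all hypothetical choices, and the RTT values are made up.

```python
from statistics import median


def is_rtt_anomalous(baseline, current, k=3.0):
    """Flag an RTT that deviates from the baseline median by more than k MADs."""
    med = median(baseline)
    # Median absolute deviation; fall back to a tiny value for constant baselines.
    mad = median(abs(x - med) for x in baseline) or 1e-9
    return abs(current - med) > k * mad


def crowd_alert(reports, min_fraction=0.5):
    """Raise an alert when at least min_fraction of users flag an anomaly.

    reports is a list of (baseline_rtts, current_rtt) pairs, one per user.
    """
    flags = [is_rtt_anomalous(baseline, current) for baseline, current in reports]
    return sum(flags) / len(flags) >= min_fraction


# Hypothetical passively collected RTTs (milliseconds) toward one destination.
baseline = [50, 52, 51, 49, 50]
under_attack = [(baseline, 120), (baseline, 118), (baseline, 125), (baseline, 51)]
normal = [(baseline, 51), (baseline, 50), (baseline, 49), (baseline, 52)]

print(crowd_alert(under_attack))  # most users see a large RTT shift
print(crowd_alert(normal))        # no user deviates beyond the threshold
```

Requiring agreement among several users is what keeps a single noisy measurement from triggering an alert; weighting users by their distance to the potential victim, as the abstract discusses, would replace the simple fraction used here.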

Place, publisher, year, edition, pages
IEEE Computer Society Digital Library, 2015
National Category
Communication Systems
Identifiers
urn:nbn:se:liu:diva-129426 (URN); 10.1109/CNS.2015.7346850 (DOI); 000380401800048 (); 978-1-4673-7876-5 (ISBN)
Conference
Proc. IEEE Conference on Communications and Network Security (IEEE CNS), Florence, Italy, Sept. 2015
Available from: 2016-06-19 Created: 2016-06-19 Last updated: 2021-04-26
Byers, D. & Shahmehri, N. (2015). Graphical Modeling of Security Goals and Software Vulnerabilities. In: Vicente García Díaz, Juan Manuel Cueva Lovelle, B. Cristina Pelayo García-Bustelo (Ed.), Handbook of Research on Innovations in Systems and Software Engineering (pp. 1-31). IGI Global
2015 (English). In: Handbook of Research on Innovations in Systems and Software Engineering / [ed] Vicente García Díaz, Juan Manuel Cueva Lovelle, B. Cristina Pelayo García-Bustelo, IGI Global, 2015, p. 1-31. Chapter in book (Refereed)
Abstract [en]

Security has become recognized as a critical aspect of software development, leading to the development of various security-enhancing techniques, many of which use some kind of custom modeling language. Models in different languages cannot readily be related to each other, which is an obstacle to using several techniques together. The sheer number of languages is, in itself, also an obstacle to adoption by developers. The authors have developed a modeling language that can be used in place of four existing modeling languages: attack trees, vulnerability cause graphs, security activity graphs, and security goal indicator trees. Models in the new language can be transformed to and from the earlier languages, and a precise definition of model semantics enables an even wider range of applications, such as testing and static analysis. This chapter explores this new language.

Place, publisher, year, edition, pages
IGI Global, 2015
Keywords
Software security, Software vulnerability, Security goal modelling, Secure software engineering
National Category
Software Engineering
Identifiers
urn:nbn:se:liu:diva-117722 (URN); 10.4018/978-1-4666-6359-6.ch001 (DOI); 978-146666-359-6 (ISBN); 1-46666359-6 (ISBN); 978-14-6666-360-2 (ISBN)
Available from: 2015-05-07 Created: 2015-05-07 Last updated: 2018-01-11