Studying the effectiveness of dynamic analysis for fingerprinting Android malware behavior
Linköping University, Department of Computer and Information Science, Database and information techniques.
2019 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
En studie av effektivitet hos dynamisk analys för kartläggning av beteenden hos Android malware (Swedish)
Abstract [en]

Android is the second most targeted operating system for malware authors, and countering the development of Android malware requires more knowledge about its behavior. There are two main approaches to analyzing Android malware: static and dynamic analysis. In 2017, a study and a well-labeled dataset named AMD (Android Malware Dataset), consisting of over 24,000 malware samples, were released. The dataset is divided into 135 varieties based on similar malicious behavior, retrieved through static analysis of the file classes.dex in the APK of each malware sample, whereas the labeled features were determined by manual inspection of three samples in each variety. However, static analysis is known to be weak against obfuscation techniques, such as repackaging or dynamic loading, which can be exploited to evade the analysis. In this study the second approach is used, and all malware samples in the dataset are analyzed at run-time in order to monitor their dynamic behavior. Analyzing malware at run-time has known weaknesses as well, since it can be evaded through, for instance, anti-emulator techniques. The study therefore aimed to explore the available sandbox environments for dynamic analysis, to study the effectiveness of fingerprinting Android malware using one of the tools, and to investigate whether the static features from AMD and the dynamic analysis correlate. The correlation was investigated by, for instance, attempting to classify the samples based on similar dynamic features and by calculating the Pearson Correlation Coefficient (r) for all combinations of features from AMD and the dynamic analysis.
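As a rough illustration of the correlation step described above, the following minimal sketch computes the Pearson Correlation Coefficient (r) for every combination of one static feature and one dynamic feature. The feature names and values are hypothetical toy data, not taken from AMD or from the thesis, and the thesis may have computed r differently.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient r between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # r is undefined for a constant feature
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: one value per malware sample (five samples).
static_features = {
    "sends_sms": [1, 1, 0, 0, 1],        # assumed binary label
    "steals_contacts": [0, 1, 0, 1, 0],  # assumed binary label
}
dynamic_features = {
    "sms_sent_count": [3, 2, 0, 0, 4],   # assumed run-time counter
    "file_reads": [5, 1, 4, 2, 3],       # assumed run-time counter
}

# r for all combinations of a static and a dynamic feature.
correlations = {
    (s_name, d_name): pearson_r(s_vals, d_vals)
    for s_name, s_vals in static_features.items()
    for d_name, d_vals in dynamic_features.items()
}

for pair, r in correlations.items():
    print(pair, round(r, 3))
```

Values of r near 1 or -1 suggest a strong linear relationship between a static label and a dynamic measurement, while values near 0 suggest none.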

The comparison of tools for dynamic analysis showed a need for further development, as most of the popular tools were released long ago and share a lack of continuous maintenance. As a result, Droidbox was chosen as the sandbox environment for this study, because it is easy to install and use and is easily adapted for large-scale analysis. Based on the dynamic features extracted with Droidbox, it could be shown that Android malware samples are more similar to the varieties to which they belong than to other varieties. The best metric for classifying samples into varieties, out of four investigated metrics, turned out to be Cosine Similarity, which achieved an accuracy of 83.6% for the entire dataset. The high accuracy indicated a correlation between the dynamic features and the static features on which the varieties are based. Furthermore, the Pearson Correlation Coefficient confirmed that the manually extracted features used to describe the varieties and the dynamic features are correlated to some extent, which could be partially confirmed by a manual inspection at the end of the study.
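The classification idea can be sketched as follows, under the assumption that each variety is represented by the mean (centroid) of its samples' dynamic feature vectors and that a sample is assigned to the variety with the highest Cosine Similarity. The abstract does not specify these details, so the centroid representation, the function names, and the toy vectors are all hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def classify(sample, centroids):
    """Return the name of the variety whose centroid is most similar."""
    return max(centroids, key=lambda name: cosine_similarity(sample, centroids[name]))


# Hypothetical dynamic feature vectors, e.g. counts of three run-time events.
training = {
    "variety_a": [[5, 0, 1], [4, 1, 0]],
    "variety_b": [[0, 3, 4], [1, 4, 5]],
}
centroids = {name: centroid(vectors) for name, vectors in training.items()}

predicted = classify([6, 1, 1], centroids)
print(predicted)
```

Accuracy in this setting would be the fraction of samples whose predicted variety matches their labeled variety.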

Place, publisher, year, edition, pages
2019. p. 46
Keywords [en]
Android malware, dynamic analysis, droidbox, cuckoodroid, droidscope, mobsf, malware behavior, correlation, pearson correlation, cosine similarity, euclidean distance, chebyshev distance, mahalanobis distance, similarity analysis, static features, dynamic features, tf-idf, term frequency inverse document frequency, AMD, Android malware dataset, malware dataset, UpDroid, EC2
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:liu:diva-163090
ISRN: LIU-IDA/LITH-EX-A--19/104--SE
OAI: oai:DiVA.org:liu-163090
DiVA, id: diva2:1384850
Subject / course
Computer Engineering
Supervisors
Examiners
Available from: 2020-01-13
Created: 2020-01-11
Last updated: 2020-01-13
Bibliographically approved

Open Access in DiVA

fulltext (380 kB), 31 downloads
File information
File name: FULLTEXT01.pdf
File size: 380 kB
Checksum (SHA-512):
43e62d76cfe789eeaac93661194bcf5b77c634507f55665babcc50798fd0b379559502fba1dae9c467c1d73d57f6052def32de270c77a7b7b241ac608dafebfc
Type: fulltext
Mimetype: application/pdf

By author/editor
Regard, Viktor