On Generating Network Traffic Datasets with Synthetic Attacks for Intrusion Detection
2021 (English) In: ACM Transactions on Privacy and Security, ISSN 2471-2566, Vol. 24, no 2, article id 8
Article in journal (Refereed) Published
Abstract [en]
Most research in the field of network intrusion detection heavily relies on datasets. Datasets in this field, however, are scarce and difficult to reproduce. To compare, evaluate, and test related work, researchers usually need the same datasets, or at least datasets with characteristics similar to those used in related work. In this work, we present concepts and the Intrusion Detection Dataset Toolkit (ID2T) to alleviate the problem of reproducing datasets with desired characteristics and thereby enable an accurate replication of scientific results. ID2T facilitates the creation of labeled datasets by injecting synthetic attacks into background traffic. The injected synthetic attacks created by ID2T blend with the background traffic by mimicking the background traffic's properties. This article has three core contributions. First, we present a comprehensive survey on intrusion detection datasets. In the survey, we propose a classification to group the negative qualities found in the datasets. Second, the architecture of ID2T is revised, improved, and expanded in comparison to previous work. The architectural changes enable ID2T to inject recent and advanced attacks, such as the EternalBlue exploit or a peer-to-peer botnet. ID2T's functionality provides a set of tests, known as TIDED, that helps identify potential defects in the background traffic into which attacks are injected. Third, we illustrate how ID2T is used in different use-case scenarios to replicate scientific results with the help of reproducible datasets. ID2T is open-source software and is made available to the community to expand its arsenal of attacks and capabilities.
Place, publisher, year, edition, pages
Association for Computing Machinery, 2021. Vol. 24, no 2, article id 8
Keywords [en]
Intrusion detection systems; datasets; attack injection; synthetic dataset
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:liu:diva-174142
DOI: 10.1145/3424155
ISI: 000618201200002
OAI: oai:DiVA.org:liu-174142
DiVA, id: diva2:1537231
Note
Funding agencies: German Federal Ministry of Education and Research within National Research Center for Applied Cybersecurity ATHENE; Hessen State Ministry for Higher Education, Research and the Arts within National Research Center for Applied Cybersecurity ATHENE; Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) [251805230/GRK 2050]; research centre on Resilient Information and Control Systems - Swedish Civil Contingencies Agency (MSB)
2021-03-15; 2021-03-15; 2025-08-21; Bibliographically approved