Automatic Detection, unpacking of untagged compressed data
2026 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Automatisk detektion, uppacking av otaggad komprimerad data (Swedish)
Abstract [en]
Modern digital systems rely heavily on firmware updates that are frequently distributed as compressed binary blobs. In forensic investigations and security audits, these blobs often appear without file headers or metadata, rendering standard signature-based extraction tools ineffective. This thesis presents BinSift, a modular Python-based framework designed for the automatic detection, classification, and “blind” decompression of untagged compressed data.
To calibrate the system, a large-scale statistical analysis was conducted on the FirmSec dataset, profiling approximately 34,136 firmware images totaling over 200 GB of binary data. Results indicate that an average Shannon entropy threshold of 7.1 bits per byte provides an optimal balance for capturing modern compression formats like LZMA and SquashFS while minimizing false positives from high-density uncompressed code.
The BinSift framework was evaluated against industrial firmware samples, achieving a 59.0% success rate in “True Blind” mode without any prior knowledge of file headers. This approach maintained an 81.5% fidelity retention compared to metadata-assisted baselines. When excluding mathematically unrecoverable encrypted payloads, the effective success rate rose to 84.4%. These findings demonstrate that entropy-based stream identification and bit-level refinement are viable solutions for bypassing obfuscation in embedded systems forensics.
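The entropy-based detection step described in the abstract can be illustrated with a minimal sketch. It is not BinSift's implementation: the function names, the 4096-byte window size, and the merging of adjacent windows are assumptions made for illustration. Only the 7.1 bits-per-byte threshold is taken from the abstract.

```python
import math
from collections import Counter

# Threshold reported in the abstract; everything else here is hypothetical.
ENTROPY_THRESHOLD = 7.1  # bits per byte


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def high_entropy_regions(blob: bytes, window: int = 4096,
                         threshold: float = ENTROPY_THRESHOLD):
    """Return (start, end) byte ranges whose windows meet the threshold.

    The blob is scanned in non-overlapping windows (window size is an
    assumption); adjacent qualifying windows are merged into one range.
    """
    regions = []
    for start in range(0, len(blob), window):
        chunk = blob[start:start + window]
        if shannon_entropy(chunk) < threshold:
            continue
        end = start + len(chunk)
        if regions and regions[-1][1] == start:
            # Extend the previous range instead of opening a new one.
            regions[-1] = (regions[-1][0], end)
        else:
            regions.append((start, end))
    return regions
```

Calling `high_entropy_regions(blob)` on a firmware image would yield candidate ranges for decompression attempts. Encrypted data also scores close to 8 bits per byte, so entropy alone cannot separate compressed from encrypted payloads, which is consistent with the abstract treating encrypted payloads as a separate, unrecoverable class.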
Place, publisher, year, edition, pages
2026, p. 66
Keywords [en]
Firmware Forensics, Blind Decompression, Shannon Entropy, Embedded Systems Security, Binary Blob Analysis, Signatureless Extraction, Heuristic Stream Detection, Reverse Engineering
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:liu:diva-224113
ISRN: LITH-EX-A--26/018--SE
OAI: oai:DiVA.org:liu-224113
DiVA, id: diva2:2060854
Presentation
2026-05-13, Charles Babbage, Linköping, 14:15 (English)
Supervisors
Examiners
Available from: 2026-05-27 Created: 2026-05-19 Last updated: 2026-05-27 Bibliographically approved