Better methods and tools are needed in the fight against child pornography. This thesis presents a method for file type categorisation of unknown data fragments, a method for reassembly of JPEG fragments, and the requirements put on an artificial JPEG header for viewing reassembled images. To enable empirical evaluation of the methods a number of tools based on the methods have been implemented.
The file type categorisation method identifies JPEG fragments with a detection rate of 100% and a false positives rate of 0.1%. The method uses three algorithms, Byte Frequency Distribution (BFD), Rate of Change (RoC), and 2-grams. The algorithms are designed for different situations, depending on the requirements at hand.
The reconnection method correctly reconnects 97% of a Restart (RST) marker enabled JPEG image, fragmented into 4 KiB large pieces. When dealing with fragments from several images at once, the method is able to correctly connect 70% of the fragments at the first iteration.
Two parameters in a JPEG header are crucial to the quality of the image; the size of the image and the sampling factor (actually factors) of the image. The size can be found using brute force and the sampling factors only take on three different values. Hence it is possible to use an artificial JPEG header to view full of parts of an image. The only requirement is that the fragments contain RST markers.
The results of the evaluations of the methods show that it is possible to find, reassemble, and view JPEG image fragments with high certainty.