Timely and accurate flow classification is important for identifying flows with different service requirements, for optimizing network management, and for helping network operators run their networks at higher utilization while still providing end users with good quality of experience (QoE). With most services moving to end-to-end encryption (HTTPS and QUIC), traditional Deep Packet Inspection (DPI) and port-based approaches are no longer applicable. Furthermore, most flow-level approaches ignore the complex non-linear characteristics of internet traffic (e.g., self-similarity). To address this challenge, we present and evaluate a classification framework that combines multi-fractal feature extraction from time series data (which captures these non-linear characteristics), principal component analysis (PCA) based feature selection, and man-in-the-middle (MITM) based flow labeling. Our detailed evaluation shows that the method quickly and effectively classifies traffic belonging to the six most popular traffic types (video streaming, web browsing, social networking, audio communication, text communication, and bulk download) and distinguishes between video-on-demand (VoD) and live streaming sessions delivered by the same services. Our results show that good accuracy can be achieved using only information about the timing of the packets within a flow.
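As a rough illustration of the kind of pipeline described above, the following sketch derives scaling features from packet timing alone and then reduces them with PCA. It is not the framework evaluated in the paper. The function names (`interarrival_times`, `structure_function_exponents`, `pca_project`) are hypothetical, the structure-function exponents are a simple stand-in for a full multi-fractal analysis, PCA is used here as a projection rather than as the paper's feature selection step, and the packet timestamps are synthetic.

```python
import numpy as np

def interarrival_times(timestamps):
    """Inter-arrival times (seconds) from a flow's packet timestamps."""
    ts = np.sort(np.asarray(timestamps, dtype=float))
    return np.diff(ts)

def structure_function_exponents(series, qs=(1.0, 2.0, 3.0), scales=(1, 2, 4, 8)):
    """Estimate scaling exponents zeta(q) for a time series.

    For each moment order q, fit the log-log slope of
    S_q(s) = mean(|X[t+s] - X[t]|^q) against the scale s, where X is the
    cumulative sum of the mean-removed series. A zeta(q) that is
    non-linear in q is a common indicator of multi-fractal behaviour.
    """
    series = np.asarray(series, dtype=float)
    x = np.cumsum(series - series.mean())
    log_s = np.log(scales)
    zetas = []
    for q in qs:
        # The tiny constant guards against log(0) for a constant series.
        log_sq = [np.log(np.mean(np.abs(x[s:] - x[:-s]) ** q) + 1e-12)
                  for s in scales]
        zetas.append(np.polyfit(log_s, log_sq, 1)[0])  # slope of the fit
    return np.array(zetas)

def pca_project(features, n_components):
    """Project a (flows x features) matrix onto its top principal components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Synthetic example: six flows with exponentially distributed packet gaps.
rng = np.random.default_rng(0)
flows = [np.cumsum(rng.exponential(0.01, size=500)) for _ in range(6)]
features = np.vstack([structure_function_exponents(interarrival_times(f))
                      for f in flows])
projected = pca_project(features, n_components=2)
```

In a real setting, the rows of `projected` would be passed to a classifier together with labels obtained separately (in the paper, through MITM-based labeling). The synthetic flows here carry no labels and serve only to exercise the code.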
Funding Agencies: Swedish Research Council (VR)