2507.06430

Total: 1

#1 One task to rule them all: A closer look at traffic classification generalizability [PDF] [Copy] [Kimi] [REL]

Authors: Elham Akbari, Zihao Zhou, Mohammad Ali Salahuddin, Noura Limam, Raouf Boutaba, Bertrand Mathieu, Stephanie Moteau, Stephane Tuffin

Existing website fingerprinting and traffic classification solutions do not work well when the evaluation context changes, as their performances often heavily rely on context-specific assumptions. To clarify this problem, we take three prior solutions presented for different but similar traffic classification and website fingerprinting tasks, and apply each solution's model to another solution's dataset. We pinpoint dataset-specific and model-specific properties that lead each of them to overperform in their specific evaluation context. As a realistic evaluation context that takes practical labeling constraints into account, we design an evaluation framework using two recent real-world TLS traffic datasets from large-scale networks. The framework simulates a futuristic scenario in which SNIs are hidden in some networks but not in others, and the classifier's goal is to predict destination services in one network's traffic, having been trained on a labelled dataset collected from a different network. Our framework has the distinction of including real-world distribution shift, while excluding concept drift. We show that, even when abundant labeled data is available, the best solutions' performances under distribution shift are between 30% and 40%, and a simple 1-Nearest Neighbor classifier's performance is not far behind. We depict all performances measured on different models, not just the best ones, for a fair representation of traffic models in practice.

Subject: Networking and Internet Architecture

Publish: 2025-07-08 22:22:50 UTC