2407.02856

Early-Stage Anomaly Detection: A Study of Model Performance on Complete vs. Partial Flows

Authors: Adrian Pekar, Richard Jozsa

This study investigates the efficacy of machine learning models in network security threat detection through the critical lens of partial versus complete flow information, addressing a common gap between research settings and real-time operational needs. We systematically evaluate how a standard benchmark model, Random Forest, performs under varying training and testing conditions (complete/complete, partial/partial, complete/partial), quantifying the performance impact when dealing with the incomplete data typical in real-time environments. Our findings demonstrate a significant performance difference, with precision and recall dropping by up to 30% under certain conditions when models trained on complete flows are tested against partial flows. The study also reveals that, for the evaluated dataset and model, a minimum threshold around 7 packets in the test set appears necessary for maintaining reliable detection rates, providing valuable, quantified insights for developing more realistic real-time detection strategies.
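The distinction between complete and partial flows can be illustrated with a small sketch. The abstract does not state the feature set or how flows are truncated, so everything here is an assumption for illustration: a flow is represented as a list of `(timestamp_seconds, size_bytes)` tuples, the helper name `flow_features` is hypothetical, and the four statistics are generic placeholders rather than the authors' features. The cutoff of 7 packets simply mirrors the threshold mentioned in the abstract.

```python
from statistics import mean


def flow_features(packets, max_packets=None):
    """Compute simple per-flow statistics.

    packets: list of (timestamp_seconds, size_bytes) tuples in arrival order.
    max_packets: if given, only the first max_packets packets are used,
                 emulating a partial flow observed in real time.
    """
    if not packets:
        raise ValueError("flow has no packets")
    if max_packets is not None and max_packets < 1:
        raise ValueError("max_packets must be at least 1")

    # A complete flow uses every packet; a partial flow uses only a prefix.
    seen = packets if max_packets is None else packets[:max_packets]
    sizes = [size for _, size in seen]

    return {
        "packet_count": len(seen),
        "total_bytes": sum(sizes),
        "mean_size": mean(sizes),
        "duration": seen[-1][0] - seen[0][0],
    }


# Illustrative, made-up flow of 10 packets (not taken from any dataset).
flow = [
    (0.00, 60), (0.01, 60), (0.02, 52), (0.05, 1500), (0.06, 1500),
    (0.07, 1500), (0.09, 1500), (0.12, 400), (0.30, 52), (0.31, 52),
]

complete = flow_features(flow)                # all 10 packets
partial = flow_features(flow, max_packets=7)  # first 7 packets only

print("complete:", complete)
print("partial: ", partial)
```

The same flow yields different feature values depending on how much of it has been observed, which is why a model trained on complete-flow features may see shifted inputs when tested on partial flows.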

Subjects: Machine Learning, Cryptography and Security

Publish: 2024-07-03 07:14:25 UTC