2505.07998


Vision Foundation Model Embedding-Based Semantic Anomaly Detection

Authors: Max Peter Ronecker, Matthew Foutter, Amine Elhafsi, Daniele Gammelli, Ihor Barakaiev, Marco Pavone, Daniel Watzenig

Semantic anomalies are contextually invalid or unusual combinations of familiar visual elements that can cause undefined behavior and failures in system-level reasoning for autonomous systems. This work explores semantic anomaly detection by leveraging the semantic priors of state-of-the-art vision foundation models, operating directly on the image. We propose a framework that compares local vision embeddings from runtime images to a database of nominal scenarios in which the autonomous system is deemed safe and performant. We consider two variants of the proposed framework: one using raw grid-based embeddings, and another leveraging instance segmentation for object-centric representations. To improve robustness, we introduce a simple filtering mechanism to suppress false positives. Our evaluations on CARLA-simulated anomalies show that the instance-based method with filtering achieves performance comparable to GPT-4o, while providing precise anomaly localization. These results highlight the potential utility of vision embeddings from foundation models for real-time anomaly detection in autonomous systems.
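
The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the distance measure, the threshold, or how the filter works. The sketch assumes cosine similarity to the nearest nominal embedding as the score, and stands in a hypothetical count-based filter (`min_count`) for the paper's filtering mechanism. The function names, the threshold value, and the toy 4-D embeddings are all illustrative assumptions; in practice the embeddings would come from a vision foundation model, either per grid patch or per segmented instance.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def anomaly_scores(runtime_emb, nominal_db):
    """Score each runtime embedding by its distance to the nominal database.

    Assumption: score = 1 - (highest cosine similarity to any nominal entry).
    runtime_emb: (n_local, dim) local embeddings from one runtime image.
    nominal_db:  (n_nominal, dim) embeddings collected from nominal scenes.
    """
    q = l2_normalize(np.asarray(runtime_emb, dtype=float))
    db = l2_normalize(np.asarray(nominal_db, dtype=float))
    sims = q @ db.T                      # (n_local, n_nominal) cosine similarities
    return 1.0 - sims.max(axis=1)        # 0 means an identical direction exists


def flag_anomalies(scores, threshold=0.5, min_count=2):
    """Hypothetical filter: keep detections only if enough regions exceed the threshold.

    This is a stand-in for the paper's filtering step, whose details are not
    given in the abstract.
    """
    mask = scores > threshold
    if mask.sum() < min_count:
        return np.zeros_like(mask)       # isolated hits are treated as false positives
    return mask


# Toy data: three nominal directions in a 4-D embedding space.
nominal_db = np.eye(4)[:3]

# One familiar region and two regions pointing in an unseen direction.
runtime_emb = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 2.0],
])

scores = anomaly_scores(runtime_emb, nominal_db)
flags = flag_anomalies(scores)

# A single outlying region is suppressed by the count-based filter.
single = flag_anomalies(anomaly_scores(runtime_emb[:2], nominal_db))
```

Because each score is tied to one local embedding, the flagged indices point back to the grid cell or object instance that triggered the detection, which is the localization property the abstract highlights.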

Subjects: Computer Vision and Pattern Recognition, Machine Learning

Publish: 2025-05-12 19:00:29 UTC