2502.14740

YOLOv12: A Breakdown of the Key Architectural Features

Authors: Mujadded Al Rabbani Alif, Muhammad Hussain

This paper presents an architectural analysis of YOLOv12, a significant advancement in single-stage, real-time object detection that builds on the strengths of its predecessors while introducing key improvements. The model incorporates an optimised backbone (R-ELAN), 7x7 separable convolutions, and FlashAttention-driven area-based attention, which improve feature extraction, efficiency, and detection robustness. With multiple model variants, like its predecessors, YOLOv12 offers scalable solutions for both latency-sensitive and high-accuracy applications. Experimental results show consistent gains in mean average precision (mAP) and inference speed, making YOLOv12 a compelling choice for applications in autonomous systems, security, and real-time analytics. By balancing computational efficiency and performance, YOLOv12 sets a new benchmark for real-time computer vision and facilitates deployment across diverse hardware platforms, from edge devices to high-performance clusters.

Subjects: Computer Vision and Pattern Recognition, Artificial Intelligence

Published: 2025-02-20 17:08:43 UTC