26946@AAAI


Lightweight Transformer for Multi-Modal Object Detection (Student Abstract)

Authors: Yue Cao ; Yanshuo Fan ; Junchi Bin ; Zheng Liu

Many perception systems now integrate information from multiple sensors to improve the accuracy of object detection. For example, autonomous vehicles use visible-light and infrared (IR) information so that the car can cope with complex weather conditions. However, higher accuracy usually comes at the cost of greater computational complexity and memory consumption. In this study, we evaluate the performance and complexity of different fusion operators in multi-modal object detection tasks. In addition, we propose a Poolformer-based fusion operator (PoolFuser) that improves detection accuracy without compromising the efficiency of the detection framework.
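The abstract does not describe how PoolFuser is built, so the following is only a minimal sketch of what a pooling-based fusion operator could look like, next to a plain element-wise addition baseline. It assumes single-channel feature maps stored as nested lists, a 3x3 average pool with stride 1, and the "pooled map minus input" token mixer used in PoolFormer-style blocks. The function names (`fuse_add`, `fuse_pool`) and the `scale` parameter are hypothetical and are not taken from the paper.

```python
def avg_pool_3x3(fmap):
    """3x3 average pooling with stride 1; border cells average only in-bounds neighbours."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [
                fmap[y][x]
                for y in range(max(0, i - 1), min(h, i + 2))
                for x in range(max(0, j - 1), min(w, j + 2))
            ]
            out[i][j] = sum(vals) / len(vals)
    return out


def pool_token_mixer(fmap):
    """PoolFormer-style mixer: the pooled map minus the input map."""
    pooled = avg_pool_3x3(fmap)
    return [[p - v for p, v in zip(prow, vrow)] for prow, vrow in zip(pooled, fmap)]


def fuse_add(visible, ir):
    """Baseline fusion operator: element-wise addition of the two modalities."""
    return [[a + b for a, b in zip(vrow, irow)] for vrow, irow in zip(visible, ir)]


def fuse_pool(visible, ir, scale=1.0):
    """Hypothetical pooling-based fusion (an assumption, not the paper's PoolFuser):
    add the modalities, then apply the pooling mixer through a scaled residual."""
    summed = fuse_add(visible, ir)
    mixed = pool_token_mixer(summed)
    return [
        [s + scale * m for s, m in zip(srow, mrow)]
        for srow, mrow in zip(summed, mixed)
    ]


if __name__ == "__main__":
    visible = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
    ir = [[0.0] * 3 for _ in range(3)]
    for row in fuse_pool(visible, ir):
        print(row)
```

The pooling mixer has no learnable parameters and costs a fixed number of additions per cell, which is the kind of property that makes pooling attractive when efficiency matters. A real fusion module would also operate on multi-channel tensors and would typically include normalization and learned projections, all of which are omitted here.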