2601.01698


Hidden costs for inference with deep network on embedded system devices

Authors: Chankyu Lee, Woohyun Choi, Sangwook Park

This study evaluates the inference performance of various deep learning models in an embedded system environment. Previous work typically uses the number of Multiply-Accumulate (MAC) operations to measure the computational load of a deep model. This study shows, however, that the metric is limited as an estimator of inference time on embedded devices, and asks what is overlooked when cost is expressed only in terms of MAC operations. In the experiments, an image classification task on the CIFAR-100 dataset is run on an embedded device, and the measured inference times of ten deep models are compared with each model's theoretically calculated MAC count. The results highlight the importance of accounting for additional computations between tensors when optimizing deep learning models for real-time performance on embedded systems.
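To illustrate what a Multiply-Accumulate (MAC) count captures and what it leaves out, here is a minimal sketch in Python. It is not the authors' code: the function names, the residual-block layout, and the 64-channel 32x32 feature map are assumptions chosen only to make the arithmetic concrete. The point is that element-wise tensor operations such as activations and residual additions contribute zero MACs, yet still read and write every element of the tensor.

```python
def conv2d_macs(c_in, c_out, kernel, h_out, w_out, groups=1):
    """Theoretical MACs of a 2D convolution.

    Each output element needs one multiply-accumulate per weight in its
    receptive field: (c_in / groups) * kernel * kernel.
    """
    return (c_in // groups) * kernel * kernel * c_out * h_out * w_out


def linear_macs(in_features, out_features):
    """Theoretical MACs of a fully connected layer (bias ignored)."""
    return in_features * out_features


def elementwise_ops(channels, height, width):
    """Element count touched by one element-wise tensor op.

    ReLU, residual add, and similar ops add nothing to the MAC count,
    but each still traverses the whole tensor.
    """
    return channels * height * width


# Hypothetical residual block (assumption, not taken from the paper):
# two 3x3 convolutions, two ReLUs, and one residual addition
# on a 64-channel 32x32 feature map.
block_macs = 2 * conv2d_macs(64, 64, 3, 32, 32)
block_uncounted = 3 * elementwise_ops(64, 32, 32)

# Hypothetical classifier head for 100 classes (CIFAR-100) on 64 features.
head_macs = linear_macs(64, 100)

if __name__ == "__main__":
    print("block MACs:", block_macs)
    print("element-wise ops not counted as MACs:", block_uncounted)
    print("classifier head MACs:", head_macs)
```

The uncounted operations are small in number next to the convolution MACs, but they are usually memory-bound, so their share of wall-clock time on an embedded device can be much larger than their share of arithmetic. Quantifying that gap requires timing on the target hardware.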

Subjects: Computational Complexity, Machine Learning

Publish: 2026-01-05 00:18:51 UTC