2504.03451

Total: 1

#1 NDFT: Accelerating Density Functional Theory Calculations via Hardware/Software Co-Design on Near-Data Computing System [PDF1] [Copy] [Kimi] [REL]

Authors: Qingcai Jiang, Buxin Tu, Xiaoyu Hao, Junshi Chen, Hong An

Linear-response time-dependent Density Functional Theory (LR-TDDFT) is a widely used method for accurately predicting the excited-state properties of physical systems. Previous works have attempted to accelerate LR-TDDFT using heterogeneous systems such as GPUs, FPGAs, and the Sunway architecture. However, a major drawback of these approaches is the constant data movement between host memory and the memory of the heterogeneous systems, which results in substantial \textit{data movement overhead}. Moreover, these works focus primarily on optimizing the compute-intensive portions of LR-TDDFT, despite the fact that the calculation steps are fundamentally \textit{memory-bound}. To address these challenges, we propose NDFT, a \underline{N}ear-\underline{D}ata Density \underline{F}unctional \underline{T}heory framework. Specifically, we design a novel task partitioning and scheduling mechanism to offload each part of LR-TDDFT to the most suitable computing units within a CPU-NDP system. Additionally, we implement a hardware/software co-optimization of a critical kernel in LR-TDDFT to further enhance performance on the CPU-NDP system. Our results show that NDFT achieves performance improvements of 5.2x and 2.5x over CPU and GPU baselines, respectively, on a large physical system.

Subjects: Hardware Architecture , Computational Physics

Publish: 2025-04-04 13:51:24 UTC