Total: 1
High Performance Computing (HPC) on hybrid clusters represents a significant opportunity for Computational Fluid Dynamics (CFD), especially when modern accelerators are utilized effectively. However, despite the widespread adoption of GPUs, programmability remains a challenge, particularly in open-source contexts. In this paper, we present SPUMA, a full GPU porting of OPENFOAM targeting NVIDIA and AMD GPUs. The implementation strategy is based on a portable programming model and the adoption of a memory pool manager that leverages the unified memory feature of modern GPUs. This approach is discussed alongside several numerical tests conducted on two pre-exascale clusters in Europe, LUMI and Leonardo, which host AMD MI250X and NVIDIA A100 GPUs, respectively. In the performance analysis section, we present results related to memory usage profiling and kernel wall-time, the impact of the memory pool, and energy consumption obtained by simulating the well-known DrivAer industrial test case. GPU utilization strongly affects strong scalability results, reaching 65% efficiency on both LUMI and Leonardo when approaching a load of 8 million cells per GPU. Weak scalability results, obtained on 20 GPUs with the OpenFOAM native multigrid solver, range from 75% on Leonardo to 85% on LUMI. Notably, efficiency is no lower than 90% when switching to the NVIDIA AmgX linear algebra solver. Our tests also reveal that one A100 GPU on Leonardo is equivalent 200-300 Intel Sapphire Rapids cores, provided the GPUs are sufficiently oversubscribed (more than 10 million of cells per GPU). Finally, energy consumption is reduced by up to 82% compared to analogous simulations executed on CPUs.