mOpNrrV2zH@OpenReview

CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph

Authors: Haitao Lin, Guojiang Zhao, Odin Zhang, Yufei Huang, Lirong Wu, Cheng Tan, Zicheng Liu, Zhifeng Gao, Stan Z Li

Structure-based drug design (SBDD) aims to generate potential drugs that bind to a target protein, and generative AI models have greatly accelerated it. However, a systematic understanding of these methods is still lacking because of diverse settings, complex implementations, difficult reproducibility, and a narrow focus on a single task. First, the absence of standardization can lead to unfair comparisons and inconclusive insights. To address this, we propose CBGBench, a comprehensive benchmark for SBDD that unifies the task as generative heterogeneous graph completion, analogous to filling in the blanks of the 3D complex binding graph. By categorizing existing methods according to their attributes, CBGBench provides a modular and extensible framework that implements various cutting-edge methods. Second, a single de novo molecule generation task can hardly reflect the capabilities of these models. To broaden the scope, we adapt the models to a range of tasks essential in drug design, treated as sub-tasks of the graph fill-in-the-blank task. These tasks include the generative design of de novo molecules, linkers, fragments, scaffolds, and sidechains, all conditioned on the structures of protein pockets. Our evaluations are conducted fairly and cover interaction, chemical properties, geometry authenticity, and substructure validity. We further provide in-depth analysis of the empirical studies. Our results indicate room for improvement on many tasks, in network architecture optimization, and in the incorporation of chemical prior knowledge. To lower the barrier to entry and facilitate further development in the field, we also provide a unified codebase (supplementary) that includes the discussed state-of-the-art models, data pre-processing, training, sampling, and evaluation.
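
The fill-in-the-blank framing can be illustrated with a minimal sketch. It is not taken from the CBGBench codebase: the `Node` class, the `blank_mask` function, and the toy atom indices are all hypothetical names chosen for illustration. The sketch assumes that a complex is a list of typed nodes, that protein pocket atoms always serve as fixed context, and that each task differs only in which ligand atoms are left blank for the model to generate.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Node:
    """One atom in a toy protein-ligand complex graph (hypothetical structure)."""
    index: int
    kind: str      # "protein" or "ligand"
    element: str


def blank_mask(nodes: List[Node],
               blank_ligand_indices: Optional[Set[int]] = None) -> List[bool]:
    """Return True for every node the generative model would have to fill in.

    Assumption: protein atoms are never blanked. If no index set is given,
    every ligand atom is blanked, which corresponds to de novo generation.
    Otherwise only the listed ligand atoms (e.g. a linker) are blanked.
    """
    mask = []
    for node in nodes:
        if node.kind != "ligand":
            mask.append(False)                      # pocket atoms are context
        elif blank_ligand_indices is None:
            mask.append(True)                       # de novo: blank all ligand atoms
        else:
            mask.append(node.index in blank_ligand_indices)
    return mask


# Toy complex: two pocket atoms followed by a four-atom ligand.
complex_nodes = [
    Node(0, "protein", "N"),
    Node(1, "protein", "C"),
    Node(2, "ligand", "C"),
    Node(3, "ligand", "C"),
    Node(4, "ligand", "O"),
    Node(5, "ligand", "C"),
]

de_novo_mask = blank_mask(complex_nodes)
# Hypothetical linker task: atoms 3 and 4 join the two kept fragments (2 and 5).
linker_mask = blank_mask(complex_nodes, blank_ligand_indices={3, 4})
```

Under these assumptions, the scaffold, fragment, and sidechain tasks would differ from the linker case only in which ligand indices are passed as blanks; how a real molecule is decomposed into those substructures is outside this sketch.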

Subject: ICLR.2025 - Spotlight