DIA-CLIP: a universal representation learning framework for zero-shot DIA proteomics


Authors: Yucheng Liao, Han Wen, Weinan E, Weijie Zhang

Data-independent acquisition mass spectrometry (DIA-MS) has established itself as a cornerstone of proteomic profiling and large-scale systems biology, offering unparalleled depth and reproducibility. Current DIA analysis frameworks, however, require semi-supervised training within each run for peptide-spectrum match (PSM) re-scoring. This approach is prone to overfitting and lacks generalizability across diverse species and experimental conditions. Here, we present DIA-CLIP, a pre-trained model that shifts the DIA analysis paradigm from semi-supervised training to universal cross-modal representation learning. By integrating a dual-encoder contrastive learning framework with an encoder-decoder architecture, DIA-CLIP establishes a unified cross-modal representation for peptides and their corresponding spectral features, achieving high-precision, zero-shot PSM inference. Extensive evaluations across diverse benchmarks demonstrate that DIA-CLIP consistently outperforms state-of-the-art tools, yielding up to a 45% increase in protein identification while achieving a 12% reduction in entrapment identifications. Moreover, DIA-CLIP holds immense potential for diverse practical applications, such as single-cell and spatial proteomics, where its enhanced identification depth facilitates the discovery of novel biomarkers and the elucidation of intricate cellular mechanisms.
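To make the dual-encoder contrastive idea concrete, here is a minimal sketch of a generic CLIP-style symmetric contrastive objective over paired peptide and spectrum embeddings. It is an illustration only: the abstract does not specify DIA-CLIP's actual loss, encoders, or temperature, so the function names (`clip_style_loss`, `cosine_scores`), the temperature value of 0.07, and the toy embeddings are all assumptions, and the encoders themselves are not modeled.

```python
import math

def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit length (eps guards against a zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]

def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def cosine_scores(peptide_emb, spectrum_emb):
    """Cosine similarity between every peptide and every spectrum embedding.

    In a zero-shot setting, a score like this could rank candidate PSMs
    without any per-run training (hypothetical usage, not the paper's API).
    """
    p = [l2_normalize(v) for v in peptide_emb]
    s = [l2_normalize(v) for v in spectrum_emb]
    return [[sum(a * b for a, b in zip(pi, sj)) for sj in s] for pi in p]

def clip_style_loss(peptide_emb, spectrum_emb, temperature=0.07):
    """Symmetric InfoNCE loss; row i of each input is assumed to be a true pair."""
    sims = cosine_scores(peptide_emb, spectrum_emb)
    n = len(sims)
    logits = [[x / temperature for x in row] for row in sims]
    # Peptide -> spectrum direction: each row should pick its own column.
    loss_p2s = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Spectrum -> peptide direction: each column should pick its own row.
    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_s2p = -sum(log_softmax(cols[j])[j] for j in range(n)) / n
    return 0.5 * (loss_p2s + loss_s2p)

# Toy embeddings (assumed values): matched pairs versus swapped pairs.
peptides = [[1.0, 0.0], [0.0, 1.0]]
matched_spectra = [[2.0, 0.0], [0.0, 3.0]]
swapped_spectra = [[0.0, 3.0], [2.0, 0.0]]

aligned_loss = clip_style_loss(peptides, matched_spectra)
misaligned_loss = clip_style_loss(peptides, swapped_spectra)
```

In this sketch the loss is near zero when each peptide embedding points in the same direction as its paired spectrum embedding, and large when the pairs are swapped, which is the behavior a contrastive objective of this kind rewards during pre-training.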

Subjects: Machine Learning, Artificial Intelligence, Quantitative Methods

Publish: 2026-02-02 07:55:24 UTC

2602.01772
