ProTransformer: Robustify Transformers via Plug-and-Play Paradigm

Authors: Zhichao Hou, Weizhi Gao, Yuchen Shen, Feiyi Wang, Xiaorui Liu

Transformer-based architectures have dominated various areas of machine learning in recent years. In this paper, we introduce a novel robust attention mechanism designed to enhance the resilience of transformer-based architectures. Crucially, this technique can be integrated into existing transformers as a plug-and-play layer, improving their robustness without the need for additional training or fine-tuning. Through comprehensive experiments and ablation studies, we demonstrate that our ProTransformer significantly enhances the robustness of transformer models across a variety of prediction tasks, attack mechanisms, backbone architectures, and data domains. Notably, without further fine-tuning, the ProTransformer consistently improves the performance of vanilla transformers by 19.5%, 28.3%, 16.1%, and 11.4% for BERT, ALBERT, DistilBERT, and RoBERTa, respectively, under the classical TextFooler attack. Furthermore, ProTransformer shows promising resilience in large language models (LLMs) against prompting-based attacks, improving the performance of T5 and LLaMA by 24.8% and 17.8%, respectively, and enhancing Vicuna by an average of 10.4% against the Jailbreaking attack. Beyond the language domain, ProTransformer also demonstrates outstanding robustness in both vision and graph domains.

Subjects: Machine Learning, Computation and Language, Cryptography and Security

Publish: 2024-10-30 16:38:09 UTC

arXiv: 2410.23182
