o2y6BS6mm0@OpenReview


Don’t Forget the Enjoin: FocalLoRA for Instruction Hierarchical Alignment in Large Language Models

Authors: Zitong Shi, Guancheng Wan, Haixin Wang, Ruoyan Li, Zijie Huang, Wanjia Zhao, Yijia Xiao, Xiao Luo, Carl Yang, Yizhou Sun, Wei Wang

Recent studies reveal that large language models (LLMs) often struggle to resolve conflicting instructions embedded within hierarchical prompts, leading to decreased compliance with system-level directives and compromised reliability in safety-critical applications. Earlier approaches attempt to improve instruction-hierarchy awareness through prompt engineering or embedding-level modifications, but they typically lack structural modeling and either offer limited gains or require extensive fine-tuning. In this work, we introduce $\textbf{FocalLoRA}$, a parameter-efficient and structure-aware framework that strengthens hierarchical instruction adherence by selectively optimizing structurally critical attention heads, referred to as $\textit{focal heads}$, which exhibit heightened sensitivity to instruction conflicts. Experiments across multiple models and a dedicated benchmark demonstrate that FocalLoRA markedly enhances system instruction compliance at minimal tuning cost: on Llama-8B, for instance, fine-tuning only 0.0188\% of the parameters yields a 35.52\% increase in system instruction compliance.
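
The abstract does not spell out the implementation, but the core idea, attaching low-rank adapters only to conflict-sensitive attention heads while freezing the rest of the model, can be illustrated with a minimal PyTorch sketch. The gradient-norm head-scoring heuristic, the `select_focal_heads` helper, and the `FocalLoRALinear` wrapper below are illustrative assumptions for exposition, not the authors' released method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def select_focal_heads(proj: nn.Linear, n_heads: int, conflict_loss: torch.Tensor, top_k: int):
    """Hypothetical scoring rule: rank heads by the gradient norm that a loss measured on
    conflicting system/user instructions induces on their slice of the projection weight."""
    proj.weight.grad = None
    conflict_loss.backward()
    per_head = proj.weight.grad.view(n_heads, -1).norm(dim=1)  # one score per head
    return per_head.topk(top_k).indices.tolist()

class FocalLoRALinear(nn.Module):
    """Frozen base projection plus rank-r LoRA updates restricted to focal heads' output slices."""
    def __init__(self, base: nn.Linear, n_heads: int, focal_heads, r: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                       # freeze the pretrained weights
        self.head_dim = base.out_features // n_heads
        self.focal_heads = [int(h) for h in focal_heads]
        self.scaling = alpha / r
        self.A = nn.ParameterDict({str(h): nn.Parameter(0.01 * torch.randn(r, base.in_features))
                                   for h in self.focal_heads})
        self.B = nn.ParameterDict({str(h): nn.Parameter(torch.zeros(self.head_dim, r))
                                   for h in self.focal_heads})  # zero-init: no change at start

    def forward(self, x):
        out = self.base(x)
        for h in self.focal_heads:
            delta = (x @ self.A[str(h)].T @ self.B[str(h)].T) * self.scaling
            left = h * self.head_dim
            right = self.base.out_features - left - self.head_dim
            out = out + F.pad(delta, (left, right))       # update only this head's slice
        return out

# Toy usage: a single query projection with 4 heads; the squared-activation loss is a
# stand-in for a real loss computed on prompts with conflicting instructions.
d_model, n_heads = 32, 4
q_proj = nn.Linear(d_model, d_model)
x = torch.randn(2, 5, d_model)
conflict_loss = q_proj(x).pow(2).mean()
focal = select_focal_heads(q_proj, n_heads, conflict_loss, top_k=1)
q_proj = FocalLoRALinear(q_proj, n_heads, focal, r=4)
trainable = sum(p.numel() for p in q_proj.parameters() if p.requires_grad)
total = sum(p.numel() for p in q_proj.parameters())
print(f"focal heads: {focal}, trainable fraction: {trainable / total:.4%}")
```

Restricting the adapters to a handful of head slices is what keeps the trainable-parameter fraction tiny in this reading; in a full model the same wrapper would be applied to the attention projections of each layer whose heads score as focal.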

Subject: NeurIPS.2025 - Poster