2025.emnlp-main.1520@ACL

#1 Cross-MoE: An Efficient Temporal Prediction Framework Integrating Textual Modality

Authors: Ruizheng Huang, Zhicheng Zhang, Yong Wang

It has been demonstrated that incorporating external information as a textual modality can effectively improve time series forecasting accuracy. However, current multi-modal models ignore the dynamic and differing relations between time series patterns and textual features, which leads to poor performance in temporal-textual feature fusion. In this paper, we propose a lightweight and model-agnostic temporal-textual fusion framework named Cross-MoE. It replaces Cross Attention with Cross-Ranker to reduce computational complexity, and enhances modality-aware correlation memorization with Mixture-of-Experts (MoE) networks to tolerate distributional shifts in time series. The experimental results demonstrate an 8.78% average reduction in Mean Squared Error (MSE) compared to the SOTA multi-modal time series framework. Notably, our method requires only 75% of the computational overhead and 12.5% of the activated parameters compared with the Cross Attention mechanism. Our code is available at https://github.com/Kilosigh/Cross-MoE.git
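
To illustrate the general idea of replacing full cross attention with a lightweight expert-ranking fusion, the sketch below shows a minimal MoE-style temporal-textual fusion module in PyTorch. It is a hypothetical illustration, not the authors' Cross-Ranker implementation; the class name CrossRankerFusion, the dimensions, and parameters such as num_experts and top_k are assumptions made for this example.

# Hypothetical sketch of MoE-based temporal-textual fusion (illustrative only;
# not the authors' Cross-MoE / Cross-Ranker code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossRankerFusion(nn.Module):
    """Fuse a time-series embedding with a text embedding by ranking and
    activating a few experts instead of computing full cross attention."""
    def __init__(self, d_series, d_text, d_model=128, num_experts=8, top_k=1):
        super().__init__()
        self.series_proj = nn.Linear(d_series, d_model)
        self.text_proj = nn.Linear(d_text, d_model)
        # Lightweight ranker: scores each expert from the concatenated modalities.
        self.ranker = nn.Linear(2 * d_model, num_experts)
        # Each expert is a small feed-forward network over the fused representation.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(2 * d_model, d_model), nn.GELU(), nn.Linear(d_model, d_model))
            for _ in range(num_experts)
        ])
        self.top_k = top_k
        self.d_model = d_model

    def forward(self, series_emb, text_emb):
        # series_emb: (batch, d_series), text_emb: (batch, d_text)
        fused = torch.cat([self.series_proj(series_emb), self.text_proj(text_emb)], dim=-1)
        scores = self.ranker(fused)                       # (batch, num_experts)
        top_val, top_idx = scores.topk(self.top_k, dim=-1)
        gate = F.softmax(top_val, dim=-1)                 # weights over selected experts only
        out = torch.zeros(fused.size(0), self.d_model, device=fused.device)
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = top_idx[:, k] == e
                if mask.any():
                    out[mask] += gate[mask, k].unsqueeze(-1) * self.experts[e](fused[mask])
        return out  # fused temporal-textual representation

# Usage: fuse per-sample embeddings before a forecasting head.
fusion = CrossRankerFusion(d_series=64, d_text=768)
y = fusion(torch.randn(4, 64), torch.randn(4, 768))
print(y.shape)  # torch.Size([4, 128])

Because only top_k of the num_experts sub-networks are evaluated per sample, the activated parameter count and compute scale with the number of selected experts rather than with a full attention map, which is the kind of saving the abstract reports.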

Subject: EMNLP.2025 - Main