FigEx: Aligned Extraction of Scientific Figures and Captions

2025.findings-emnlp.899@ACL

Authors: Jifeng Song, Arun Das, Ge Cui, Yufei Huang

Automatic understanding of figures in scientific papers is challenging because they often contain subfigures and subcaptions in complex layouts. In this paper, we propose FigEx, a vision-language model that extracts aligned pairs of subfigures and subcaptions from scientific papers. We also release BioSci-Fig, a curated dataset of 7,174 compound figures with annotated subfigure bounding boxes and aligned subcaptions. On BioSci-Fig, FigEx improves subfigure detection APb over Grounding DINO by 0.023 and boosts caption separation BLEU over Llama-2-13B by 0.465. The source code is available at: https://github.com/Huang-AI4Medicine-Lab/FigEx.
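
As a rough illustration of what a bounding-box detection score builds on, the sketch below computes intersection over union (IoU) between a predicted and an annotated subfigure box. It assumes that APb denotes box average precision, which is conventionally derived from IoU matches; the abstract does not spell this out. The `Box` layout and the `iou` helper are hypothetical names chosen for this example and are not taken from the FigEx code base.

```python
from typing import Tuple

# Hypothetical box layout for illustration: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Return the intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero so that disjoint boxes contribute no intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Degenerate (zero-area) inputs yield 0.0 rather than a division error.
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0.0, 0.0, 2.0, 2.0)
    annotated = (1.0, 1.0, 3.0, 3.0)
    print(f"IoU = {iou(predicted, annotated):.3f}")
```

A detection is typically counted as correct when its IoU with an annotated box exceeds a chosen threshold; average precision then summarizes precision across recall levels. The exact thresholds used for BioSci-Fig are not stated in the abstract.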

Subject: EMNLP.2025 - Findings