Probability Distribution Collapse: A Critical Bottleneck to Compact Unsupervised Neural Grammar Induction

2025.emnlp-main.1694@ACL

Unsupervised neural grammar induction aims to learn interpretable hierarchical structures from language data. However, existing models face an expressiveness bottleneck, often resulting in unnecessarily large yet underperforming grammars. We identify a core issue, *probability distribution collapse*, as the underlying cause of this limitation. We analyze when and how the collapse emerges across key components of neural parameterization and introduce a targeted solution, *collapse-relaxing neural parameterization*, to mitigate it. Our approach substantially improves parsing performance while enabling the use of significantly more compact grammars across a wide range of languages, as demonstrated through extensive empirical analysis.

Subject: EMNLP.2025 - Main