2512.10847

Total: 1

#1 Large Language Models for Superconductor Discovery [PDF] [Copy] [Kimi] [REL]

Authors: Suman Itani, Yibo Zhang, Ranjit Itani, Jiadong Zang

Large language models (LLMs) offer new opportunities for automated data extraction and property prediction across materials science, yet their use in superconductivity research remains limited. Here we construct a large experimental database of 78,203 records, covering 19,058 unique compositions, extracted from scientific literature using an LLM-driven workflow. Each entry includes chemical composition, critical temperature, measurement pressure, structural descriptors, and critical fields. We fine-tune several open-source LLMs for three tasks: (i) classifying superconductors vs. non-superconductors, (ii) predicting the superconducting transition temperature directly from composition or structure-informed inputs, and (iii) inverse design of candidate compositions conditioned on target Tc. The fine-tuned LLMs achieve performance comparable to traditional feature-based models and in some cases exceed them, while substantially outperforming their base versions and capturing meaningful chemical and structural trends. The inverse-design model generates chemically plausible compositions, including 28% novel candidates not seen in training. Finally, applying the trained predictors to the GNoME database identifies unreported materials with predicted Tc > 10 K. Although unverified, these candidates illustrate how integrating an LLM-driven workflow can enable scalable hypothesis generation for superconductivity discovery.

Subjects: Materials Science , Superconductivity

Publish: 2025-12-11 17:32:38 UTC