Motivated by liability and intellectual property concerns over open-weight models, we consider the following problem: given the weights of two models, can we test whether they were trained independently (i.e., from independent random initializations)? We consider two settings: *constrained* and *unconstrained*. In the constrained setting, we make assumptions about model architecture and training and propose statistical tests that yield exact p-values with respect to the null hypothesis that the models were trained from independent random initializations. We compute these p-values by simulating exchangeable copies of each model under our assumptions and comparing various similarity measures between the original two models and these copies. We report p-values on pairs drawn from 21 open-weight models (210 pairs in total) and correctly identify all pairs of non-independent models. In the unconstrained setting, we drop these assumptions and allow for adversarial evasion attacks that transform a model without changing its output. We therefore propose a new test that matches hidden activations between two models; it is robust to such transformations and to changes in model architecture, and it can also localize specific non-independent components of the models. Although this test no longer yields exact p-values, empirically we find that it separates non-independent models as reliably as one. Notably, we can use it to identify specific parts of one model that are derived from another (e.g., how Llama 3.1-8B was pruned to initialize Llama 3.2-3B, or the layers Mistral-7B shares with StripedHyena-7B), and it remains robust even when individual layers of either model are retrained from scratch.
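To make the constrained-setting procedure concrete, here is a minimal sketch of an exchangeability-based permutation test. It assumes the architecture is symmetric under permutation of hidden units, so permuting the rows of a weight matrix yields an exchangeable copy under the null; the cosine `similarity` measure, the row-permutation scheme, and the function names are illustrative assumptions, not the paper's exact test.

```python
import numpy as np

def similarity(w_a: np.ndarray, w_b: np.ndarray) -> float:
    """Cosine similarity between flattened weight matrices (illustrative choice)."""
    a, b = w_a.ravel(), w_b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def permutation_p_value(w_a: np.ndarray, w_b: np.ndarray,
                        n_copies: int = 999, seed: int = 0) -> float:
    """Exact p-value under the null that w_a and w_b come from independent
    random initializations, ASSUMING that permuting the rows (hidden units)
    of w_b yields an exchangeable copy under that null."""
    rng = np.random.default_rng(seed)
    observed = similarity(w_a, w_b)
    # Similarities between w_a and simulated exchangeable copies of w_b.
    null_sims = [
        similarity(w_a, w_b[rng.permutation(w_b.shape[0])])
        for _ in range(n_copies)
    ]
    # Rank of the observed similarity among the copies; the add-one
    # correction keeps the p-value exact and strictly positive.
    exceed = sum(s >= observed for s in null_sims)
    return (1 + exceed) / (1 + n_copies)
```

Under the null, the observed pair is exchangeable with the simulated copies, so the returned rank-based p-value is exactly valid; a small value indicates the two weight matrices are far more similar than chance alignment of independent initializations would allow.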
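For the unconstrained setting, the following sketch illustrates one way to match hidden activations between two models: collect activations at a layer of each model on a shared input batch, then find the best one-to-one correlation matching between units. The standardization, the Hungarian matching via `scipy.optimize.linear_sum_assignment`, and the mean-correlation score are our own illustrative assumptions, not the paper's exact statistic.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_activations(acts_a: np.ndarray, acts_b: np.ndarray) -> float:
    """Score how well hidden units of two models align by activation correlation.

    acts_a, acts_b: (n_inputs, n_units) activations collected at one layer
    of each model on the same inputs. Returns the mean absolute correlation
    of the best one-to-one unit matching; values near 1 suggest the layers
    are not independent.
    """
    # Standardize each unit's activations across the input batch.
    za = (acts_a - acts_a.mean(0)) / (acts_a.std(0) + 1e-8)
    zb = (acts_b - acts_b.mean(0)) / (acts_b.std(0) + 1e-8)
    # Cross-correlation matrix between units of the two models.
    corr = za.T @ zb / len(za)
    # Hungarian matching maximizing total absolute correlation, so the score
    # is invariant to how either model permutes its hidden units.
    row, col = linear_sum_assignment(-np.abs(corr))
    return float(np.abs(corr[row, col]).mean())
```

Because the score depends only on activations and a permutation-invariant matching, it tolerates output-preserving reparameterizations such as unit reordering, and applying it layer by layer is what lets the test localize which specific components of one model derive from another.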