SoK: Reconstruction Attacks on Synthetic Tabular Data (Insights from Winning the NIST CRC)


Authors: Steven Golob, Sikha Pentyala, Martine De Cock

Synthetic data is increasingly promoted as a privacy-preserving substitute for releasing sensitive tabular records, yet its central adversarial threat ("reconstruction", the recovery of an individual's hidden attribute values from a synthetic release and a handful of known quasi-identifiers) has been studied only in scattered, hard-to-compare settings. We present the first systematization of reconstruction (equivalently, attribute inference) attacks on de-identified and synthetic tabular data. We contribute a taxonomy that organizes attacks by the structure they exploit; the most systematic empirical evaluation to date, pitting fourteen attacks against nine synthetic data generation (SDG) methods across five benchmark datasets; and a set of new attacks that fill gaps in the taxonomy, one of which (CoBP-RA) is the strongest attack we measure. Crucially, we introduce a methodology for interpreting what attack success means: a memorization test that distinguishes reconstruction of the population distribution from memorization of training records, and a reduction that places reconstruction and membership inference on a single comparable scale. Our findings: the choice of SDG method governs risk far more than the choice of attack; differential privacy protects mainly at small budgets ($\varepsilon \lesssim 1$), above which protection plateaus, bounded by the synthesizer's capacity rather than its noise; de-identification methods are the most exposed; and most reconstruction reflects distributional structure rather than memorization, concentrating individual risk on atypical records. The attacks and infrastructure are externally validated by our first-place finish among all red teams in the 2025 National Institute of Standards and Technology (NIST) Collaborative Research Cycle.
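Illustrative note: the reconstruction setting defined in the abstract (known quasi-identifiers, one hidden attribute, access to the synthetic release) can be made concrete with a very simple baseline. The sketch below is not one of the paper's fourteen attacks and is not CoBP-RA; it is a hypothetical majority-vote matcher, and the function name `reconstruct_attribute`, the column names, and the toy records are all assumptions made for illustration.

```python
from collections import Counter

def reconstruct_attribute(synthetic_rows, known_qi, target):
    """Guess the hidden `target` value of a person with quasi-identifiers `known_qi`.

    Hypothetical baseline: majority vote over synthetic rows that agree with
    the person on every known quasi-identifier.
    """
    # Synthetic rows that match the individual on all known quasi-identifiers.
    matches = [
        row for row in synthetic_rows
        if all(row.get(key) == value for key, value in known_qi.items())
    ]
    # If nothing matches, fall back to the marginal distribution of the release.
    pool = matches or synthetic_rows
    counts = Counter(row[target] for row in pool)
    return counts.most_common(1)[0][0]

# Toy synthetic release (made-up values, for illustration only).
synthetic = [
    {"age": "30-39", "zip": "98101", "income": "high"},
    {"age": "30-39", "zip": "98101", "income": "high"},
    {"age": "30-39", "zip": "98101", "income": "low"},
    {"age": "60-69", "zip": "98402", "income": "low"},
]

guess = reconstruct_attribute(synthetic, {"age": "30-39", "zip": "98101"}, "income")
print(guess)
```

A guess produced this way can be correct purely because the value is common among people with those quasi-identifiers, which is the kind of distributional success that the abstract's memorization test is meant to separate from memorization of training records.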

Subjects: Cryptography and Security, Machine Learning

Publish: 2026-06-06 23:39:45 UTC

2606.08372
