Jury-and-Judge Chain-of-Thought for Uncovering Toxic Data in 3D Visual Grounding

Authors: Kaixiang Huang, Qifeng Zhang, Jin Wang, Jingru Yang, Yang Zhou, Huan Yu, Guodong Lu, Shengfeng He

3D Visual Grounding (3DVG) faces persistent challenges due to coarse scene-level observations and logically inconsistent annotations, which introduce ambiguities that compromise data quality and hinder effective model supervision. To address these challenges, we introduce Refer-Judge, a novel framework that harnesses the reasoning capabilities of Multimodal Large Language Models (MLLMs) to identify and mitigate toxic data. At the core of Refer-Judge is a Jury-and-Judge Chain-of-Thought paradigm, inspired by the deliberative process of the judicial system. This framework targets the root causes of annotation noise: jurors collaboratively assess 3DVG samples from diverse perspectives, providing structured, multi-faceted evaluations. Judges then consolidate these insights using a Corroborative Refinement strategy, which adaptively reorganizes information to correct ambiguities arising from biased or incomplete observations. Through this two-stage deliberation, Refer-Judge significantly enhances the reliability of data judgments. Extensive experiments demonstrate that our framework not only achieves human-level discrimination at the scene level but also improves the performance of baseline algorithms via data purification. Code is available at https://github.com/Hermione-HKX/Refer_Judge.
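The two-stage deliberation described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the linked repository for that): the jurors here are simple hand-written rules rather than MLLM calls, the judge is a plain confidence-weighted vote rather than the Corroborative Refinement strategy, and every name below (`Sample`, `JurorOpinion`, `two_stage_verdict`, the fields and thresholds) is a hypothetical chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    """Hypothetical stand-in for a 3DVG sample (all fields are assumptions)."""
    description: str      # the referring expression
    target_visible: bool  # whether the target is observable in the scene view
    num_candidates: int   # how many objects match the description


@dataclass
class JurorOpinion:
    juror: str
    is_toxic: bool
    confidence: float  # in [0, 1]


Juror = Callable[[Sample], JurorOpinion]


# Stage 1: each juror inspects the sample from one perspective.
# In the paper these are MLLM-based assessments; here they are toy rules.
def visibility_juror(sample: Sample) -> JurorOpinion:
    return JurorOpinion("visibility", not sample.target_visible, 0.9)


def ambiguity_juror(sample: Sample) -> JurorOpinion:
    return JurorOpinion("ambiguity", sample.num_candidates != 1, 0.8)


def description_juror(sample: Sample) -> JurorOpinion:
    too_short = len(sample.description.split()) < 3
    return JurorOpinion("description", too_short, 0.6)


# Stage 2: a judge consolidates the jurors' opinions.
# Assumption: a confidence-weighted vote stands in for the paper's
# Corroborative Refinement strategy, which is not reproduced here.
def judge(opinions: List[JurorOpinion], threshold: float = 0.5) -> bool:
    total = sum(o.confidence for o in opinions)
    if total == 0:
        return False
    toxic_weight = sum(o.confidence for o in opinions if o.is_toxic)
    return toxic_weight / total > threshold


def two_stage_verdict(sample: Sample, jurors: List[Juror]) -> bool:
    """Return True if the sample is flagged as toxic."""
    opinions = [juror(sample) for juror in jurors]
    return judge(opinions)


JURORS: List[Juror] = [visibility_juror, ambiguity_juror, description_juror]

clean = Sample("the chair next to the window", target_visible=True, num_candidates=1)
noisy = Sample("the chair next to the window", target_visible=False, num_candidates=3)

print(two_stage_verdict(clean, JURORS))
print(two_stage_verdict(noisy, JURORS))
```

In this sketch a sample is flagged only when the jurors who consider it toxic carry more than half of the total confidence, so a single dissenting perspective does not discard an otherwise sound annotation.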

Subject: NeurIPS.2025 - Poster

gcAGeE8Cch@OpenReview
