GEFA: A General Feature Attribution Framework Using Proxy Gradient Estimation

Authors: Yi Cai, Thibaud Ardoin, Gerhard Wunder

Feature attribution explains machine decisions by quantifying each feature's contribution. While numerous approaches rely on exact gradient measurements, recent work has adopted gradient estimation to derive explanatory information under query-level access, a restrictive yet more practical accessibility assumption known as the black-box setting. Following this direction, this paper introduces GEFA (Gradient-estimation-based Explanation For All), a general feature attribution framework leveraging proxy gradient estimation. Unlike the previous attempt, which focused on explaining image classifiers, the proposed explainer derives feature attributions in a proxy space, making it applicable to arbitrary black-box models regardless of input type. In addition to its close relationship with Integrated Gradients, our approach, a path method built upon estimated gradients, surprisingly produces unbiased estimates of Shapley Values. Compared to traditional sampling-based Shapley Value estimators, GEFA avoids the potential information waste that arises from computing marginal contributions, thereby improving explanation quality, as demonstrated in quantitative evaluations across various settings.
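
The abstract's claim that a path method over gradients in a proxy space recovers Shapley Values can be checked on a toy case. The sketch below is not the authors' implementation. It assumes a binary keep/drop mask as the proxy space, a straight path that raises every keep probability from 0 to 1, and a score-function (likelihood-ratio) form of the gradient, all chosen here for illustration. The names `toy_model`, `exact_shapley`, and `expected_path_attribution` are hypothetical. Instead of sampling, it computes the estimator's expectation exactly by enumerating masks, so the comparison with brute-force Shapley values is deterministic.

```python
import itertools
import math

import numpy as np


def toy_model(z):
    """Hypothetical black box on a keep/drop mask z: one additive term, one interaction."""
    return 2.0 * z[0] + 3.0 * z[1] * z[2]


def exact_shapley(f, d):
    """Shapley values by enumerating all coalitions (feasible only for small d)."""
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for r in range(d):
            weight = math.factorial(r) * math.factorial(d - r - 1) / math.factorial(d)
            for subset in itertools.combinations(others, r):
                z = np.zeros(d)
                for j in subset:
                    z[j] = 1.0
                without_i = f(z)
                z[i] = 1.0
                phi[i] += weight * (f(z) - without_i)
    return phi


def expected_path_attribution(f, d, n_nodes=8):
    """Expected score-function gradient, integrated along the diagonal path in mask space."""
    # Gauss-Legendre nodes on [-1, 1], mapped to the open interval (0, 1).
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    alphas = 0.5 * (nodes + 1.0)
    weights = 0.5 * weights

    # Enumerate every mask once; only black-box outputs are used, never model gradients.
    masks = np.array(list(itertools.product([0.0, 1.0], repeat=d)))
    values = np.array([f(z) for z in masks])
    kept = masks.sum(axis=1)

    phi = np.zeros(d)
    for alpha, w in zip(alphas, weights):
        # Probability of each mask when every feature is kept with probability alpha.
        prob = alpha ** kept * (1.0 - alpha) ** (d - kept)
        # Bernoulli score function with respect to each feature's keep probability.
        score = (masks - alpha) / (alpha * (1.0 - alpha))
        # E[f(z) * score_i(z)] equals the partial derivative of E[f(z)] at this alpha.
        phi += w * ((prob * values) @ score)
    return phi


phi_exact = exact_shapley(toy_model, 3)
phi_path = expected_path_attribution(toy_model, 3)
print(phi_exact)
print(phi_path)
```

On this toy model both routines return approximately [2, 1.5, 1.5]. A sampled version would draw the path position and the mask at random instead of enumerating them; its variance is not addressed in this sketch.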

Subject: ICML.2025 - Poster

QyG0ilz5ju@OpenReview
