Total: 1
It has been observed that deep neural networks (DNNs) often use both genuine as well as spurious features.In this work, we propose "Amending Inherent Interpretability via Self-Supervised Masking" (AIM), a simple yet surprisingly effective method that promotes the network's utilization of genuine features over spurious alternatives without requiring additional annotations.In particular, AIM uses features at multiple encoding stages to guide a self-supervised, sample-specific feature-masking process. As a result, AIM allows training well-performing and inherently interpretable models that faithfully summarize the decision process.When tested on challenging datasets designed to assess reliance on spurious features and out-of-domain generalization, AIM networks demonstrate significant dual benefits: Evaluations show that AIM improves interpretability, as measured by the Energy Pointing Game (EPG) score, by ~6-37%, while simultaneously enhancing accuracy by ~10-40%. These impressive performance gains are further validated on the standard in-domain CUB-200 dataset for fine-grained classification. The results provide compelling evidence supporting our hypothesis that AIM finds genuine and meaningful features that directly contribute to its improved human interpretability.