Supervised modeling of the just noticeable difference (JND) for visual signals has made significant progress. However, existing JND models often generalize poorly because they require large-scale training data and are restricted to certain image types. Moreover, these models focus primarily on the RGB modality alone, ignoring the potentially complementary contributions of other modalities. To address these challenges, we propose a new meta-learning approach to JND modeling, called MetaJND. We introduce two key visually sensitive modalities, saliency and depth, and leverage a self-attention mechanism to capture the interdependence of multi-modal features. Additionally, we incorporate meta-learning for modality alignment, which enables dynamic weight generation. Furthermore, we perform hierarchical fusion through multi-layer channel and spatial feature rectification. Experimental results on four benchmark datasets demonstrate the effectiveness of MetaJND. We also evaluate its performance in compression and watermarking applications, observing higher bit-rate savings and better watermark hiding capabilities.