In community question answering (cQA) platforms like Stack Overflow, related question retrieval is a fundamental task that retrieves questions related to a user query so that existing answers can be reused. Although many traditional approaches have been proposed in this research field, they are mostly static and neglect the interactive nature of the retrieval process. We argue that a conversational approach can better distinguish the fine-grained representations of questions and has great potential to improve the performance of question retrieval. In this paper, we propose a related question retrieval model through conversations, called TeCQR, to locate related questions in cQA. Specifically, we build conversations by utilizing tag-enhanced clarifying questions. In addition, we design a noise tolerance model that evaluates the semantic similarity between questions and tags, enabling the model to effectively handle noisy feedback. Moreover, we propose a tag-enhanced two-stage offline training scheme to fully exploit the mutual relationships among user queries, questions, and tags and learn their fine-grained representations. Based on the learned representations and contextual conversations, TeCQR incorporates conversational feedback by learning to ask tag-enhanced clarifying questions to retrieve related questions more effectively. Experimental results demonstrate that our model significantly outperforms state-of-the-art baselines.
We present a method for extracting monosemantic neurons, defined as latent dimensions that align with coherent and interpretable concepts, from user and item embeddings in recommender systems. Our approach employs a Sparse Autoencoder (SAE) to reveal semantic structure within pretrained representations. In contrast to work on language models, monosemanticity in recommendation must preserve the interactions between separate user and item embeddings. To achieve this, we introduce a prediction aware training objective that backpropagates through a frozen recommender and aligns the learned latent structure with the model’s user-item affinity predictions. The resulting neurons capture properties such as genre, popularity, and temporal trends, and support post hoc control operations including targeted filtering and content promotion without modifying the base model. Our method generalizes across different recommendation models and datasets, providing a practical tool for interpretable and controllable personalization.
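A minimal numpy sketch of the general recipe the abstract describes, not the paper's implementation: a ReLU sparse autoencoder over a user embedding, with a reconstruction loss, an L1 sparsity loss, and a prediction-aware term that preserves a frozen recommender's dot-product affinity. All dimensions, weights, and the dot-product scorer are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 16, 64          # embedding dim, SAE latent dim (overcomplete)

# Hypothetical pretrained user/item embeddings; the frozen recommender is
# assumed to score affinity with a plain dot product.
user, item = rng.normal(size=d), rng.normal(size=d)
W_enc = 0.1 * rng.normal(size=(k, d))
W_dec = 0.1 * rng.normal(size=(d, k))

def sae(x):
    """Sparse autoencoder: ReLU latent code + linear decoder."""
    z = np.maximum(W_enc @ x, 0.0)     # sparse (ideally monosemantic) code
    return z, W_dec @ z                # code, reconstruction

z, user_hat = sae(user)

# Standard SAE objective: reconstruction + L1 sparsity on the code.
recon_loss = np.sum((user - user_hat) ** 2)
sparsity_loss = np.sum(np.abs(z))

# Prediction-aware term: the frozen recommender's affinity score should be
# preserved when the reconstruction replaces the original embedding.
pred_loss = (user @ item - user_hat @ item) ** 2

loss = recon_loss + 1e-3 * sparsity_loss + pred_loss
```

Minimizing `pred_loss` alongside the usual SAE terms is what ties the learned latent structure to user-item affinity rather than to reconstruction alone.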
Geological CO2 storage (GCS) involves injecting captured CO2 into deep subsurface formations to support climate goals. Effective management of GCS relies on adaptive injection planning that dynamically controls injection rates and well pressures to balance storage safety and efficiency. Prior work, including numerical optimization methods and surrogate-optimization methods, falls short of real-world GCS requirements for smooth state transitions and goal-directed planning within a limited time horizon. To address these limitations, we propose a Brownian bridge-augmented framework for surrogate simulation and injection planning in GCS, built on two insights: (i) the Brownian bridge serves as a smooth state regularizer for a better surrogate simulator; (ii) the Brownian bridge provides goal-time-conditioned planning guidance for better injection planning. Our method has three stages: (i) learning deep Brownian bridge representations with contrastive and reconstructive losses from historical reservoir and utility trajectories; (ii) incorporating Brownian bridge-based next-state interpolation to regularize the simulator; and (iii) guiding injection planning with Brownian utility-conditioned trajectories to generate high-quality injection plans. Experimental results across multiple datasets collected from diverse GCS settings demonstrate that our framework consistently improves simulation fidelity and planning effectiveness while maintaining low computational overhead.
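The Brownian bridge itself is a standard construction; a small numpy sketch (illustrative only, not the paper's simulator) shows why it suits goal-time-conditioned planning: the path is pinned to the current state at t=0 and the goal state at t=T, with variance shrinking to zero at both endpoints.

```python
import numpy as np

def brownian_bridge(x0, xT, T, n_steps, sigma=1.0, rng=None):
    """Sample a path pinned to x0 at t=0 and xT at t=T.

    The mean interpolates linearly between the endpoints, and the variance
    sigma^2 * t * (T - t) / T vanishes at t=0 and t=T, which is what makes
    the bridge usable as a smooth, goal-conditioned prior over
    intermediate states."""
    rng = rng or np.random.default_rng()
    ts = np.linspace(0.0, T, n_steps)
    mean = np.outer(1 - ts / T, x0) + np.outer(ts / T, xT)
    std = sigma * np.sqrt(ts * (T - ts) / T)
    return ts, mean + std[:, None] * rng.normal(size=mean.shape)

# Bridge a 1-D state from 0.0 (now) to a goal utility of 5.0 at T = 1.
ts, path = brownian_bridge(np.array([0.0]), np.array([5.0]), T=1.0, n_steps=11)
```

Interpolating a simulator's next-state predictions toward such a bridge penalizes abrupt jumps while still anchoring trajectories to the planning goal.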
Session-based recommendation (SBR) aims to provide users with satisfactory suggestions by modeling preferences based on short-term, anonymous user-item interaction sequences. Traditional single-interest learning methods struggle to capture the diverse nature of preferences. Recent advances address this bottleneck by learning multiple interest embeddings for each session. However, because the interest quantity (i.e., the number of interests) is pre-defined, these approaches cannot adapt to the distinctive preference patterns of different users. Moreover, these methods rely solely on the current session and ignore useful information from related ones; the short-term nature of sessions magnifies this insufficient-representation issue. To address these limitations, we propose a Neural Process-based Multi-interest learning framework for Session-based Recommendation, namely NP-MiSR. Specifically, our method enables adaptive multi-interest representation learning through two complementary mechanisms: 1) Neural Process-based intra-session interest modeling: we employ Neural Processes to model the distribution of interests within a session, so that fixed interest configurations are no longer needed. 2) Cross-session context fusion: we extract interest distributions of similar sessions as contextual priors to refine the current session's interest representation. Extensive experiments on three datasets demonstrate that our method consistently outperforms state-of-the-art SBR approaches with an average improvement of 38.8%. Moreover, a few-shot learning task reveals that NP-MiSR achieves a remarkably favorable efficiency vs. performance trade-off, attaining 95% of the recommendation performance with only 10% of the training data.
Prevailing quantization techniques in Learned Image Compression (LIC) typically employ a static, uniform bit-width across all layers, failing to adapt to the highly diverse data distributions and sensitivity characteristics inherent in LIC models. This leads to a suboptimal trade-off between performance and efficiency. In this paper, we introduce DynaQuant, a novel framework for dynamic mixed-precision quantization that operates on two complementary levels. First, we propose content-aware quantization, where learnable scaling and offset parameters dynamically adapt to the statistical variations of latent features. This fine-grained adaptation is trained end-to-end using a novel Distance-aware Gradient Modulator (DGM), which provides a more informative learning signal than the standard Straight-Through Estimator. Second, we introduce a data-driven, dynamic bit-width selector that learns to assign an optimal bit precision to each layer, dynamically reconfiguring the network's precision profile based on the input data. Our fully dynamic approach offers substantial flexibility in balancing rate-distortion (R-D) performance and computational cost. Experiments demonstrate that DynaQuant achieves R-D performance comparable to full-precision models while significantly reducing computational and storage requirements, thereby enabling the practical deployment of advanced LIC on diverse hardware platforms.
Explanation fidelity, which measures how accurately an explanation reflects a model’s true reasoning, remains critically underexplored in recommender systems. We introduce SPINRec (Stochastic Path Integration for Neural Recommender Explanations), a model-agnostic approach that adapts path-integration techniques to the sparse and implicit nature of recommendation data. To overcome the limitations of prior methods, SPINRec employs stochastic baseline sampling: instead of integrating from a fixed or unrealistic baseline, it samples multiple plausible user profiles from the empirical data distribution and selects the most faithful attribution path. This design captures the influence of both observed and unobserved interactions, yielding more stable and personalized explanations. We conduct the most comprehensive fidelity evaluation to date across three models (MF, VAE, NCF), three datasets (ML1M, Yahoo! Music, Pinterest), and a suite of counterfactual metrics, including AUC-based perturbation curves and fixed-length diagnostics. SPINRec consistently outperforms all baselines, establishing a new benchmark for faithful explainability in recommendation.
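The path-integration idea can be sketched with a toy differentiable scorer (everything below, including the quadratic scorer, the straight-line path, and the number of steps, is an illustrative assumption rather than SPINRec's actual model): integrate gradients from a sampled baseline profile to the input, then keep the path whose attributions best satisfy the completeness check.

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=5)
f = lambda x: (w @ x) ** 2                 # stand-in differentiable scorer
grad_f = lambda x: 2 * (w @ x) * w         # its analytic gradient

def integrated_gradients(x, baseline, n_steps=256):
    """Midpoint Riemann-sum approximation of path-integrated gradients
    from a baseline profile to the input x. By the completeness property,
    the attributions should sum to f(x) - f(baseline)."""
    alphas = (np.arange(n_steps) + 0.5) / n_steps
    total = np.zeros_like(x)
    for a in alphas:
        total += grad_f(baseline + a * (x - baseline))
    return (x - baseline) * total / n_steps

# Stochastic baselines: sample several plausible profiles instead of one
# fixed zero vector, then keep the most faithful attribution path.
x = rng.normal(size=5)
baselines = [rng.normal(size=5) for _ in range(4)]
attrs = [integrated_gradients(x, b) for b in baselines]
errors = [abs(a.sum() - (f(x) - f(b))) for a, b in zip(attrs, baselines)]
best = attrs[int(np.argmin(errors))]
```

The completeness residual used here is only one possible faithfulness criterion; in a recommender, the baselines would be user profiles drawn from the empirical interaction distribution.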
Traditional dialogue retrieval methods aim to select the most appropriate utterance or image from recent dialogue history. However, they often fail to meet users’ actual need to revisit semantically coherent content scattered across long-form conversations. To fill this gap, we define the Fine-grained Fragment Retrieval (FFR) task, requiring models to locate query-relevant fragments, comprising both utterances and images, from multimodal long-form dialogues. As a foundation for FFR, we construct MLDR, the longest-turn multimodal dialogue retrieval dataset to date, averaging 25.45 turns per dialogue, with each dialogue naturally spanning three distinct topics. To evaluate generalization in real-world scenarios, we curate and annotate a WeChat-based test set comprising real-world multimodal dialogues with an average of 75.38 turns. Building on these resources, we evaluate existing generation-based Vision-Language Models (VLMs) on FFR and observe that they often retrieve incoherent utterance-image fragments. While optimized for generating responses from visual-textual inputs, these models lack explicit supervision to ensure semantic coherence within retrieved fragments. To address this, we propose F2RVLM, a generative retrieval model trained in a two-stage paradigm: (1) supervised fine-tuning to inject fragment-level retrieval knowledge, and (2) GRPO-based reinforcement learning with multi-objective rewards to encourage semantic precision, relevance, and contextual coherence in the outputs. In addition, to account for difficulty variations arising from differences in intra-fragment element distribution, ranging from locally dense to sparsely scattered, we introduce difficulty-aware curriculum sampling, which ranks training instances by predicted difficulty and gradually incorporates harder examples. This strategy enhances the model’s reasoning ability in long-form, multi-turn dialogue contexts.
Experiments on both the in-domain and real-world test sets demonstrate that F2RVLM substantially outperforms popular VLMs, achieving superior retrieval performance.
Distributed multi-stage image compression—where visual content traverses multiple processing nodes under varying quality requirements—poses distinct challenges. Progressive methods enable bitstream truncation but underutilize available compute resources; successive compression repeats costly pixel-domain operations and suffers cumulative quality loss and inefficiency; fixed-parameter models lack post-encoding flexibility. In this work, we develop the Hierarchical Cascade Framework (HCF), which achieves high rate-distortion performance and better computational efficiency through direct latent-space transformations across network nodes in distributed multi-stage image compression systems. Under HCF, we introduce policy-driven quantization control to optimize rate-distortion trade-offs and establish the edge quantization principle through differential entropy analysis. The configuration based on this principle demonstrates up to 0.6 dB PSNR gains over other configurations. When comprehensively evaluated on the Kodak, CLIC, and CLIC2020-mobile datasets, HCF outperforms successive-compression methods by up to 5.56% BD-Rate in PSNR on CLIC, while saving up to 97.8% FLOPs, 96.5% GPU memory, and 90.0% execution time. It also outperforms state-of-the-art progressive compression methods by up to 12.64% BD-Rate on Kodak and enables retraining-free cross-quality adaptation with 7.13-10.87% BD-Rate reductions on CLIC2020-mobile.
Multimodal keyphrase generation (MKP) aims to extract a concise set of keyphrases that capture the essential meaning of paired image–text inputs, enabling structured understanding, indexing, and retrieval of multimedia data across the web and social platforms. Success in this task demands effectively bridging the semantic gap between heterogeneous modalities. While multimodal large language models (MLLMs) achieve superior cross-modal understanding by leveraging massive pretraining on image-text corpora, we observe that they often struggle with modality bias and fine-grained intra-modal feature extraction. These weaknesses lead to a lack of robustness in real-world scenarios, where multimedia data is often noisy and modalities may be incomplete or misaligned. To address this problem, we propose AimKP, a novel framework that explicitly reinforces intra-modal semantic learning in MLLMs while preserving cross-modal alignment. AimKP incorporates two core innovations: (i) Progressive Modality Masking, which forces fine-grained feature extraction from corrupted inputs by progressively masking modality information during training; (ii) Gradient-based Filtering, which identifies and discards noisy samples, preventing them from corrupting the model’s core cross-modal learning. Extensive experiments validate AimKP’s effectiveness in multimodal keyphrase generation and its robustness across different scenarios.
Multimodal video recommendation systems face fundamental challenges in determining optimal fusion strategies across diverse content types and user preferences. Existing methods suffer from two critical limitations: (1) their fusion strategies are guided by context-agnostic priors that ignore the semantic structure of content, assuming that the same simple distribution (typically a standard multivariate Gaussian prior) governs optimal fusion for all video types, and (2) their optimization objectives, particularly the Evidence Lower Bound (ELBO), are misaligned with the final recommendation goal, optimizing for feature reconstruction rather than ranking performance. To address these fundamental issues, this work proposes VBF++, a novel framework that introduces context-aware structured priors and recommendation-guided adversarial refinement. First, the method designs context-aware priors that learn cluster-specific distributions based on video semantic categories, replacing uninformative priors with structured, content-aware prior distributions. Second, it introduces a Recommendation-Guided Adversarial Refinement (RAR) paradigm that explicitly steers the learning process towards generating recommendation-optimal fusion strategies, resolving the objective misalignment inherent in variational learning. Further enhanced with domain-adaptive meta-learning, VBF++ achieves consistent improvements of 4.7-8.3 percent in Precision@10 over state-of-the-art methods in extensive experiments on three real-world datasets. Analysis reveals that learned fusion strategies exhibit semantically meaningful patterns, prioritizing visual features for action content, acoustic information for music videos, and textual descriptions for documentary material.
As graph-structured data grow increasingly large, evaluating their robustness under adversarial attacks becomes computationally expensive and difficult to scale. To address this challenge, we propose to compress graphs into compact representations that preserve both topological structure and robustness profile, enabling efficient and reliable evaluation. Specifically, we propose Cutter, a dual-agent reinforcement learning framework composed of a Vital Detection Agent (VDA) and a Redundancy Detection Agent (RDA), which collaboratively identify structurally vital and redundant nodes for guided compression. Cutter incorporates three key strategies to enhance learning efficiency and compression quality: trajectory-level reward shaping to transform sparse trajectory returns into dense, policy-equivalent learning signals; prototype-based shaping to guide decisions using behavioral patterns from both high- and low-return trajectories; and cross-agent imitation to enable safer and more transferable exploration. Experiments on multiple real-world graphs demonstrate that Cutter generates compressed graphs that retain essential static topological properties and exhibit robustness degradation trends highly consistent with the original graphs under various attack scenarios, thereby significantly improving evaluation efficiency without compromising assessment fidelity.
High-quality datasets are critical for training reliable machine learning models, yet data faults caused by insufficient annotation expertise or malicious poisoning attacks remain prevalent. Traditional classifier-based methods rely on manually curated subsets for fault detection, but their limited scale frequently leads to model overfitting. While methods based on multimodal large language models (MLLMs) offer promising detection capabilities, their few-shot learning limitations hinder generalization in domain-specific tasks. To address these challenges, we propose MLLM-Guided Iterative Sample Filtering (MISF), a novel framework that combines the strengths of MLLM-based initialization and iterative data refinement. Our framework initializes the detection model with MLLM-generated synthetic images and a curated clean subset, then iteratively refines it by progressively selecting high-certainty clean samples, improving both domain adaptation and detection accuracy. Extensive experiments on the RESISC45 and Oxford-IIIT Pets datasets demonstrate that MISF effectively identifies data faults, outperforming existing approaches. MISF provides a robust, scalable solution for improving dataset quality in specialized domains.
Federated recommendation (FR) facilitates collaborative training by aggregating local models from massive devices, enabling client-specific personalization while ensuring privacy. However, we empirically and theoretically demonstrate that server-side aggregation can undermine client-side personalization, leading to suboptimal performance, i.e., the aggregation bottleneck. This issue stems from the inherent heterogeneity across numerous clients in FR, which drives the global model to deviate from local optima. To this end, we propose FedEM, which elastically merges the global and local models to compensate for impaired personalization. Unlike existing personalized federated recommendation (pFR) methods, FedEM (1) investigates the aggregation bottleneck in FR through theoretical insights, rather than relying on heuristic analysis; (2) leverages off-the-shelf local models rather than designing additional mechanisms to boost personalization. Extensive experiments demonstrate that our method preserves client personalization during collaborative training, outperforming state-of-the-art baselines.
Existing core-set selection methods predominantly rely on heuristic scoring signals such as training dynamics or model uncertainty, lacking explicit modeling of data likelihood. This omission may hinder the constructed subset from capturing subtle yet critical distributional structures that underpin effective model training. In this work, we propose a novel, theoretically grounded approach that leverages diffusion models to estimate data likelihood via reconstruction deviation induced by partial reverse denoising. Specifically, we establish a formal connection between reconstruction error and data likelihood, grounded in the Evidence Lower Bound (ELBO) of Markovian diffusion processes, thereby enabling a principled, distribution-aware scoring criterion for data selection. Complementarily, we introduce an efficient information-theoretic method to identify the optimal reconstruction timestep, ensuring that the deviation provides a reliable signal indicative of underlying data likelihood. Extensive experiments on ImageNet demonstrate that reconstruction deviation offers an effective scoring criterion, consistently outperforming existing baselines across selection ratios, and closely matching full-data training using only 50% of the data. Further analysis shows that the likelihood-informed nature of our score reveals informative insights in data selection, shedding light on the interplay between data distributional characteristics and model learning preferences.
Feature coding has recently emerged as a key technique for efficient transmission of intermediate representations in distributed AI systems. Existing approaches largely follow a transform-based pipeline inherited from image and video coding, where the transform module is used to remove spatial structural redundancies in visual signals. However, our analysis indicates that such redundancies have already been largely removed during feature extraction, which reduces the necessity of the transform module. Building on this insight, we propose a new transform-free pipeline that directly encodes the extracted features via a vector quantization module and an entropy model. The proposed transform-free framework jointly learns the quantization codebook and entropy model, enabling end-to-end optimization tailored to the inherent feature characteristics. Furthermore, the proposed method inherently avoids the computational complexity of the transform module. Experiments on features from diverse architectures and tasks demonstrate that our method achieves superior rate-distortion performance compared to transform-based baselines, while significantly reducing the encoding and decoding complexity.
Traditional short video recommendations primarily enhance user retention by reinforcing existing user preferences, potentially leading to information cocoons. Conversely, proactive recommendations aim to diversify user interests by exposing users to content beyond their historical preferences. However, current proactive approaches face three limitations: (1) homogeneous receptivity assumption, neglecting individual differences in users' openness to new interests; (2) short-term item exposure without interest anchoring, focusing on item-level shifts rather than interest evolution; and (3) static feedback utilization, failing to adequately incorporate dynamic user feedback during the recommendation process. To address these challenges, we propose ProRec-Video, a proactive framework that guides hierarchical interest transitions through three innovations. First, User Receptivity Profiling assesses individual openness to new interests, ensuring personalized transition pacing. Second, Hierarchical Interest Transition Planning decomposes complex interest shifts into intermediate steps to generate smooth interest transition paths and semantically coherent video sequences, addressing the overemphasis on item exposure. Third, Dynamic Feedback Adaptation integrates agent-based simulation and Reflexion mechanisms to refine interest transition paths and video sequences based on real-time user feedback, enhancing adaptability and satisfaction. Extensive experiments on two datasets demonstrate that ProRec-Video achieves a significant improvement in proactive recommendation performance, with an interest transition success rate of 85% and a user satisfaction rate of 78.3%.
Graph Contrastive Learning (GCL) has emerged as a powerful paradigm for training Graph Neural Networks (GNNs) in the absence of task-specific labels. However, its scalability on large-scale graphs is hindered by the intensive message passing mechanism of GNN and the quadratic computational complexity of contrastive loss over positive and negative node pairs. To address these issues, we propose an efficient GCL framework that transforms the input graph into a compact network of interconnected node sets while preserving structural information across communities. We firstly introduce a kernelized graph community contrastive loss with linear complexity, enabling effective information transfer among node sets to capture hierarchical structural information of the graph. We then incorporate a knowledge distillation technique into the decoupled GNN architecture to accelerate inference while maintaining strong generalization performance. Extensive experiments on sixteen real-world datasets of varying scales demonstrate that our method outperforms state-of-the-art GCL baselines in both effectiveness and scalability.
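Kernelized contrastive losses with linear complexity typically replace the pairwise exp(q·k) similarities with an inner product of nonnegative feature maps, so the denominator over all negatives collapses into a single precomputed sum. A numpy sketch using the standard positive-random-features construction (illustrative only; the paper's community-level loss is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, m = 200, 8, 512                # pairs, embedding dim, feature dim
Q = 0.1 * rng.normal(size=(n, d))    # query embeddings (e.g. node sets)
K = 0.1 * rng.normal(size=(n, d))    # key embeddings
omega = rng.normal(size=(m, d))      # shared random projections

def phi(X):
    """Positive random features with E[phi(x) @ phi(y)] = exp(x @ y)."""
    return (np.exp(X @ omega.T - 0.5 * np.sum(X**2, axis=1, keepdims=True))
            / np.sqrt(m))

# Quadratic-cost denominator: sum_j exp(q_i @ k_j) for every i -> O(n^2 d).
exact = np.exp(Q @ K.T).sum(axis=1)

# Linear-cost approximation: precompute sum_j phi(k_j) once -> O(n m d).
key_sum = phi(K).sum(axis=0)
approx = phi(Q) @ key_sum
```

Because `key_sum` is shared by all queries, the cost of the contrastive denominator grows linearly rather than quadratically in the number of node sets.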
Learning representations on graphs is foundational for many downstream tasks, and its synergy with diffusion models has emerged as a promising direction. However, diffusion-based methods for heterogeneous graphs remain underexplored, confronting two principal challenges: (1) The presence of noise and structural heterogeneity in graphs makes it challenging to accurately capture semantic transitions among diverse relation types. (2) The isotropic Gaussian noise used in forward diffusion fails to reflect graphs' inherent semantics and structural anisotropy. To address these, we propose ARDiff, a novel framework that integrates residual diffusion with anisotropic noise for heterogeneous graph learning. Specifically, we propose a semantic residual diffusion mechanism that progressively refines node embeddings by orchestrating transitions from low-semantic (high-noise) to high-semantic (low-noise) relational contexts, thus enabling step-wise distillation of task-relevant information. In addition, to address the limitations of conventional diffusion, we introduce an anisotropic diffusion strategy: in the forward process, noise injection is oriented by structural and semantic priors; in the denoising step, a conditional diffusion mechanism is guided by a random walk encoding, enhancing both topological consistency and semantic alignment. Extensive evaluation on heterogeneous graph datasets demonstrates that ARDiff significantly surpasses current leading methods in link prediction and node classification, setting a new paradigm and benchmark in heterogeneous graph representation learning.
Topological Data Analysis (TDA) provides artificial intelligence (AI) systems with mathematically rigorous geometric descriptors through Persistent Homology (PH), capturing essential shape characteristics in high-dimensional data. Yet, PH’s combinatorial complexity and sensitivity to outliers hinder its scalability and reliability, especially for Intrinsic PH (IPH), which relies on accurate geodesic distances. While state-of-the-art landmark-based subsampling methods such as PH Landmarks ameliorate computational costs and improve outlier robustness by selecting representative points based on local PH scores, they remain computationally intensive and, at low sampling rates, struggle to reconstruct the global topology. In this work, we introduce TOPOGRAPH, a simple yet powerful framework that coarsens the input data into a compact graph while preserving its intrinsic topology. The resulting coarsened graph supports efficient IPH computations using Fermat distances. Experiments on both synthetic and real-world datasets show that TOPOGRAPH outperforms state-of-the-art sampling-based methods by achieving an order-of-magnitude speedup and substantially improved topological fidelity in persistence diagrams, demonstrating its suitability for robust and scalable topological data analysis.
Learning path recommendation seeks to provide students with a structured sequence of learning items (e.g., knowledge concepts or exercises) to optimize their learning efficiency. Despite significant efforts in this area, most existing methods primarily rely on prerequisite relations, which present two major limitations: (1) Prerequisite relations between knowledge concepts are difficult to obtain due to the cost of expert annotation, hindering the application of current learning path recommendation methods. (2) Relying on a single sequentially dependent knowledge structure based on prerequisite relations implies that a confusing knowledge concept can disrupt subsequent learning processes, which is referred to as blocked learning. To address these two challenges, we propose a novel approach, GraphRAG-Induced Dual Knowledge Structure Graphs for Personalized Learning Path Recommendation (KnowLP), which enhances learning path recommendations by incorporating both prerequisite and similarity relations between knowledge concepts. Specifically, we introduce a knowledge structure graph generation module, EDU-GraphRAG, that constructs knowledge structure graphs for different educational datasets, significantly improving the applicability of learning path recommendation methods. We then propose a Discrimination Learning-driven Reinforcement Learning (DLRL) module that utilizes similarity relations as fallback relations when prerequisite relations become ineffective, thereby alleviating the blocked-learning problem. Finally, we conduct extensive experiments on three benchmark datasets, demonstrating that our method not only achieves state-of-the-art performance but also generates more effective and longer learning paths.
Neural video compression (NVC) has demonstrated superior compression efficiency, yet effective rate control remains a significant challenge due to complex temporal dependencies. Existing rate control schemes typically leverage frame content to capture distortion interactions, overlooking inter-frame rate dependencies arising from shifts in per-frame coding parameters. This often leads to suboptimal bitrate allocation and cascading parameter decisions. To address this, we propose a reinforcement-learning (RL)-based rate control framework that formulates the task as a frame-by-frame sequential decision process. At each frame, an RL agent observes a spatiotemporal state and selects coding parameters to optimize a long-term reward that reflects rate-distortion (R-D) performance and bitrate adherence. Unlike prior methods, our approach jointly determines bitrate allocation and coding configuration in a single step, independent of group-of-pictures (GOP) structure. Extensive experiments across diverse NVC architectures show that our method reduces the average relative bitrate error to 1.20 percent and achieves up to 13.45 percent bitrate savings at typical GOP sizes, outperforming existing approaches. In addition, our framework demonstrates improved robustness to content variation and bandwidth fluctuations with lower encoding/decoding overhead, making it highly suitable for practical deployment.
Sequential recommendation (SR) aims to predict users' next action based on their historical behavior, and is widely adopted by a number of platforms. The performance of SR models relies on rich interaction data. However, in real-world scenarios, many users only have a few historical interactions, leading to the problem of data sparsity. Data sparsity not only leads to model overfitting on sparse sequences, but also hinders the model’s ability to capture the underlying hierarchy of user intents. This results in misinterpreting the user's true intents and recommending irrelevant items. Existing data augmentation methods attempt to mitigate overfitting by generating relevant and varied data. However, they overlook the problem of reconstructing the user's intent hierarchy, which is lost in sparse data. Consequently, the augmented data often fails to align with the user's true intents, potentially leading to misguided recommendations. To address this, we propose the Adaptive Diffusion Augmentation for Recommendation (ADARec) framework. Critically, instead of using a diffusion model as a black-box generator, we use its entire step-wise denoising trajectory to reconstruct a user's intent hierarchy from a single sparse sequence. To ensure both efficiency and effectiveness, our framework adaptively determines the required augmentation depth for each sequence and employs a specialized mixture-of-experts architecture to decouple coarse- and fine-grained intents. Experiments show ADARec outperforms state-of-the-art methods on standard benchmarks and on sparse sequences, demonstrating its ability to reconstruct hierarchical intent representations from sparse data.
The number of n-gram features grows exponentially in n, making it computationally demanding to compute the most frequent n-grams even for n as small as 3. Motivated by our production machine learning system built on n-gram features, we ask: is it possible to accurately, deterministically, and quickly recover the top-k most frequent n-grams? We devise a multi-pass algorithm called Intergrams that constructs candidate n-grams from the preceding (n-1)-grams. By designing this algorithm with hardware in mind, our approach yields more than an order of magnitude speedup (up to 33x!) over the next fastest known algorithm, even when similar optimizations are applied to that algorithm. Using the empirical power-law distribution over n-grams, we also provide theory to inform the efficacy of our multi-pass approach.
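The candidate-construction idea can be sketched in a few lines of Python. This is a simplified, heuristic sketch rather than the paper's algorithm (which is exact and hardware-tuned): it keeps only an `expand`-times-k slack of survivors per pass, so in principle it can miss a frequent n-gram whose prefix was pruned.

```python
from collections import Counter

def top_ngrams(tokens, n, k, expand=4):
    """Multi-pass top-k n-grams: pass p counts only those p-grams whose
    (p-1)-gram prefix survived the previous pass, shrinking the candidate
    space at every step instead of enumerating all p-grams."""
    counts = Counter((t,) for t in tokens)          # pass 1: unigrams
    for p in range(2, n + 1):
        survivors = {g for g, _ in counts.most_common(k * expand)}
        counts = Counter()
        for i in range(len(tokens) - p + 1):
            gram = tuple(tokens[i:i + p])
            if gram[:-1] in survivors:              # prefix must survive
                counts[gram] += 1
    return counts.most_common(k)

text = "the cat sat on the mat and the cat sat on the hat".split()
res = top_ngrams(text, n=3, k=2)
```

The `expand` slack trades memory for recall; under a power-law n-gram distribution, a modest slack already makes missed candidates unlikely, which is the intuition the paper's theory makes precise.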
In online advertising under the cost-per-conversion (CPA) model, accurate conversion rate (CVR) prediction is crucial. A major challenge is delayed feedback, where conversions may occur long after user interactions, leading to incomplete recent data and biased model training. Existing solutions partially mitigate this issue but often rely on auxiliary models, making them computationally inefficient and less adaptive to user interest shifts. We propose IF-DFM, an Influence Function-empowered framework for Delayed Feedback Modeling, which estimates the impact of newly arrived and delayed conversions on model parameters, enabling efficient updates without full retraining. By reformulating the inverse Hessian-vector product as an optimization problem, IF-DFM achieves a favorable trade-off between scalability and effectiveness. Experiments on benchmark datasets show that IF-DFM outperforms prior methods in both accuracy and adaptability.
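The reformulation mentioned above is standard in influence-function work: solving H x = g is the minimizer of 0.5 x'Hx - x'g, which conjugate gradients can find using only Hessian-vector products, never forming or inverting H. A numpy sketch on a toy SPD "Hessian" (the CG routine is generic; the specific matrix and vector are illustrative):

```python
import numpy as np

def ihvp_cg(hvp, g, iters=50, tol=1e-14):
    """Approximate H^{-1} g by conjugate gradients on 0.5 x'Hx - x'g,
    touching H only through the Hessian-vector-product callback hvp."""
    x = np.zeros_like(g)
    r = g - hvp(x)           # residual g - Hx
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        Hp = hvp(p)
        alpha = rs / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Toy check: a small, well-conditioned SPD matrix standing in for H.
rng = np.random.default_rng(0)
A = rng.normal(size=(10, 10))
H = A @ A.T + 10 * np.eye(10)
g = rng.normal(size=10)
x = ihvp_cg(lambda v: H @ v, g)
```

In a real model, `hvp` would be computed by automatic differentiation, so each iteration costs roughly one extra backward pass rather than an O(d^3) inversion.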
Inductive knowledge graph completion (KGC) aims to predict missing links involving unseen entities, making it a particularly challenging task for knowledge representation learning. Traditional embedding-based methods often fall short in this setting due to their limited structural reasoning capabilities. Graph Neural Networks (GNNs) have recently emerged as a promising alternative by explicitly modeling the graph topology. However, their performance heavily relies on the quality of negative samples during training, which significantly influences the learned representations and generalization ability. To tackle this issue, we propose Adaptive Relation-Aware Negative Sampling (ARNS), a negative sampling approach specifically tailored for GNN-based inductive KGC. It integrates three key strategies: (1) high-quality negatives via Linear WD for discriminative learning, (2) relation-aware negatives that utilize relation graphs to preserve structural patterns, and (3) adaptive curriculum learning that dynamically adjusts sampling ratios based on performance feedback. Our key innovation lies in a performance-driven adaptation mechanism that monitors training dynamics and modulates negative sample difficulty, starting with easier samples for stability and progressively introducing challenging negatives. Experiments demonstrate that ARNS outperforms state-of-the-art methods with significant MRR improvements while maintaining training stability. The adaptive design is particularly beneficial in inductive scenarios, where models must infer structural patterns from limited observations.
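A performance-driven curriculum of this general shape can be sketched in a few lines (a hedged illustration, not ARNS itself: the pools, the step size, and the "raise difficulty only when the validation metric improves" rule are all assumptions standing in for the paper's mechanism):

```python
import random

def sample_negatives(easy_pool, hard_pool, n, hard_ratio, rng):
    """Draw n negatives, mixing easy and hard pools by hard_ratio."""
    n_hard = int(round(n * hard_ratio))
    return rng.sample(hard_pool, n_hard) + rng.sample(easy_pool, n - n_hard)

def update_ratio(ratio, metric_history, step=0.1, hi=0.9):
    """Curriculum controller: increase the hard-negative share only when
    the validation metric (e.g. MRR) is not degrading, so early training
    stays stable and difficulty ramps up with demonstrated progress."""
    if len(metric_history) >= 2 and metric_history[-1] >= metric_history[-2]:
        ratio = min(hi, ratio + step)
    return ratio

rng = random.Random(0)
easy = list(range(100))          # e.g. uniformly corrupted triples
hard = list(range(100, 120))     # e.g. relation-aware hard negatives

history, ratio = [], 0.1
for mrr in [0.20, 0.24, 0.23, 0.27]:   # simulated validation MRR per epoch
    history.append(mrr)
    ratio = update_ratio(ratio, history)
negs = sample_negatives(easy, hard, 10, ratio, rng)
```

With the simulated history above, the ratio rises on the two improving epochs and holds on the degrading one, ending at 0.3, so 3 of the 10 sampled negatives come from the hard pool.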