AAAI.2026 - Student Abstract and Poster Program

| Total: 132

#1 LaFINet: Laplacian-Based Frequency Injection Network for Camouflage Object Detection (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Aravinthakshan A S, Aditya Prashant Naidu, Aadiv Rath

Camouflaged object detection is critical for military, defense, and security operations, where targets evade conventional surveillance by mimicking the background or exhibiting low-contrast differences. It also supports non-invasive monitoring of elusive wildlife and endangered species, improving population estimates, habitat management, and biodiversity assessments by recovering objects that are visually indistinguishable from their surroundings. Existing solutions are computationally heavy, with large parameter counts and high computational demands, which hinder deployment in real-world applications. Lightweight models have been explored, but they often compromise fine boundary fidelity. This paper introduces a lightweight Laplacian pyramid–based feature extractor that progressively aggregates multiscale Laplacian features with frequency information. The proposed architecture emphasizes object edge boundaries, enabling precise localization under subtle target–background differences while maintaining real-time efficiency. The design achieves performance comparable to state-of-the-art (SOTA) convolution-based methods on the CHAMELEON and NC4K datasets.
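The multiscale edge emphasis described above builds on the standard Laplacian pyramid decomposition, where each level retains the high-frequency detail lost by downsampling. A minimal NumPy sketch (illustrative only; simple average-pool downsampling and nearest-neighbour upsampling stand in for the paper's actual extractor):

```python
import numpy as np

def downsample(img):
    # 2x2 average pooling as a simple stand-in for Gaussian-pyramid reduction
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    x = img[:h, :w]
    return (x[0::2, 0::2] + x[1::2, 0::2] + x[0::2, 1::2] + x[1::2, 1::2]) / 4.0

def upsample(img, shape):
    # Nearest-neighbour upsampling back to the target resolution
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)[:shape[0], :shape[1]]

def laplacian_pyramid(img, levels=3):
    # Each Laplacian level holds the high-frequency (edge) detail at that scale
    pyramid, cur = [], img.astype(np.float64)
    for _ in range(levels):
        nxt = downsample(cur)
        pyramid.append(cur - upsample(nxt, cur.shape))
        cur = nxt
    pyramid.append(cur)  # low-frequency residual
    return pyramid

img = np.random.rand(64, 64)
pyr = laplacian_pyramid(img, levels=3)
print([p.shape for p in pyr])  # [(64, 64), (32, 32), (16, 16), (8, 8)]
```

Because each level stores exactly what its coarser neighbour discards, summing the levels back up reconstructs the input exactly, which is why such features preserve fine boundary fidelity cheaply.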

Subject: AAAI.2026 - Student Abstract and Poster Program


#2 Object-Centric Data Synthesis for Category-level Object Detection (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Vikhyat Agarwal, Jiayi Cora Guo, Declan Hoban, Sissi Zhang, Nicholas Moran, Peter Cho, Srilakshmi Pattabiraman, Shantanu Joshi

Deep learning approaches to object detection have achieved reliable detection of specific object classes in images. However, extending a model’s detection capability to new object classes requires large amounts of annotated training data, which is costly and time-consuming to acquire, especially for long-tailed classes with insufficient representation in existing datasets. We compare four distinct methods of generating synthetic data to fine-tune object detection models on novel object categories, particularly when limited data is available in an object-centric format (multi-view images/3D models). Our approaches are based on simple image processing techniques, 3D rendering, and image generation models, each varying in complexity and realism. We assess how our methods, which use object-centric data to synthesize realistic, cluttered images with varying contextual coherence, enable models to achieve category-level generalization in real-world data. We demonstrate significant performance boosts within this data-constrained experimental setting.

Subject: AAAI.2026 - Student Abstract and Poster Program


#3 AniTales: End-to-End Multimodal Story Generation Through Natural Language Prompting (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Mrigendra Agrawal, Yunze Xiao

We present AniTales, a system designed to generate multimodal visual novels from natural language prompts. Our system integrates large language models for story generation, diffusion models for character art, and text-to-speech for voice acting. This paper describes the system's architecture and presents findings from a pilot user study. We evaluated the system with general users (n=10) and domain experts (n=5), focusing on usability, coherence, and visual consistency. General users reported high usability (SUS: 84/100) and strong character-dialogue consistency (4.2/5), along with an average score of 82/100 for their intention to continue using the platform. These initial results suggest AniTales is a promising approach for bridging the gap between text-based AI storytelling and end-to-end multimedia content creation.

Subject: AAAI.2026 - Student Abstract and Poster Program


#4 Adaptive AI for Personalized Intercultural Communication Education: A Conversational Agent Powered by Retrieval-Augmented Generation (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Mohamed Ahmed, Hyesun Choung

Traditional intercultural communication training often lacks safe spaces for open practice, leading to self-censorship and limited skill development. The ICC Tutor, an AI-powered conversational system, addresses this by offering a private, nonjudgmental environment for reflection and dialog. Using retrieval-augmented generation (RAG), the system grounds its prompts and feedback in course materials. We conducted a mixed-methods study (N = 25) with Beginner/Intermediate and expert learners. Preliminary findings suggest that the tutor helped reduce feelings of nervousness. While many beginners reported increased confidence in intercultural communication, expert learners’ confidence temporarily decreased, suggesting the AI’s role in fostering deeper self-reflection rather than just boosting perceived competence. These findings underscore the potential of AI tutors in supporting communication education and highlight the need for experience-adaptive designs to support nuanced learning trajectories.

Subject: AAAI.2026 - Student Abstract and Poster Program


#5 Diffusion for Combating the Hallucination in Large Language Models (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Hyojun Ahn, Joongheon Kim

Large language models (LLMs) often generate hallucinations—fluent yet factually incorrect responses—that undermine reliability in knowledge-intensive tasks. Existing approaches for hallucination mitigation typically rely on external retrieval modules or probability heuristics, which either require additional resources or lack interpretability. In this work, we propose a diffusion-based hallucination detection framework (DHDF) that leverages U-Net denoising to reconstruct consensus answers from multiple LLM outputs. If the diffusion process exhibits spurious convergence away from factual ground truth, it provides a clear signal of hallucination. To quantify factual correctness, we incorporate TruthfulQA scores as a fact-grounded evaluation metric, distinguishing well-aligned models (high scores) from hallucination-prone models (low scores). Experimental results demonstrate that convergence dynamics under diffusion, combined with fact-grounded QA evaluation, offer an effective and interpretable pathway for hallucination detection without relying on external knowledge bases.

Subject: AAAI.2026 - Student Abstract and Poster Program


#6 Robust Adaptive Multi-Step Predictive Shielding (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Tanmay Ambadkar, Darshan Chudiwal, Greg Anderson, Abhinav Verma

Ensuring safety in deep reinforcement learning is challenging, as formal methods that provide strong guarantees often fail to scale to complex, high-dimensional systems. We introduce RAMPS, a scalable shielding framework that pairs a general-purpose, learned linear dynamics model with a robust, multi-step Control Barrier Function (CBF) for real-time safety interventions. Experiments show RAMPS significantly reduces safety violations in high-dimensional environments compared to state-of-the-art methods, without sacrificing task performance.

Subject: AAAI.2026 - Student Abstract and Poster Program


#7 Dynamics-Aware Planning Representation for Zero-Shot Reinforcement Learning (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Jungho An, Taeyoung Kim, Haeun Kim, Dongsoo Har

Offline Zero-Shot Reinforcement Learning requires an agent to solve unseen tasks using only a fixed offline dataset without explicit rewards. A central challenge is learning representations that capture both high-level long-term planning and low-level physical dynamics. We propose a novel framework, Dynamics-Aware Planning Representation (DAPR), which disentangles these two aspects via complementary contrastive objectives. Specifically, DAPR learns goal-oriented planning directions and local dynamics-consistent directions in the latent space. By jointly enforcing these constraints, DAPR yields representations that balance “where to go” with “how to move.” Experiments on standard locomotion benchmarks (Walker, Cheetah, Quadruped) demonstrate that DAPR consistently improves performance and generalization over strong baselines, achieving substantial gains on precision-demanding tasks.

Subject: AAAI.2026 - Student Abstract and Poster Program


#8 Weight Entropy-Maximised Evidential Metamodel for Uncertainty Quantification (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Gouranga Bala, Abhimanyu Chauhan, Amit Sethi

Reliable uncertainty quantification (UQ) is crucial for deploying deep learning models in safety-critical domains. Existing UQ methods often either rely on multi-pass inference, which increases computational cost, or restrict expressiveness by using only final-layer embeddings. In this work, we propose a lightweight evidential meta-model that leverages multi-layer feature fusion from a pretrained backbone, capturing both low-level features and high-level semantics to better estimate uncertainty. To further enhance epistemic fidelity, we integrate maximum weight-entropy (Max-WEnt) regularization, which encourages hypothesis diversity without altering the base network or adding test-time overhead. Experiments across two benchmark settings, medical (BACH, HAM10000, BreakHIS, DIV2K) and natural (ImageNet, SVHN, Fashion-MNIST, ImageNet-C) datasets, demonstrate consistent improvements in AUROC of out-of-distribution detection compared to prior post-hoc UQ methods. Our findings show that combining multi-layer evidential modeling with Max-WEnt provides a robust, efficient, and practical framework for trustworthy AI in high-stakes applications. The meta-model adds only ~0.8M parameters and trains in under four hours on a single 48GB GPU, making it practical for real-world deployment.

Subject: AAAI.2026 - Student Abstract and Poster Program


#9 Fusing Time-Domain and Constellation Views: A Multimodal MAE for Wireless Signals (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Agniva Banerjee, Arijit Sen

This paper introduces a multi-modal masked autoencoder (MMAE) that jointly denoises and classifies signals by fusing time-domain IQ sequences and constellation diagrams within a cross-attentive transformer. The approach treats noise as a learnable modality to enhance robustness, and employs a dynamic masking curriculum combined with domain-regularized training and a hybrid loss function to promote domain-invariant features. Experimentation on the RadioML 2018.01A and RadioML22 datasets demonstrates superior accuracy across different SNR levels while using substantially less labeled data than state-of-the-art approaches.

Subject: AAAI.2026 - Student Abstract and Poster Program


#10 Zero-Shot Vision Language Reasoning via Dual-layer Scene Graph Chain of Thoughts (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Yash Bansal, Parshiv Kapoor, Agam Pandey

Large Multimodal Models (LMMs) often hallucinate objects and struggle with compositional reasoning in complex visual scenes. Structured Scene Graph (SG) representations explicitly encoding objects, attributes, and relations can mitigate these issues; however, fine-tuning risks catastrophic forgetting. Recent zero-shot approaches prompt LMMs with scene graphs, yet typically rely on a single SG generated in one step, limiting capture of holistic context and question-specific details. We introduce a Dual-Layer Scene Graph Chain-of-Thought (DLSG-CoT) framework that enriches reasoning by combining two structured SGs: a Global Scene Graph (G-SG) that offers comprehensive image context, and a Query-Specific Scene Graph (Q-SG) produced through a two-step process targeting information relevant to the input query. Extensive experiments demonstrate that DLSG-CoT substantially improves LMM performance on compositional and context-sensitive tasks.

Subject: AAAI.2026 - Student Abstract and Poster Program


#11 3D Gaussian Splatting for Reconstructing Large Sparse Environments (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Jonathan Boel Nielsen, Xuan Huy Pham, Erdal Kayacan, Andriy Sarabakha

3D Gaussian splatting (3DGS) has recently demonstrated significant potential in computer vision, enabling high-fidelity 3D scene reconstruction with real-time rendering and fast training times. However, existing methods struggle in large, visually sparse environments with geometric self-similarity, due to heavy reliance on image-based feature matching and depth information. In this work, we propose a novel reconstruction pipeline that reduces the dependence on visual features by incorporating IMU and LiDAR data to generate accurate point clouds and robustly localize images within the scene. Global colorization is achieved through 3D-to-2D projections of the localized images, which are then used to supervise 3DGS training. Our results demonstrate that the proposed pipeline significantly enhances the quality of 3D reconstruction for large, sparse scenarios, opening up new opportunities for applications in remote mapping and autonomous inspection.

Subject: AAAI.2026 - Student Abstract and Poster Program


#12 Network Restoration Games with Quotas (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Philip Bogaars, Argyrios Deligkas, Eduard Eiben, Michail Fasoulakis

In a Network Restoration Game with Quotas, there is an underlying graph where a subset of its edges have to be restored by a set of agents. Each agent has a creation cost for each such edge, a traversal cost for every edge of the graph, and in addition a quota on the number of edges they have to restore. Then, given a set of edges that fulfill the quota, the cost of an agent is the cost of creating these edges, plus the cost of reaching them, i.e., the traversal cost. We prove that any cost-minimizing allocation is swap-stable, i.e., there is no profitable exchange of edges between any pair of agents, but computing one is hard even on trees. We complement this by designing an algorithm that finds a swap-stable allocation on trees in polynomial time and we quantify its cost against the optimal one.

Subject: AAAI.2026 - Student Abstract and Poster Program


#13 Always Refuse: Steering LLMs Against Jailbreaks with Contrastive Activations (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Abhilekh Borah, Niranjan Chebrolu, Kokil Jaidka

“Refusals must be resilient, not brittle.” Yet guarding refusals against adversarial phrasing and shifting user contexts remains difficult: large language models (LLMs) still yield to jailbreak prompts that evade safety filters and surface harmful content. We propose Refusal Activation Steering (RAS), a training-free, inference-time method that uses contrastive activations to shift LLM responses, biasing generation trajectories toward refusals without altering model weights. The approach is modular and domain-targetable, avoiding collateral refusals on benign queries while strengthening activation-space boundaries for unsafe content. On adversarial evaluations with an 8B instruction-tuned model, we find that steering improves refusal rate by ∼52% and reduces attack success rate by ∼40%, establishing a lightweight and interpretable safety layer for robust refusal consistency. To foster further research in this domain, we have made our implementation publicly available.
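The contrastive-activation idea behind methods like RAS can be illustrated with a toy sketch (hypothetical cached activations and a single layer; not the authors' implementation): the steering vector is the difference of mean activations on refusal versus compliance prompts, added to the hidden state at inference time.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # toy hidden size

# Hypothetical cached activations at one layer, from prompts the model
# refused vs. complied with (here just synthetic clusters)
refusal_acts = rng.normal(loc=1.0, size=(32, d))
comply_acts = rng.normal(loc=-1.0, size=(32, d))

# Contrastive steering vector: difference of mean activations, normalized
steer = refusal_acts.mean(axis=0) - comply_acts.mean(axis=0)
steer /= np.linalg.norm(steer)

def apply_steering(hidden, alpha=4.0):
    # Inference-time intervention: bias the hidden state toward the
    # refusal direction without touching model weights
    return hidden + alpha * steer

h = rng.normal(size=d)
h_steered = apply_steering(h)
# The steered state projects more strongly onto the refusal direction
print(h_steered @ steer > h @ steer)  # True
```

In a real deployment the vector would be extracted from, and added back into, a transformer's residual stream at a chosen layer via a forward hook; the arithmetic is exactly this simple.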

Subject: AAAI.2026 - Student Abstract and Poster Program


#14 PT-DCFR: Accelerating and Improving Deep CFR Using Population Based Training (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Dingzhong Cai, Huale Li, Hang Xiao, Shuhan Qi, Jiajia Zhang

Deep CFR enables end-to-end approximation of Nash equilibria in imperfect-information games (IIGs) but is sensitive to hyperparameters, making manual tuning inefficient. To address this, we propose PT-DCFR, which integrates Population-Based Training (PBT) with Deep CFR to dynamically optimize hyperparameters during training. Building upon this, we further introduce P2T-DCFR, which decouples parameter selection from model performance.

Subject: AAAI.2026 - Student Abstract and Poster Program


#15 AdaptDiff: Adaptive Guidance in Diffusion Models for Diverse and Identity-Consistent Face Synthesis (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Eduarda Caldeira, Tahar Chettaoui, Naser Damer, Fadi Boutros

Diffusion models conditioned on identity embeddings enable the generation of synthetic face images that consistently preserve identity across multiple samples. Recent work has shown that introducing an additional negative condition through classifier-free guidance during sampling provides a mechanism to suppress undesired attributes, thus improving inter-class separability. Building on this insight, we propose a dynamic weighting scheme for the negative condition that adapts throughout the sampling trajectory. This strategy leverages the complementary strengths of positive and negative conditions at different stages of generation, leading to more diverse yet identity-consistent synthetic data.

Subject: AAAI.2026 - Student Abstract and Poster Program


#16 Bi-Level Preference Optimization for Retrieval-Augmented Generation (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Author: Sizhong Cao

Retrieval-augmented generation (RAG) is the backbone of knowledge-intensive NLP, yet its progress is hindered by a long-standing asymmetry: Generators are refined while retrievers remain static, and full end-to-end optimization is prohibitively unstable. We present BPO-RAG, a bi-level preference-learning framework that redefines the training paradigm by jointly optimizing retrieval and generation with a single supervision signal, pairwise preferences. Stage 1 (Retrieval Preference Optimization) learns to select superior evidence sets, while Stage 2 (Generation Preference Optimization) aligns answer generation with the same evidence, closing the gap between what to read and what to write. This label-free recipe requires no reward model or online RL, integrates seamlessly with standard RAG pipelines, and transforms preferences into a unifying training currency. Across open-domain QA benchmarks, BPO-RAG consistently advances retrieval quality and yields more accurate, faithful answers, surpassing strong RAG baselines with remarkable stability. By coupling retrieval and generation under a unified preference framework, BPO-RAG establishes a practical and principled path toward the next generation of reliable, modular, and trustworthy knowledge-intensive language models.
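Pairwise-preference supervision of the kind described above is commonly implemented as a DPO-style objective; a minimal sketch of that generic loss (the paper's exact formulation may differ) applied to a chosen/rejected pair of log-probabilities:

```python
import numpy as np

def dpo_style_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    # Generic DPO-style pairwise preference loss: reward the policy for
    # ranking the preferred item above the rejected one, relative to a
    # frozen reference policy. The same signal can score either evidence
    # sets (retrieval) or answers (generation).
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -np.log(1.0 / (1.0 + np.exp(-margin)))  # -log sigmoid(margin)

# Toy log-probabilities: a policy that already prefers the chosen item
# incurs lower loss than one that prefers the rejected item
loss_good = dpo_style_loss(-1.0, -3.0, -2.0, -2.0)
loss_bad = dpo_style_loss(-3.0, -1.0, -2.0, -2.0)
print(loss_good < loss_bad)  # True
```

No reward model or online sampling appears anywhere in this objective, which is what makes preference-only training attractive for both levels of a RAG pipeline.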

Subject: AAAI.2026 - Student Abstract and Poster Program


#17 Discovering Linear Non-Gaussian Models for All Categories of Missing Data (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Matteo Ceriscioli, Shohei Shimizu, Karthika Mohan

Causal discovery is the task of learning causal models, encoding causal relationships, from a source of information, such as a dataset containing observational data. While many algorithms have been developed to discover causal models under varied sets of assumptions, the case in which the dataset is affected by missing data remains significantly underexplored. Naively applying standard causal discovery algorithms to listwise, test-wise, or regression-wise deleted datasets, or imputing the missing data, can introduce spurious associations between variables and bias function estimation in functional causal models. This issue arises when the data is missing at random or not at random. It ultimately invalidates the theoretical guarantees of these algorithms and prevents finding the true underlying causal model, even in the large-sample limit. An established family of causal models is the Linear Non-Gaussian Acyclic Model (LiNGAM), which assumes linear functional relationships and non-Gaussian independent noise terms. We propose a new causal discovery algorithm for LiNGAM, capable of recovering the underlying causal structure and providing unbiased estimates of the model’s parameters, even when the data is missing not at random (MNAR).

Subject: AAAI.2026 - Student Abstract and Poster Program


#18 WingBeats and Snapshots: Fusing Sound and Vision for Mosquito Monitoring (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Ahana Chanda, Akshay Agarwal

Accurate identification of mosquito species is crucial for controlling vector-borne diseases, yet visual or acoustic methods alone are often insufficient. We propose a multimodal deep-learning framework that combines high-resolution images with wingbeat audio using a SwinV2 vision transformer and an Audio Spectrogram Transformer, thereby capturing complementary cues. On a six-species dataset, it achieves 97% accuracy, comparable to the best single-modality baseline, and is designed to improve robustness under noise or environmental variation, demonstrating the value of integrating multiple data sources for reliable mosquito surveillance.

Subject: AAAI.2026 - Student Abstract and Poster Program


#19 AEFGL: Reverse Auction and Value Evaluation-Based Federated Graph Learning Incentive Mechanism (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Xin Chang, Lixin Liu, Jingyu Wang, Jinling Yu, Xiaolin Zhang

Federated Graph Learning enables multiple clients to collaboratively train graph models while protecting local private data. However, most studies have assumed that all clients contribute data voluntarily and actively. Without reasonable incentives, clients are often reluctant to contribute personal data for model training. Furthermore, the budget for incentives is limited, and if clients with low-quality graph data are incentivized to participate in training, it will negatively impact the training performance of all parties in the system. To address this, we propose AEFGL, a Reverse Auction and Value Evaluation-Based Incentive Mechanism for Federated Graph Learning. First, we design a reverse auction mechanism combining graph structural attribute motifs with client production value. Then, we propose a method for evaluating client production value based on the comparison of the client's expected reward and actual value. This mechanism can incentivize clients with high-quality graph data to participate in training within budget constraints, thereby improving the model quality. Experimental results validate the superiority of the AEFGL mechanism and the economic properties it satisfies.

Subject: AAAI.2026 - Student Abstract and Poster Program


#20 IMPACT: Integrated Multimodal Pipeline for Rapid Accident Causality Tracking (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Vashu Chauhan, Avinash Anand, Manisha Luthra, Uelison Jean Lopes dos Santos, Carsten Binnig, Rajiv Ratn Shah

Traffic accidents pose a significant societal challenge, with many fatalities being avoidable through timely emergency response. We introduce IMPACT (Integrated Multimodal Pipeline for Rapid Accident Causality Tracking), a scalable AI framework designed for autonomous, rapid traffic incident analysis using existing urban CCTV infrastructure. IMPACT combines a low-latency CPU-based vision module for real-time key-frame filtering (24 FPS) with the causal reasoning capabilities of MLLMs, reducing costly MLLM calls by over 92% compared to naive sparse sampling. We further present TRACE10K, a dataset featuring three-tier textual annotations that describe accident dynamics at the frame-sequence level.

Subject: AAAI.2026 - Student Abstract and Poster Program


#21 Sleep-Like Replay Reduces Loss-Landscape Sharpness to Improve Generalization (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Krishi Chawda, Jean Erik Delanois, Giri P Krishnan, Maksim Bazhenov

One of the central challenges in deep learning is that models trained on new tasks often overfit and lose the ability to generalize. This issue arises because gradient descent often converges to solutions in regions of the loss landscape that are sharp near their minima. High sharpness leads to rapid performance loss when test data are perturbed or statistically shifted. Although sharpness has been linked to generalization, few methods directly target it to improve generalization. Here we demonstrate that an unsupervised, sleep-like replay algorithm identifies low loss regions with lower sharpness leading to improvement in generalization to distortions, including Gaussian and salt-and-pepper noise. Our study identifies loss-function sharpness as a unifying measure for generalizable learning and robustness, and points to new principles for designing resilient AI systems.
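Sharpness as used above can be read as the worst-case loss increase within a small neighborhood of the trained weights. A toy Monte-Carlo estimate on a least-squares loss (illustrative only; not the paper's measurement protocol):

```python
import numpy as np

rng = np.random.default_rng(1)

def loss(w, X, y):
    # Simple quadratic (least-squares) loss standing in for a network loss
    return np.mean((X @ w - y) ** 2)

def sharpness(w, X, y, radius=0.05, n_samples=200):
    # Monte-Carlo estimate of the worst loss increase within a small
    # ball of the given radius around the weights w
    base = loss(w, X, y)
    worst = base
    for _ in range(n_samples):
        eps = rng.normal(size=w.shape)
        eps *= radius / np.linalg.norm(eps)  # project onto the ball's surface
        worst = max(worst, loss(w + eps, X, y))
    return worst - base

X = rng.normal(size=(100, 5))
y = X @ np.ones(5)                            # consistent linear system
w_min = np.linalg.lstsq(X, y, rcond=None)[0]  # exact minimizer of this loss
print(sharpness(w_min, X, y) >= 0)  # True
```

A flat minimum yields a small value here under perturbation or distribution shift, which is the property the replay algorithm is argued to promote; random directions only lower-bound the true worst case, which adversarial methods like SAM estimate more tightly.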

Subject: AAAI.2026 - Student Abstract and Poster Program


#22 Multi-Modal Interactive Control of Robotic Arm Based on Offline Large Language Models (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Author: Hanxiao Chen

Large Language Models (LLMs) have significantly transformed modern society through advanced interactions between humans and AI agents, yet most LLMs, including ChatGPT, are not open-source and require users to pay continuously for the service. Deploying open-source LLMs on local servers is therefore an efficient way to design and implement creative embodied-AI algorithms at lower cost and with stable, free usage. Motivated by this, we propose and implement “Socratic Models-ChatGLM”, a well-performing algorithm for multimodal interactive control of a robotic arm based on offline large language models via the PyBullet platform, which also shows strong potential for complex text–image, multi-step, long-horizon robotic manipulation tasks.

Subject: AAAI.2026 - Student Abstract and Poster Program


#23 Fine-Tuning Sample Order Matters in Propositional Logical Question-Answering (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Fengxiang Cheng, Chuan Zhou, Fenrong Liu, Robert van Rooij

Large language models (LLMs) have achieved impressive progress in natural language processing tasks but still struggle with complex logical reasoning. We observe that in propositional logic question-answering (QA), LLMs' performance varies with the order of training samples during fine-tuning. Motivated by this, we propose a data-driven approach to automatically determine the fine-tuning sample order, enhancing the logical QA performance of LLMs. Specifically, we first quantify the logical reasoning complexity of propositional reasoning samples and then stratify the training data into several subsets of ascending complexity. Subsequently, we fine-tune the LLMs on these subsets, progressing from low to high reasoning complexity. Experimental results demonstrate that our approach outperforms single-stage fine-tuning baselines across diverse reasoning benchmarks.
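The stratified ordering described above can be sketched as follows, using a hypothetical connective-count proxy for reasoning complexity (the paper's actual complexity measure is not specified here):

```python
def complexity(sample):
    # Toy proxy: count logical connectives in the propositional formula
    return sum(sample["formula"].count(op) for op in ("&", "|", "->", "~"))

def curriculum_stages(samples, n_stages=3):
    # Stratify into ascending-complexity subsets; fine-tune on each in
    # order, from low to high reasoning complexity
    ranked = sorted(samples, key=complexity)
    size = -(-len(ranked) // n_stages)  # ceiling division
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]

data = [
    {"formula": "p"},
    {"formula": "p & q -> r"},
    {"formula": "~p | q"},
    {"formula": "(p & q) | (~r -> s) & t"},
]
stages = curriculum_stages(data, n_stages=2)
print([[s["formula"] for s in stage] for stage in stages])
```

Each stage's subset would then be fed to a standard fine-tuning loop in sequence, so the model sees simple propositional inferences before deeply nested ones.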

Subject: AAAI.2026 - Student Abstract and Poster Program


#24 Multimodal Coarse-to-Local Transformer for End-to-End Autonomous Driving (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Yeryeong Cho, Joongheon Kim

End-to-end (E2E) autonomous driving must maintain global consistency while preserving local precision. However, existing E2E approaches rarely achieve both goals simultaneously. Therefore, we propose a multimodal coarse-to-local transformer (MC2L-Transformer), which is composed of a hierarchical transformer architecture. Multimodal inputs are fused into a shared embedding, and global waypoints are produced. Local refinement is then utilized to capture fine interactions around the vehicle. Furthermore, a temporal encoder summarizes recent context, and navigation target and velocity are embedded to guide route- and speed-aware decoding. We evaluate in CARLA, and the results show lower collision and off-route rates even under sudden events. These results indicate that combining a coarse-to-local hierarchical transformer with a lightweight temporal context provides a practical step toward reliable E2E autonomous driving.

Subject: AAAI.2026 - Student Abstract and Poster Program


#25 Constraint-Augmented Mongolian-Chinese Neural Machine Translation Based on Dynamic Feedback Alignment (Student Abstract) [PDF] [Copy] [Kimi] [REL]

Authors: Shuting Dai, Yatu Ji, Yanli Wang, Lei Shi, Qing-Dao-Er-Ji Ren, Nier Wu, Na Liu

The scarcity of parallel corpora for Mongolian and Chinese constrains the performance of Mongolian-Chinese neural machine translation (NMT), particularly manifesting in inadequate accuracy in translating specialized terminology. To address this limitation, this study adopts a lexically constrained augmentation strategy that constructs pseudo-source sentences by appending Chinese constraint words to Mongolian source texts, while enforcing the inclusion of these constraints in the output to improve translation accuracy. However, this approach presents two inherent drawbacks: processing pseudo-sentences with a single encoder tends to induce semantic interference, while the introduced constraint words may exacerbate alignment errors during decoding. To overcome these limitations, this paper proposes a Constraint-Augmented Mongolian-Chinese NMT method (CANMT) based on dynamic feedback alignment. The method employs a dual-encoder architecture to isolate bilingual representations, coupled with a dynamic feedback alignment module that progressively reduces alignment errors through iterative refinement, thereby enhancing overall translation performance.

Subject: AAAI.2026 - Student Abstract and Poster Program