AAAI.2026 - Demonstration Track

| Total: 71

#1 DTECT: Dynamic Topic Explorer & Context Tracker [PDF] [Copy] [Kimi] [REL]

Authors: Suman Adhya, Debarshi Kumar Sanyal

To address the challenge of interpreting evolving themes in temporal text, we present DTECT (Dynamic Topic Explorer & Context Tracker), an interactive, end-to-end system for uncovering thematic dynamics. The system integrates a complete pipeline that supports data preprocessing, multiple model architectures, and dedicated metrics to analyze temporal topic quality. To enhance interpretability, DTECT features LLM-driven automatic topic labeling, trend analysis, interactive visualizations with document summarization, and a natural language chat interface. This cohesive platform empowers users to intuitively explore how topics change over time.

Subject: AAAI.2026 - Demonstration Track


#2 QueryGym: Step-by-Step Interaction with Relational Databases [PDF] [Copy] [Kimi] [REL]

Authors: Haritha Ananthakrishnan, Harsha Kokel, Kelsey Sikes, Debarun Bhattacharjya, Michael Katz, Shirin Sohrabi, Kavitha Srinivas

We introduce QueryGym, an interactive environment for building, testing, and evaluating LLM-based query planning agents. Existing frameworks often tie agents to specific query language dialects or obscure their reasoning; QueryGym instead requires agents to construct explicit sequences of relational algebra operations, ensuring engine-agnostic evaluation and transparent step-by-step planning. The environment is implemented as a Gymnasium interface that supplies observations---including schema details, intermediate results, and execution feedback---and receives actions that represent database exploration (e.g., previewing tables, sampling column values, retrieving unique values) as well as relational algebra operations (e.g., filter, project, join). We detail the motivation and the design of the environment. In the demo, we showcase the utility of the environment by contrasting it with contemporary LLMs that query databases. QueryGym serves as a practical testbed for research in error remediation, transparency, and reinforcement learning for query generation.

Subject: AAAI.2026 - Demonstration Track
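The step-by-step relational-algebra interaction described above can be sketched as a minimal Gymnasium-style environment. This is an illustrative toy, not QueryGym's actual API: the class, observation keys, and action encoding are assumptions, and the Gymnasium dependency is mimicked rather than imported.

```python
# Illustrative sketch of a Gymnasium-style query-planning environment.
# Names and action encoding are hypothetical; the real QueryGym API may differ.

class ToyQueryEnv:
    """Agent builds a query as an explicit sequence of relational-algebra steps."""

    def __init__(self, table, schema):
        self.table = table      # list of row dicts (a toy in-memory relation)
        self.schema = schema    # column names
        self.result = None

    def reset(self):
        self.result = list(self.table)
        # Observation: schema details plus a preview of intermediate results.
        return {"schema": self.schema, "preview": self.result[:3]}, {}

    def step(self, action):
        op, arg = action
        if op == "filter":      # arg: predicate over a row
            self.result = [r for r in self.result if arg(r)]
        elif op == "project":   # arg: columns to keep
            self.result = [{c: r[c] for c in arg} for r in self.result]
        obs = {"schema": self.schema, "preview": self.result[:3]}
        done = op == "done"
        return obs, 0.0, done, False, {}

env = ToyQueryEnv(
    table=[{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Lima"}],
    schema=["id", "city"],
)
obs, _ = env.reset()
obs, _, _, _, _ = env.step(("filter", lambda r: r["city"] == "Oslo"))
obs, _, _, _, _ = env.step(("project", ["city"]))
print(obs["preview"])   # [{'city': 'Oslo'}]
```

Because every action is an explicit relational-algebra step, the resulting plan is engine-agnostic and each intermediate result is observable, which is the transparency property the abstract emphasizes.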


#3 ARGUS: Towards End-to-End Argument Mining with Large Language Models [PDF] [Copy] [Kimi] [REL]

Authors: Ettore Caputo, Sergio Greco, Lucio La Cava

We present ARGUS, an end-to-end Argument Mining (AM) tool that exploits Large Language Models (LLMs) to automatically perform all core AM tasks, i.e., Argument Component Segmentation, Classification, Relation Identification, and Relation Classification. Furthermore, ARGUS builds the corresponding argumentation framework (AF) and seamlessly integrates symbolic solvers to compute extensions and perform formal reasoning. ARGUS is designed to ensure broad flexibility and usability, supporting any open-source or commercial LLM and symbolic solver, and providing a ready-to-use platform for exploring neuro-symbolic approaches to argumentation in both research and practical applications.

Subject: AAAI.2026 - Demonstration Track


#4 InTimeAD: Interactive Time Series Anomaly Detection [PDF] [Copy] [Kimi] [REL]

Authors: Louis Carpentier, Wannes Meert, Mathias Verbeke

Time series anomaly detection has received substantial attention over the past two decades, leading to the development of hundreds of algorithms. However, comprehensively understanding this vast landscape remains challenging, particularly for non-experts and novices. In this demonstration paper, we present InTimeAD, an interactive web application that provides access to more than 30 state-of-the-art time series anomaly detection algorithms. InTimeAD is intended to explore the performance of existing as well as custom anomaly detection models in an interactive, hands-on manner. By lowering the barrier to entry, we support practitioners overwhelmed by the large number of existing techniques, while providing a platform for researchers to rapidly analyze their novel anomaly detection algorithms.

Subject: AAAI.2026 - Demonstration Track


#5 PAL: Personal Adaptive Learner [PDF] [Copy] [Kimi] [REL]

Authors: Megha Chakraborty, Darssan L. Eswaramoorthi, Madhur Thareja, Het Riteshkumar Shah, Finlay Palmer, Aryaman Bahl, Michelle A Ihetu, Amit Sheth

AI-driven education platforms have made some progress in personalization, yet most remain constrained to static adaptation—predefined quizzes, uniform pacing, or generic feedback—limiting their ability to respond to learners’ evolving understanding. This shortfall highlights the need for systems that are both context-aware and adaptive in real time. We introduce PAL (Personal Adaptive Learner), an AI-powered platform that transforms lecture videos into interactive learning experiences. PAL continuously analyzes multimodal lecture content and dynamically engages learners through questions of varying difficulty, adjusting to their responses as the lesson unfolds. At the end of a session, PAL generates a personalized summary that reinforces key concepts while tailoring examples to the learner’s interests. By uniting multimodal content analysis with adaptive decision-making, PAL contributes a novel framework for responsive digital learning. Our work demonstrates how AI can move beyond static personalization toward real-time, individualized support, addressing a core challenge in AI-enabled education.

Subject: AAAI.2026 - Demonstration Track


#6 SHARE: Synthesizing Heterogeneous Autism-support Records into Evidence-based Recommendations [PDF] [Copy] [Kimi] [REL]

Authors: Saumya Chauhan, Mila Hong

Supporting children with Autism Spectrum Disorder (ASD) requires highly individualized knowledge. However, critical information is often dispersed across documents such as Individualized Education Plans (IEPs), diagnostic assessments, and caregiver notes. Thus, we propose SHARE (Synthesizing Heterogeneous Autism-support Records into Evidence-based Recommendations), a framework that combines diverse autism-related documents into a concise, actionable set of recommendations for caregivers of children with ASD. Recommendations are generated using OpenAI’s large language model API, grounded in user-provided evidence (with optional web-based retrieval for missing details), and linked to citations. After caregivers attempt and then rate recommendations, SHARE uses a Bayesian bandit algorithm with Upper Confidence Bound (UCB) re-ranking to refine future advice. While previous work mostly focuses on drafting static goals, SHARE additionally combines LLM-generated recommendations, caregiver feedback, and interpretable ranking into a pipeline that can adapt over time.

Subject: AAAI.2026 - Demonstration Track
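The UCB re-ranking step mentioned above can be illustrated with the classic UCB1 rule. This is a generic sketch of the technique, not SHARE's implementation: the abstract does not specify its Bayesian bandit or scoring details, and all names here are hypothetical.

```python
import math

# Generic UCB1 re-ranking sketch (illustrative only; SHARE's actual
# Bayesian bandit and rating scale are not specified in the abstract).

def ucb_rank(stats, total_trials, c=1.4):
    """Rank recommendation 'arms' by mean rating plus an exploration bonus."""
    scores = {}
    for arm, (n, reward_sum) in stats.items():
        if n == 0:
            scores[arm] = float("inf")   # surface never-rated recommendations first
        else:
            mean = reward_sum / n
            scores[arm] = mean + c * math.sqrt(math.log(total_trials) / n)
    return sorted(stats, key=lambda a: scores[a], reverse=True)

# arm -> (times shown, sum of caregiver ratings normalized to [0, 1])
stats = {
    "visual-schedule": (10, 8.0),   # well-rated, heavily explored
    "social-story": (3, 1.5),       # mixed ratings, little data
    "new-tip": (0, 0.0),            # never shown yet
}
print(ucb_rank(stats, total_trials=13))
```

The exploration bonus shrinks as an arm accumulates ratings, so the ranking balances exploiting recommendations caregivers already rated highly against trying under-explored ones.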


#7 SmartEyes: Plug-and-Play Event Detection for Retail Loss Prevention [PDF] [Copy] [Kimi] [REL]

Authors: Pi-Wei Chen, Jerry Chun-Wei Lin, Barış Fahri Kahrıman, Zih-Ching Chen, Rafał Cupek, Marek Drewniak

Event detection is essential for surveillance, particularly in retail loss prevention, where accurate and timely monitoring is critical. Vision Language Models (VLMs) provide strong generalization but are inefficient at processing full video streams and are prone to hallucinations induced by redundant frames. We present SmartEyes, a plug-and-play system for real-time retail surveillance. SmartEyes introduces the Perception Cognition Focusing (PCF) framework, which combines lightweight perception with semantic triggering to isolate two keyframes (customer contact and departure) and constrains the VLMs to a focused differencing task. This design reduces hallucination by 44% compared to vanilla VLMs. Beyond the demonstrated retail application, the proposed perception-to-reasoning pipeline is general and extends directly to industrial environments that require reliable event detection and real-time decision-making. Our demo includes a user-friendly Region of Interest (ROI) selection interface and live CCTV monitoring, producing accurate alerts within 1–2 seconds on a single RTX 4080 GPU. This lightweight framework design enables efficient deployment to broader industrial applications.

Subject: AAAI.2026 - Demonstration Track


#8 PHOTONS: Pose-Free Human-Centric Photo-Realistic Real-Time Novel View Synthesis from Sparse Views [PDF] [Copy] [Kimi] [REL]

Authors: Yongyang Cheng, Boqin Qin, Zhao Hui, Xu Chen, Tao Zhang, Shang Sun, Haiquan Kang, Xiaojie Xu, Junwei Lv, Lei Yang, Xinyu Liu, Feng Jiang

We present PHOTONS (Pose-Free Human-Centric Photo-Realistic Real-Time Novel View Synthesis from Sparse Views), a real-time framework for novel view synthesis without requiring camera calibration. Our method reconstructs consistent 3D Gaussian point clouds and synthesizes 2K photo-realistic novel views from an arbitrary number (>=2) of freely placed cameras. PHOTONS faithfully renders dynamic human bodies amid complex backgrounds, including interactive object manipulation and fine-grained details (e.g., hair strands), while maintaining 25 FPS throughput on a commodity GPU such as the NVIDIA RTX 4090. By combining pose-free spatial point cloud reconstruction with Gaussian parameter estimation, our method demonstrates strong resilience to occlusions and camera perturbations. Additionally, we develop a 3D stereo system that drastically reduces setup complexity compared to existing solutions. Experiments on public and custom datasets show that PHOTONS outperforms state-of-the-art methods in both efficiency and visual quality.

Subject: AAAI.2026 - Demonstration Track


#9 Wikontic: A Tool for Building Knowledge Graphs from Text Aligned with the Wikidata Ontology [PDF] [Copy] [Kimi] [REL]

Authors: Alla Chepurova, Aydar Bulatov, Mikhail Burtsev, Yuri Kuratov

Knowledge Graphs (KGs) provide structured, verifiable representations that ground facts and supply large language models (LLMs) with reliable real-world information. Building high-quality KGs from open-domain text remains difficult due to redundancy, inconsistency, and lack of ontology grounding. We present Wikontic, a pipeline that extracts triples from text with LLMs and refines them through ontology-based typing, schema validation, and entity deduplication, yielding compact and coherent graphs. Unlike prior frameworks that lack ontology grounding or perform only partial deduplication, Wikontic uniquely integrates entity canonicalization, alias tracking, and automatic enforcement of Wikidata’s ontology, enabling robust schema-aware construction without manual schema design. Its web interface lets users upload text, visualize graphs, and perform multi-hop question answering. By combining LLM flexibility with Wikidata’s ontological rigor, Wikontic transforms ambiguous text into structured, interpretable, and actionable knowledge.

Subject: AAAI.2026 - Demonstration Track


#10 Docora: A System for Interactive Knowledge Extraction and Visualization from Scientific PDFs [PDF] [Copy] [Kimi] [REL]

Authors: Dinh-Truong Do, Hoang-An Trieu, Van-Thuy Phi, Le-Minh Nguyen, Yuji Matsumoto

Scientific research articles, typically distributed in PDF format, contain valuable knowledge but remain challenging to convert into structured datasets due to fragmented workflows that separate parsing, annotation, and visualization. Existing annotation platforms operate on plain text, which requires an additional PDF-to-text conversion step before annotation, while PDF parsing tools lack automated annotation suggestions. To bridge this gap, we introduce Docora, a system that unifies PDF parsing, automated annotation assistance, and multi-view visualization into a single interactive platform. Docora enables researchers to configure entity and relation schemas for any domain, automatically generates initial annotations using rule-based, model-based, or LLM-based extractors, and provides synchronized visualizations across PDF, text, and graph views. Users can refine annotations directly on the PDF canvas, ensuring consistency between document layout and structured representations. The system’s source code is publicly available to facilitate further research and development.

Subject: AAAI.2026 - Demonstration Track


#11 Traffic Signal Plans Explorer: A General Framework for Visualising Traffic Evolution [PDF] [Copy] [Kimi] [REL]

Authors: Francesco Doria, Francesco Percassi, Marco Maratea, Mauro Vallati

We present the Traffic Signal Plans Explorer, a framework for visualising and exploring traffic signal plans generated via PDDL+ planning. Designed to support both traffic experts and non-specialists, the tool offers a web-based interface for high-level network analysis and a SUMO-based adapter for detailed simulation. Users can inspect junction settings and link dynamics, and simulate plan execution step by step. The system bridges planning technology with practical traffic control, enhancing the transparency and usability of automatically generated solutions.

Subject: AAAI.2026 - Demonstration Track


#12 PAGER: Proactive Monitoring Agent for Enterprise AI Assistant [PDF] [Copy] [Kimi] [REL]

Authors: Sujan Dutta, Junior Francisco Garcia Ayala, Pranav Pujar, Sai Sree Harsha, Dan Luo, Nikhil Vasudeva, Bikas Saha, Pritom Baruah, Yunyao Li

We present a Proactive Monitoring Agent designed for large-scale customer data platforms, such as Adobe Experience Platform (AEP), to predict and prevent workflow disruptions before they impact business operations. Unlike existing reactive solutions that assist engineers only after failures occur, our agent anticipates potential failures across multiple workflow stages, explains its predictions in natural language, and interacts with customer support engineers through a conversational interface. The system integrates a machine learning-based Prediction Module, Knowledge Graph APIs for contextual data access, and a Query Processor that powers an interactive Q&A experience, enabling timely and actionable insights to minimize operational risks and maximize business continuity.

Subject: AAAI.2026 - Demonstration Track


#13 IntelliProof: An Argumentation Network-based Conversational Helper for Organized Reflection [PDF] [Copy] [Kimi] [REL]

Authors: Kaveh Eskandari Miandoab, Katharine Kowalyshyn, Kabir Pamnani, Anesu Gavhera, Vasanth Sarathy, Matthias Scheutz

We present IntelliProof, an interactive system for analyzing argumentative essays through LLMs. IntelliProof structures an essay as an argumentation graph, where claims are represented as nodes, supporting evidence is attached as node properties, and edges encode supporting or attacking relations. Unlike existing automated essay scoring systems, IntelliProof emphasizes the user experience: each relation is initially classified and scored by an LLM, then visualized for enhanced understanding. The system provides justifications for classifications and produces quantitative measures for essay coherence. It enables rapid exploration of argumentative quality while retaining human oversight. In addition, IntelliProof provides a set of tools for a better understanding of an argumentative essay and its corresponding graph in natural language, bridging the gap between the structural semantics of argumentative essays and the user's understanding of a given text.

Subject: AAAI.2026 - Demonstration Track


#14 Next-Generation Metalens Vision System: Powered by AI and Applied to AI [PDF] [Copy] [Kimi] [REL]

Authors: Fen Fang, Muli Yang, Henan Wang, Xinan Liang, Tobias Mass, Xuewu Xu, Xulei Yang, Zhengguo Li

Metalenses have been widely recognized as a key building block of next-generation optical systems, offering unprecedented advantages in compactness, lightweight design, and scalable manufacturing compared to traditional refractive optics. Despite this promise, practical use is limited by optical aberrations, blur, and illumination sensitivity, which degrade both visual quality and machine perception. In this demonstration, we present an end-to-end metalens vision system—from hardware sensing with a custom-built RGB metalens camera, to physics-informed imaging and real-time restoration, and finally to downstream vision applications such as object detection and depth estimation. By integrating spatially-aware attention enhancement and reinforcement learning-based illumination control into a real-time system, our solution transforms degraded raw captures into high-fidelity images that are both visually interpretable and functionally reliable for machine vision. This AI-powered pipeline highlights metalenses as a cornerstone for next-generation imaging, where advances in optics and machine intelligence jointly drive the future of visual perception.

Subject: AAAI.2026 - Demonstration Track


#15 DFAgent: From Natural Language Data Interactions to Reusable Agent-Ready Tools [PDF] [Copy] [Kimi] [REL]

Authors: Neelamadhav Gantayat, Renuka Sindhgatta, Sambit Ghosh, Sameep Mehta, Soujanya Soni

We present DataFoundry Agent (DFAgent), a system that forges reusable, agent-ready tools from interactive data exploration, quality, and remediation tasks. Users engage with data through natural-language prompts for operations that include inspection, transformation, and visualization. These interactions automatically generate executable code snippets that are logged. From these snippets, DFAgent acts as a foundry, synthesizing a governed catalog of enriched tools exposed via the Model Context Protocol (MCP). In this way, user-derived logic for all data operations is transformed into standardized, composable tools without reimplementation. We demonstrate how diverse interactions accumulate into a reusable toolset, highlighting a paradigm that unifies natural language interaction, executable code generation, and tool foundry processes for agentic data systems.

Subject: AAAI.2026 - Demonstration Track


#16 In-Situ Eval: A Modular Framework for Custom and Real-Time RAG Benchmarking [PDF] [Copy] [Kimi] [REL]

Authors: Ritvik Garimella, Kaushik Roy, Chathurangi Shyalika, Amit Sheth

Retrieval-Augmented Generation (RAG) has become the standard approach for integrating domain knowledge into Large Language Models (LLMs). However, fair comparison of RAG pipelines remains difficult: data preparation is often ad hoc, subsampling methods are opaque, parameters vary across implementations, and evaluation is fragmented. We present In-Situ Eval, a unified and reproducible framework that operationalizes the full RAG pipeline with configurable subsampling strategies and both RAG-specific and generic evaluation metrics. The platform supports two execution modes: an offline Dataset mode for evaluating precomputed outputs, and a live Retrieval mode for benchmarking RAG variants with state-of-the-art LLMs. Users can flexibly select datasets, retrieval techniques, models, and metrics, enabling side-by-side comparisons, ablations, and targeted analyses. This holistic approach reduces computational costs, clarifies the impact of subsampling techniques, and provides actionable insights for real-world deployments. By facilitating transparent, customizable, and interactive benchmarking, In-Situ Eval empowers both researchers and practitioners to make informed decisions in adapting RAG pipelines to domain-specific needs.

Subject: AAAI.2026 - Demonstration Track


#17 Federated Learning Playground [PDF] [Copy] [Kimi] [REL]

Authors: Bryan Shan Guanrong, Alysa Ziying Tan, Han Yu

We present Federated Learning Playground, an interactive browser-based platform that is inspired by and extends TensorFlow Playground to teach core Federated Learning (FL) concepts. Users can experiment with heterogeneous client data distributions, model hyperparameters, and aggregation algorithms directly in the browser without coding or system setup, and observe their effects on client and global models through real-time visualizations, gaining intuition for challenges such as non-IID data, local overfitting, and scalability. The playground serves as an easy-to-use educational tool, lowering the entry barrier for newcomers to distributed AI while also offering a sandbox for rapidly prototyping and comparing FL methods. By democratizing exploration of FL, it promotes broader understanding and adoption of this important paradigm.

Subject: AAAI.2026 - Demonstration Track
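The kind of aggregation step a user can watch in such a playground can be sketched with federated averaging (FedAvg), the standard FL baseline. This is a minimal illustrative example, not the playground's code; the function name and the flat-list parameter representation are assumptions.

```python
# Minimal FedAvg aggregation sketch (illustrative; not the playground's code).

def fedavg(client_weights, client_sizes):
    """Average client model parameters, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Two clients with non-IID data produce diverging local models; the
# global model is pulled toward the client holding more data.
clients = [[1.0, 0.0], [0.0, 1.0]]
sizes = [30, 10]                 # client 0 holds 3x more data
print(fedavg(clients, sizes))    # [0.75, 0.25]
```

Even this two-client example conveys the non-IID intuition the playground visualizes: when local models diverge, the global average sits between them, weighted toward larger clients.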


#18 TPR: A Training Procedure Representation to Augment XR Simulations with LLMs [PDF] [Copy] [Kimi] [REL]

Authors: Michael Guevarra, Christabel Wayllace, Srijita Das, Carrie Demmans Epp, Alan Tay

Extended reality (XR) is well suited to support the situated learning of technical procedures. At the same time, AI-driven intelligent tutoring systems (ITS) can complement XR by providing adaptive pedagogical support. Many domains would benefit from this combination, especially when trainers, equipment, or team members are limited. We present a domain-agnostic XR-based ITS that integrates a training procedure representation (TPR), XR simulation, and an LLM-driven instructor. We demonstrate the tutor's use for tissue sample handling and engine repair, showing how it delivers adaptive feedback, collaborative roleplay, and dynamic scenario management to create realistic and pedagogically meaningful training experiences.

Subject: AAAI.2026 - Demonstration Track


#19 3D4D: An Interactive, Editable, 4D World Model via 3D Video Generation [PDF] [Copy] [Kimi] [REL]

Authors: Yunhong He, Zhengqing Yuan, Zhengzhong Tu, Yanfang Ye, Lichao Sun

We introduce DreamLand, an interactive 4D visualization framework that integrates WebGL with Supersplat rendering. It transforms static images and text into coherent 4D scenes through four core modules and employs a foveated rendering strategy for efficient, real-time multi-modal interaction. This framework enables adaptive, user-driven exploration of complex 4D environments.

Subject: AAAI.2026 - Demonstration Track


#20 Auto-BenchmarkCard: Automated Synthesis of Benchmark Documentation [PDF] [Copy] [Kimi] [REL]

Authors: Aris Hofmann, Inge Vejsbjerg, Dhaval Salwala, Elizabeth M. Daly

We present Auto-BenchmarkCard, a workflow for generating validated descriptions of AI benchmarks. Benchmark documentation is often incomplete or inconsistent, making it difficult to interpret and compare benchmarks across tasks or domains. Auto-BenchmarkCard addresses this gap by combining multi-agent data extraction from heterogeneous sources (e.g., Hugging Face, Unitxt, academic papers) with LLM-driven synthesis. A validation phase evaluates factual accuracy through atomic entailment scoring using the FactReasoner tool. This workflow has the potential to promote transparency, comparability, and reusability in AI benchmark reporting, enabling researchers and practitioners to better navigate and evaluate benchmark choices.

Subject: AAAI.2026 - Demonstration Track


#21 RAPID: A Rapid Prototyping Platform for Industrial Automation [PDF] [Copy] [Kimi] [REL]

Authors: Sunghoon Hong, Junseok Park, Whiyoung Jung, Deunsol Yoon, Woohyung Lim, Soonyoung Lee, Kanghoon Lee

Industrial automation in smart logistics and factories requires simulation platforms that support rapid environment building before costly physical deployment. Yet existing tools often require substantial expertise, complex setup, and long configuration times, hindering agile prototyping. We present RAPID, a simulation platform with two components: layout design, which enables intuitive visual configuration of factory layouts, and behavior simulation and validation, which allows users to attach behavior models and evaluate system performance. RAPID lowers the entry barrier to industrial simulation, letting users apply existing behavior models or trained reinforcement learning (RL) agents to new layouts with minimal effort. This approach lets practitioners prototype facilities in minutes rather than weeks and gives researchers a standardized environment for benchmarking multi-agent RL and coordination algorithms. By combining rapid design with simulation-based validation, RAPID accelerates automation development from concept to implementation.

Subject: AAAI.2026 - Demonstration Track


#22 GPTKB v1.5: A Massive Knowledge Base for Exploring Factual LLM Knowledge [PDF] [Copy] [Kimi] [REL]

Authors: Yujia Hu, Tuan-Phong Nguyen, Shrestha Ghosh, Moritz Müller, Simon Razniewski

Language models are powerful artifacts, yet their factual knowledge is still poorly understood, and inaccessible to ad-hoc browsing and scalable statistical analysis. This demonstration introduces GPTKB v1.5, a densely interlinked 100-million-triple knowledge base (KB) built for $14,000 from GPT-4.1, using the GPTKB methodology for massive-recursive LLM knowledge materialization. This demo focuses on three use cases: (1) link-traversal-based LLM knowledge exploration, (2) SPARQL-based structured LLM knowledge querying, (3) comparative exploration of the strengths and weaknesses of LLM knowledge. Massive-recursive LLM knowledge materialization is a groundbreaking opportunity both for the systematic analysis of LLM knowledge, as well as for automated KB construction.

Subject: AAAI.2026 - Demonstration Track


#23 Evaluating the Factuality of Large Language Models Using Multiple Plug-and-Play Fact Sources [PDF] [Copy] [Kimi] [REL]

Authors: Zhaoheng Huang, Yutao Zhu, Jirong Wen, Zhicheng Dou

Large language models (LLMs) often produce factually inaccurate content, or hallucinations, which undermines their reliability. Existing factuality evaluation systems usually rely on a single predefined fact source, making them task-specific and hard to extend. We present UFO, a unified framework for factuality evaluation that supports multiple plug-and-play fact sources. UFO integrates human-written evidence, web search results, and LLM knowledge within a single evaluation pipeline, and allows users to flexibly select, reorder, and even define customized sources. The system is accessible through both a Python interface and a web-based demo, offering interactive claim-level verification and visualization. Experiments show that the UFO system achieves moderate consistency with human annotations. Overall, UFO serves as a transparent and extensible platform for benchmarking fact sources, comparing LLMs, and enabling real-world fact-checking applications across diverse domains.

Subject: AAAI.2026 - Demonstration Track
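The plug-and-play fact-source idea can be sketched as a small pipeline where sources share an interface and are consulted in user-chosen order. All class and method names here are hypothetical illustrations, not UFO's actual API, and the LLM source is a trivial stand-in for a real model call.

```python
# Hypothetical sketch of a plug-and-play fact-source pipeline in the
# spirit of UFO; names are illustrative, not UFO's actual API.

class FactSource:
    def verify(self, claim):
        """Return True/False for a verdict, or None for 'no verdict'."""
        raise NotImplementedError

class HumanEvidence(FactSource):
    def __init__(self, facts):
        self.facts = set(facts)
    def verify(self, claim):
        return True if claim in self.facts else None   # abstain if unseen

class LLMKnowledge(FactSource):
    def verify(self, claim):
        return "capital" in claim    # toy stand-in for querying a model

def check(claim, sources):
    """Consult sources in user-chosen order; the first verdict wins."""
    for src in sources:
        verdict = src.verify(claim)
        if verdict is not None:
            return verdict, type(src).__name__
    return None, "unresolved"

pipeline = [HumanEvidence({"Paris is the capital of France"}), LLMKnowledge()]
print(check("Paris is the capital of France", pipeline))
```

Because every source implements the same `verify` interface, users can reorder the list or drop in a custom source (e.g., a web-search wrapper) without touching the evaluation loop, which is the extensibility the abstract describes.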


#24 AirNavigation: Let UAV Navigation Tell Its Own Story [PDF] [Copy] [Kimi] [REL]

Authors: Jianyu Jiang, Zequan Wang, Liang Yao, Shengxiang Xu, Fan Liu

Testing autonomous navigation algorithms of Unmanned Aerial Vehicles (UAVs) in real-world scenarios often entails significant safety risks. In this paper, we aim to build a flexible yet user-friendly UAV autonomous navigation simulator. Ideally, it should closely emulate real-world environments, support diverse UAV models and algorithms, and provide a flexible evaluation framework. Existing frameworks fail to satisfy all three requirements simultaneously. To this end, we present AirNavigation, an integrated simulation platform designed to support the end-to-end workflow of UAV navigation research. Specifically, our system leverages Unreal Engine to simulate highly realistic environments and diverse UAV models. It further facilitates semi-automated scene generation and multi-modal synthetic training data production. To lower the barrier to adoption, we develop a suite of user-friendly interfaces to enable seamless integration of diverse navigation algorithms. Moreover, we introduce a novel evaluation system powered by large language models to deliver personalized and fine-grained performance analysis.

Subject: AAAI.2026 - Demonstration Track


#25 GeoProblem Factory: A Visual Interaction System for Solvable and Controllable Geometric Problem Generation by Leveraging Symbolic Deduction Engine [PDF] [Copy] [Kimi] [REL]

Authors: Zhuoxuan Jiang, Yanpeng Li, Tianyang Zhang, Jing Chen, Yong Li, Mo Guang, Wen Si, Shaohua Zhang

We propose a novel system, GeoProblem Factory, designed to effectively generate high-quality geometry problems for intelligent education. The system enables teachers and students to efficiently produce batches of geometry problems, either to save time and manual effort or to support personalized learning. Generating geometry problems is particularly challenging, as it requires ensuring both solvability and controllability from a pedagogical perspective. To address these issues, we adopt a state-of-the-art pipeline method based on a symbolic deduction engine and develop a visual interaction demo. This demo allows users to easily refine the generated problems through visual operations. It provides two modes for inputting controllable information: specifying knowledge points or supplying a reference problem. Moreover, the system can automatically generate a preliminary geometric diagram corresponding to each problem for further refinement. Through human–machine interaction, the system can produce high-quality geometry problems more efficiently than previously possible.

Subject: AAAI.2026 - Demonstration Track