The opacity of deep neural networks remains a challenge when deploying solutions where explanation is as important as precision. We present ConceptX, a human-in-the-loop framework for interpreting and annotating the latent representational space of pre-trained Language Models (pLMs). We use an unsupervised method to discover the concepts learned in these models and provide a graphical interface for humans to generate explanations for the concepts. To facilitate the process, we provide auto-annotations of the concepts (based on traditional linguistic ontologies). Such annotations enable the development of a linguistic resource that directly represents the latent concepts learned within deep NLP models. These include not just traditional linguistic concepts, but also task-specific or sensitive concepts (words grouped by gender or religious connotation) that help annotators mark bias in the model. The framework consists of two parts: (i) concept discovery and (ii) an annotation platform.
This demonstration introduces SOREO, a system that explores the possibility of extending UAV autonomy through machine learning. It contributes to the following problem: given a fleet of drones and a geographic area, how can the shortest paths between any point and the base points be learned for optimal and safe package delivery? Starting from a set of possible actions, a virtual model of a geographic location of interest, e.g., a city, and a reward value, SOREO is capable of learning not only how to prevent collisions with obstacles, e.g., walls and buildings, but also how to find the shortest path between any two points, i.e., the base and the target. SOREO is based on the Q-learning algorithm.
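The route-learning idea behind SOREO can be illustrated with a tabular Q-learning sketch on a toy grid world, where `#` cells stand in for buildings. The grid layout, reward values, and hyperparameters below are illustrative assumptions, not SOREO's actual configuration:

```python
import random

def train_q(grid, start, goal, episodes=3000, alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular Q-learning on a grid map; '#' cells are obstacles."""
    rows, cols = len(grid), len(grid[0])
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    Q = {}
    rng = random.Random(0)
    for _ in range(episodes):
        s = start
        for _ in range(200):
            a = (rng.randrange(4) if rng.random() < eps
                 else max(range(4), key=lambda i: Q.get((s, i), 0.0)))
            r, c = s[0] + moves[a][0], s[1] + moves[a][1]
            if not (0 <= r < rows and 0 <= c < cols) or grid[r][c] == '#':
                s2, reward = s, -1.0       # bumping a wall or obstacle
            elif (r, c) == goal:
                s2, reward = (r, c), 10.0  # reached the target
            else:
                s2, reward = (r, c), -0.1  # step cost favours short paths
            best_next = max(Q.get((s2, i), 0.0) for i in range(4))
            Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (
                reward + gamma * best_next - Q.get((s, a), 0.0))
            s = s2
            if s == goal:
                break
    return Q

def greedy_path_length(Q, grid, start, goal, max_steps=50):
    """Follow the learned greedy policy; returns steps to goal or None."""
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    s = start
    for step in range(1, max_steps + 1):
        a = max(range(4), key=lambda i: Q.get((s, i), 0.0))
        r, c = s[0] + moves[a][0], s[1] + moves[a][1]
        if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] != '#':
            s = (r, c)
        if s == goal:
            return step
    return None
```

On a small map the learned greedy policy recovers a shortest obstacle-free route; the negative step reward is what makes shorter paths accumulate higher value.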
A common musical composition practice is to develop musical pieces using variations of musical themes. In this study, we present an interactive tool which can generate variations of musical themes in real-time using a variational autoencoder model. Our tool is controllable via semantically meaningful musical attributes, using a latent space regularisation technique to increase the explainability of the model. The tool is integrated into an industry standard digital audio workstation - Ableton Live - using the Max4Live device framework and can run locally on an average personal CPU rather than requiring a costly GPU cluster. In this way we demonstrate how cutting-edge AI research can be integrated into the existing workflows of professional and practising musicians for use in the real world beyond the research lab.
We demonstrate Dagster, a system that implements a new approach to scheduling interdependent (Boolean) SAT search activities in high-performance computing (HPC) environments. Our system takes as input a set of disjunctive clauses (i.e., DIMACS CNF) and a labelled directed acyclic graph (DAG) structure describing how the clauses are decomposed into a set of interrelated problems. Component problems are solved using standard systematic backtracking search, which may optionally be coupled to (stochastic dynamic) local search and/or clause-strengthening processes. We demonstrate Dagster using a new Graph Maximal Determinant combinatorial case study. This demonstration paper presents a new case study, and is adjunct to the longer accepted manuscript at the Pacific Rim International Conference on Artificial Intelligence (2022).
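To make Dagster's input concrete (this is a toy illustration, not its HPC solver), the sketch below checks satisfiability of small DIMACS-style clause sets by brute force and orders component problems with a topological sort of the decomposition DAG. The clause sets and node names are invented for illustration:

```python
from itertools import product

def satisfiable(clauses, n_vars):
    """Brute-force check. Clauses use DIMACS-style literals:
    a positive int means the variable is true, negative means false."""
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause)
               for clause in clauses):
            return True
    return False

def topo_order(dag):
    """Kahn's algorithm; dag maps each node to its list of successors.
    Dagster-like schedulers solve component problems in such an order."""
    indeg = {u: 0 for u in dag}
    for u in dag:
        for v in dag[u]:
            indeg[v] += 1
    frontier = [u for u in dag if indeg[u] == 0]
    order = []
    while frontier:
        u = frontier.pop()
        order.append(u)
        for v in dag[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                frontier.append(v)
    return order
```

A real deployment replaces the brute-force check with systematic backtracking search (optionally coupled to local search and clause strengthening, as the paper describes) and distributes the DAG nodes across HPC workers.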
This paper presents AI-SNIPS (AI Support for Network Intelligence-based Pharmaceutical Security), a production-ready platform that enables stakeholder decision-making, secure data sharing, and interdisciplinary research in the fight against Illicit, Substandard, and Falsified Medical Products (ISFMP). AI-SNIPS takes as input cases: a case consists of one or more URLs suspected of ISFMP activity. Cases can be supplemented with ground-truth structured data (labeled keywords) such as seller PII or case notes. First, AI-SNIPS scrapes and stores relevant images and text from the provided URLs without any user intervention. Salient features for predicting case similarity are extracted from the aggregated data using a combination of rule-based and machine-learning techniques and used to construct a seller network, with the nodes representing cases (sellers) and the edges representing the similarity between two sellers. Network analysis and community detection techniques are applied to extract seller clusters ranked by profitability and their potential to harm society. Lastly, AI-SNIPS provides interpretability by distilling common word/image similarities for each cluster into signature vectors. We validate the importance of AI-SNIPS's features for distinguishing large pharmaceutical affiliate networks from small ISFMP operations using an actual ISFMP lead sheet.
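The seller-network construction can be sketched in miniature: treat each case as a set of extracted features, connect cases whose similarity exceeds a threshold, and read clusters off as connected components. The Jaccard measure, threshold value, and feature sets below are illustrative assumptions; AI-SNIPS's actual similarity features and community detection are more elaborate:

```python
def jaccard(a, b):
    """Set overlap in [0, 1]; a stand-in for the learned similarity."""
    return len(a & b) / len(a | b) if a | b else 0.0

def seller_clusters(cases, threshold=0.3):
    """cases: dict case_id -> set of extracted features (keywords,
    image hashes, ...). Adds an edge when similarity clears the
    threshold, then returns connected components via union-find."""
    parent = {c: c for c in cases}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    ids = list(cases)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if jaccard(cases[a], cases[b]) >= threshold:
                parent[find(a)] = find(b)
    clusters = {}
    for c in ids:
        clusters.setdefault(find(c), set()).add(c)
    return list(clusters.values())
```

Ranking the resulting clusters by profitability and potential harm, as the platform does, would then operate on these components.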
Given a million-scale dataset of who-calls-whom data containing imperfect labels, how can we detect existing and new fraud patterns? We propose TgrApp, which extracts carefully designed features and provides visualizations to assist analysts in spotting fraudsters and suspicious behavior. Our TgrApp method has the following properties: (a) scalable, as it is linear in the input size; and (b) effective, as it allows natural interaction with human analysts and is applicable in both supervised and unsupervised settings.
In this paper, we propose Tutoring bot, a generative chatbot trained on a large-scale corpus of tutor-student conversations for English-language learning. To mimic a human tutor's behavior in language education, the tutor bot leverages diverse educational instructions and grounds each response in the current instruction, provided as additional input context. As a single instruction generally involves multiple dialogue turns to give the student sufficient speaking practice, the tutor bot must monitor the conversation and decide when the current instruction should be kept or switched to the next one. To that end, the tutor bot is trained via a multi-task learning scheme not only to generate responses but also to simultaneously infer its teaching action and progress in the current conversation. Our Tutoring bot is deployed under a non-commercial use license at https://tutoringai.com.
Machine learning prediction APIs offered by Google, Microsoft, Amazon, and many other providers have been continuously adopted in a plethora of applications, such as visual object detection, natural language comprehension, and speech recognition. Despite the importance of a systematic study and comparison of different APIs over time, this topic is currently under-explored because of the lack of data and user-friendly exploration tools. To address this issue, we present HAPI Explorer (History of API Explorer), an interactive system that offers easy access to millions of instances of commercial API applications collected over three years, prioritizes attention on user-defined instance regimes, and explains interesting patterns across different APIs, subpopulations, and time periods via visualizations and natural language. HAPI Explorer can facilitate further comprehension and exploitation of ML prediction APIs.
We present EasyRec, an easy-to-use, extendable and efficient recommendation framework for building industrial recommendation systems. Our EasyRec framework is superior in the following aspects: first, EasyRec adopts a modular and pluggable design pattern to reduce the effort required to build custom models; second, EasyRec implements hyper-parameter optimization and feature selection algorithms to improve model performance automatically; third, EasyRec applies online learning to adapt to the ever-changing data distribution. The code is released at https://github.com/alibaba/EasyRec.
We present edBB-Demo, a demonstrator of an AI-powered research platform for student monitoring in remote education. The edBB platform aims to study the challenges associated with user recognition and behavior understanding in digital platforms. The platform has been developed for data collection, acquiring signals from a variety of sensors including keyboard, mouse, webcam, microphone, smartwatch, and an electroencephalography (EEG) band. The information captured from the sensors during student sessions is modelled in a multimodal learning framework. The demonstrator includes: i) biometric user authentication in an unsupervised environment; ii) human action recognition based on remote video analysis; iii) heart rate estimation from webcam video; and iv) attention level estimation from facial expression analysis.
Drone-based terrorist attacks are increasing daily, and it is likely only a matter of time before drones are used to carry out terror attacks in urban areas. We have developed the DUCK multi-agent testbed, which security agencies can use to simulate drone-based attacks by diverse actors and to develop a combination of surveillance camera, drone, and cyber defenses against them.
This is a demonstration of our newly released Python package NL2LTL which leverages the latest in natural language understanding (NLU) and large language models (LLMs) to translate natural language instructions to linear temporal logic (LTL) formulas. This allows direct translation to formal languages that a reasoning system can use, while at the same time, allowing the end-user to provide inputs in natural language without having to understand any details of an underlying formal language. The package comes with support for a set of default LTL patterns, corresponding to popular DECLARE templates, but is also fully extensible to new formulas and user inputs. The package is open-source and is free to use for the AI community under the MIT license. Open Source: https://github.com/IBM/nl2ltl. Video Link: https://bit.ly/3dHW5b1
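NL2LTL's real engines build on NLU models and LLMs; the toy matcher below only sketches the underlying idea of mapping natural-language phrasings to LTL formulas. The phrase patterns and formula shapes are illustrative assumptions and are not the package's API:

```python
import re

# Toy sketch: a few DECLARE-flavoured phrasings mapped to LTL.
# Real NL2LTL replaces this table with learned NLU/LLM engines.
PATTERNS = [
    (re.compile(r"always (\w+)"), lambda m: f"G({m.group(1)})"),
    (re.compile(r"eventually (\w+)"), lambda m: f"F({m.group(1)})"),
    (re.compile(r"(\w+) until (\w+)"),
     lambda m: f"({m.group(1)}) U ({m.group(2)})"),
    (re.compile(r"if (\w+) then eventually (\w+)"),
     lambda m: f"G(({m.group(1)}) -> F({m.group(2)}))"),
]

def to_ltl(utterance):
    """Return an LTL formula string for a recognised phrasing, else None."""
    text = utterance.lower().strip()
    for pattern, build in PATTERNS:
        m = pattern.fullmatch(text)
        if m:
            return build(m)
    return None
```

The value of the pattern-based design is that a downstream reasoner always receives a well-formed formula, while users never touch the formal syntax.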
Political debates are one of the most salient moments of an election campaign, where candidates are challenged to discuss the main contemporary and historical issues in a country. These debates represent a natural ground for argumentative analysis, which has always been employed to investigate political discourse structure and strategy in philosophy and linguistics. In this paper, we present DISPUTool 2.0, an automated tool that relies on Argument Mining methods to analyse political debates from US presidential campaigns, extracting argument components (i.e., premises and claims) and relations (i.e., support and attack), and highlighting fallacious arguments. DISPUTool 2.0 also allows for the automatic analysis of a piece of debate provided by the user, identifying and classifying the arguments contained in the text. A REST API is provided to exploit the tool's functionalities.
Human guides in museums and galleries are professionally trained to stimulate informal learning in visitors by asking low-risk, open-ended reflective questions that enable them to focus on specific features of artifacts, relate to prior experiences, and elicit curiosity as well as further thought. We present ArtMuse, our AI-powered chatbot for asking reflective questions in the context of paintings. Our reflective question generation model in ArtMuse was trained by applying a novel combination of existing models for extractive question answering and open-domain chitchat. User evaluation studies indicate that we are able to generate fluent and specific reflective questions for paintings that are highly engaging.
We propose an AI-based pilot trainer to help students learn how to fly aircraft. First, an AI agent uses behavioral cloning to learn flying maneuvers from qualified flight instructors. Later, the system uses the agent's decisions to detect errors made by students and provide feedback to help students correct their errors. This paper presents an instantiation of the pilot trainer. We focus on teaching straight and level flying maneuvers by automatically providing formative feedback to the human student.
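Behavioral cloning at its core is supervised learning on instructor (state, action) pairs. The sketch below reduces it to a nearest-neighbour lookup and flags the time steps where a student's action departs from the cloned policy; the state features, action labels, and thresholds are invented for illustration and are far simpler than the trainer's actual models:

```python
def clone_policy(demonstrations):
    """Minimal behavioral cloning: store instructor (state, action)
    pairs and act via nearest-neighbour lookup. States are feature
    tuples (e.g. hypothetical pitch/roll readings)."""
    def policy(state):
        def dist(s):
            return sum((a - b) ** 2 for a, b in zip(s, state))
        _, action = min(demonstrations, key=lambda pair: dist(pair[0]))
        return action
    return policy

def flag_errors(policy, student_trace):
    """Return indices of steps where the student's action disagrees
    with the cloned instructor policy -- the hook for feedback."""
    return [t for t, (state, action) in enumerate(student_trace)
            if policy(state) != action]
```

In the demonstrated system this comparison drives formative feedback, e.g. during straight-and-level flying; a production model would of course generalise beyond memorised demonstrations.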
The Sudoku Assistant app is an AI assistant that uses a combination of machine learning and constraint programming techniques to interpret and explain a pen-and-paper Sudoku scanned with a smartphone. Although the demo is about Sudoku, the underlying techniques are equally applicable to other constraint solving problems like timetabling, scheduling, and vehicle routing.
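The constraint-programming side can be illustrated with the simplest propagation rule a Sudoku explainer might surface ("naked singles": a cell whose candidates shrink to one value). This is a minimal sketch of constraint propagation in general, not the Sudoku Assistant's actual solver:

```python
def peers(r, c):
    """Cells sharing a row, column, or 3x3 box with (r, c)."""
    box_r, box_c = 3 * (r // 3), 3 * (c // 3)
    ps = {(r, j) for j in range(9)} | {(i, c) for i in range(9)}
    ps |= {(box_r + i, box_c + j) for i in range(3) for j in range(3)}
    ps.discard((r, c))
    return ps

def propagate(grid):
    """Repeatedly fill cells whose candidate set shrinks to one value.
    grid is a 9x9 list of ints, 0 = empty; modified in place."""
    changed = True
    while changed:
        changed = False
        for r in range(9):
            for c in range(9):
                if grid[r][c] == 0:
                    cands = set(range(1, 10)) - {grid[i][j]
                                                 for i, j in peers(r, c)}
                    if len(cands) == 1:
                        grid[r][c] = cands.pop()
                        changed = True
    return grid
```

Each such deduction is explainable by construction ("this cell must be 9 because 1-8 already appear among its peers"), which is the property the app exploits when explaining scanned puzzles.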
DataFlow has emerged as a new paradigm for building task-oriented chatbots due to its expressive semantic representations of dialogue tasks. Despite the availability of a large dataset, SMCalFlow, and a simplified syntax, the development and evaluation of DataFlow-based chatbots remain challenging due to the system complexity and the lack of downstream toolchains. In this demonstration, we present DFEE, an interactive DataFlow Execution and Evaluation toolkit that supports execution, visualization and benchmarking of semantic parsers given dialogue input and a backend database. We demonstrate the system via a complex dialog task: event scheduling involving temporal reasoning. It also supports diagnosing parsing results via a friendly interface that allows developers to examine the dynamic DataFlow and the corresponding execution results. To illustrate how to benchmark SoTA models, we propose a novel benchmark that covers more sophisticated event scheduling scenarios and a new metric for task success evaluation. The code of DFEE has been released at https://github.com/amazonscience/dataflow-evaluation-toolkit.
With the advancement of deep learning technology, neural networks have demonstrated their excellent ability to provide accurate predictions in many tasks. However, a model that is poorly calibrated will not gain trust from humans, even if it is highly accurate. In this regard, the gap between the confidence of the model's predictions and the actual correctness likelihood must be bridged to derive a well-calibrated model. In this paper, we introduce the Neural Clamping Toolkit, the first open-source framework designed to help developers employ state-of-the-art model-agnostic calibration methods. Furthermore, we provide animations and interactive sections in the demonstration to familiarize researchers with calibration in neural networks. A Colab tutorial on utilizing our toolkit is also introduced.
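The confidence-correctness gap the toolkit targets is commonly quantified by the Expected Calibration Error (ECE). As a stdlib-only illustration of the metric such toolkits report (Neural Clamping itself additionally learns input and output calibration parameters), an ECE sketch looks like:

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: average |accuracy - confidence| over equal-width confidence
    bins, weighted by bin size. A well-calibrated model has low ECE."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # clamp conf == 1.0
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            accuracy = sum(1 for _, ok in b if ok) / len(b)
            ece += (len(b) / n) * abs(accuracy - avg_conf)
    return ece
```

For example, a model that predicts with 95% confidence but is right only 75% of the time is overconfident, and its ECE reflects the 0.2 gap; calibration methods shrink exactly this quantity.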
Couples often encounter the challenge of sharing house chores. This raises the fundamental question of how to divide chores. In this paper, we present a new application for the fair division of household chores. Our platform, called Kajibuntan, allows couples to specify the set of chores to be shared, their preferences over them, and the current allocation. Our tool visualizes the current allocation and makes proposals according to their preferences based on the theory of fair division. The goal of our tool is to provide a systematic and transparent system to divide household chores and help create harmony in the home.
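One simple mechanism from fair division theory (a sketch only; Kajibuntan's actual proposal rule may differ) is round-robin picking: the partners alternately take the remaining chore they dislike least, which for two agents with additive disutilities yields an allocation that is envy-free up to one chore. The chore names and cost values below are invented for illustration:

```python
def round_robin_chores(cost):
    """Two partners alternately take the remaining chore they dislike
    least. cost[agent][chore] = disutility. Returns {agent: [chores]}."""
    remaining = set(cost[0])
    allocation = {0: [], 1: []}
    turn = 0
    while remaining:
        pick = min(remaining, key=lambda ch: cost[turn][ch])
        allocation[turn].append(pick)
        remaining.remove(pick)
        turn = 1 - turn  # alternate picks
    return allocation
```

Because each partner always grabs their least-disliked remaining chore, neither can envy the other's bundle by more than the single worst chore they hold, a standard EF1 guarantee for round-robin with additive preferences.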
Safe and efficient maritime navigation is fundamental for autonomous surface vehicles to support many applications in the blue economy, including cargo transportation that covers 90% of the global marine industry. We developed MARCOL, a collision avoidance decision-making framework that provides safe, efficient, and explainable collision avoidance strategies and that allows for repeated experiments under diverse high-traffic scenarios.
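A building block of any maritime collision-avoidance framework (a generic sketch, not MARCOL's decision-making logic) is the closest point of approach (CPA) between two vessels assumed to move with constant velocity; a small CPA distance within a short time horizon signals a risk that must be resolved:

```python
def closest_point_of_approach(p1, v1, p2, v2):
    """Time and distance of closest approach for two vessels with
    constant velocity; positions/velocities are (x, y) tuples.
    Time is clamped to >= 0 (the past is irrelevant)."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    dvx, dvy = v2[0] - v1[0], v2[1] - v1[1]
    dv2 = dvx * dvx + dvy * dvy
    # If relative velocity is zero the separation never changes.
    t = 0.0 if dv2 == 0 else max(0.0, -(dx * dvx + dy * dvy) / dv2)
    cx, cy = dx + dvx * t, dy + dvy * t
    return t, (cx * cx + cy * cy) ** 0.5
```

A decision-making layer like MARCOL's would evaluate candidate maneuvers by how much they increase the CPA distance in high-traffic scenarios, which also makes the chosen avoidance strategy easy to explain.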
In this work, we propose a fast-convergence track net, or FC-TrackNet, based on a synthetic data-driven approach to maintaining long-term 6D pose tracking. Comparison experiments are performed on two different datasets. The results demonstrate that our approach achieves a consistent tracking frequency of 90.9 Hz as well as higher accuracy than state-of-the-art approaches.
Improving model robustness against potential modality noise, as an essential step for adapting multimodal models to real-world applications, has received increasing attention among researchers. For Multimodal Sentiment Analysis (MSA), there is also a debate on whether multimodal models are more robust to noisy features than unimodal ones. To provide intuitive illustrations and in-depth analysis of these concerns, we present Robust-MSA, an interactive platform that visualizes the impact of modality noise, along with simple defence methods, to help researchers better understand how their models perform on imperfect real-world data.
Recent machine reading comprehension datasets include extractive and boolean questions, but current approaches do not offer integrated support for answering both question types. We present a front-end demo of a multilingual machine reading comprehension system that handles boolean and extractive questions. For boolean questions, it provides a yes/no answer and highlights the supporting evidence; for extractive questions, it provides an answer and highlights it in the passage. Our system, GAAMA 2.0, achieved first place on the TyDi QA leaderboard at the time of submission. We contrast two different implementations of our approach: one that includes multiple transformer models for easy deployment, and one that shares a single transformer model, using adapters to reduce the GPU memory footprint in resource-constrained environments.
Recent studies have highlighted that private instant messaging platforms and channels are major media of cyber aggression, especially among teens. Due to the private nature of the verbal exchanges on these media, few studies have addressed the task of hate speech detection in this context. Moreover, the recent release of resources mimicking online aggression situations that may occur among teens on private instant messaging platforms is encouraging the development of solutions for dealing with diversity in digital harassment. In this study, we present BiRDy: a fully Web-based platform performing participant role detection in multi-party chats. Leveraging the pre-trained language model mBERT (multilingual BERT), we release fine-tuned models relying on various contextual window strategies to classify exchanged messages according to their authors' roles of involvement in cyberbullying. Integrating a role scoring function, the proposed pipeline predicts a unique role for each chat participant. In addition, detailed confidence scores are displayed. Currently, BiRDy publicly releases models for French and Italian.
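The two pipeline ideas, contextual windows for per-message classification and aggregation into one role per participant, can be sketched as follows. The window size, `[SEP]` separator, and majority-vote aggregation are illustrative assumptions standing in for BiRDy's actual strategies and role scoring function:

```python
from collections import Counter

def build_examples(messages, window=2):
    """For each (author, text) message, prepend the previous `window`
    messages as context, joined with [SEP], mimicking a contextual
    window strategy for fine-tuning a classifier such as mBERT."""
    examples = []
    for i, (author, text) in enumerate(messages):
        ctx = [t for _, t in messages[max(0, i - window):i]]
        examples.append((" [SEP] ".join(ctx + [text]), author))
    return examples

def participant_role(message_roles):
    """Aggregate per-message role predictions into one role per author.
    A simple majority vote stands in for the actual scoring function."""
    by_author = {}
    for author, role in message_roles:
        by_author.setdefault(author, []).append(role)
    return {a: Counter(rs).most_common(1)[0][0]
            for a, rs in by_author.items()}
```

The context window lets the classifier see the conversational run-up to each message, which matters for roles like "defender" that are only recognisable relative to prior aggression.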
This demo paper discusses a scalable platform for emerging data-driven AI applications targeted toward predictive maintenance solutions. We propose a common AI software architecture stack for building diverse AI applications, such as anomaly detection, failure pattern analysis, and asset health forecasting, for more than 100K industrial assets of a similar class. As part of the AI system demonstration, we have identified the following three key topics for discussion: scaling model training across multiple assets; jointly executing multiple AI applications; and bridging the gap between current open-source software tools and the emerging needs of AI applications. To demonstrate the benefits, the AI Model Factory has been tested to build models for various industrial assets such as wind turbines and oil wells. The system is deployed on API Hub for demonstration.