AAAI.2019 - IAAI

Total: 41

#1 A Genetic Algorithm for Finding a Small and Diverse Set of Recent News Stories on a Given Subject: How We Generate AAAI’s AI-Alert

Authors: Joshua Eckroth ; Eric Schoen

This paper describes the genetic algorithm used to select news stories about artificial intelligence for AAAI’s weekly AI-Alert, emailed to nearly 11,000 subscribers. Each week, about 1,500 news stories covering various aspects of artificial intelligence and machine learning are discovered by i2k Connect’s NewsFinder agent. Our challenge is to select just 10 stories from this collection that represent the important news about AI. Since stories and topics do not necessarily repeat in later weeks, we cannot use click tracking and supervised learning to predict which stories or topics readers prefer most. Instead, we must build a representative selection of stories a priori, using information about each story’s topics, content, publisher, date of publication, and other features. This paper describes a genetic algorithm that achieves this task. We demonstrate its effectiveness by comparing several engagement metrics from six months of “A/B testing” experiments that compare random story selection vs. a simple scoring algorithm vs. our new genetic algorithm.
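To make the selection task concrete, here is a minimal sketch of a genetic algorithm over fixed-size story subsets; the fitness function, crossover, and mutation details below are illustrative assumptions, not the authors' actual design.

```python
import random

def fitness(subset, scores, topics):
    """Assumed objective: reward high-scoring stories and distinct topic coverage."""
    quality = sum(scores[i] for i in subset)
    diversity = len({topics[i] for i in subset})
    return quality + diversity

def repair(genes, n, k):
    """Force a chromosome to hold exactly k distinct story indices."""
    genes = set(genes)
    while len(genes) < k:
        genes.add(random.randrange(n))
    return random.sample(sorted(genes), k)

def select_stories(scores, topics, k=10, pop_size=50, generations=200, mut_rate=0.2):
    n = len(scores)
    pop = [random.sample(range(n), k) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda s: fitness(s, scores, topics), reverse=True)
        survivors = pop[: pop_size // 2]          # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            child = repair(a[: k // 2] + b[k // 2:], n, k)   # one-point crossover
            if random.random() < mut_rate:                    # point mutation
                child[random.randrange(k)] = random.randrange(n)
                child = repair(child, n, k)
            children.append(child)
        pop = survivors + children
    return max(pop, key=lambda s: fitness(s, scores, topics))
```

The chromosome here is simply a set of 10 story indices, which keeps crossover and mutation cheap; the repair step preserves the fixed subset size after both operations.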

#2 Large Scale Personalized Categorization of Financial Transactions

Authors: Christopher Lesner ; Alexander Ran ; Marko Rukonic ; Wei Wang

A major part of financial accounting involves tracking and organizing business transactions month after month, so automating this task is of significant value to the users of accounting software. In this paper we present a large-scale recommendation system that successfully recommends company-specific categories for several million small businesses in the US, UK, Australia, Canada, India and France and handles billions of financial transactions each year. Our system uses machine learning to combine fragments of information from millions of users in a manner that allows us to accurately recommend user-specific Chart of Accounts categories. Accounts are handled even if named using abbreviations or in a foreign language. Transactions are handled even if a given user has never categorized a transaction like that before. The development of such a system, and testing it at scale over billions of transactions, is a first in the financial industry.

#3 Transforming Underwriting in the Life Insurance Industry

Authors: Marc Maier ; Hayley Carlotto ; Freddie Sanchez ; Sherriff Balogun ; Sears Merritt

Life insurance provides trillions of dollars of financial security for hundreds of millions of individuals and families worldwide. Life insurance companies must accurately assess individual-level mortality risk to simultaneously maintain financial strength and price their products competitively. The traditional underwriting process used to assess this risk is based on manually examining an applicant’s health, behavioral, and financial profile. The existence of large historical data sets provides an unprecedented opportunity for artificial intelligence and machine learning to transform underwriting in the life insurance industry. We present an overview of how a rich application data set and survival modeling were combined to develop a life score that has been deployed in an algorithmic underwriting system at MassMutual, an American mutual life insurance company serving millions of clients. Through a novel evaluation framework, we show that the life score outperforms traditional underwriting by 6% on the basis of claims. We describe how engagement with actuaries, medical doctors, underwriters, and reinsurers was paramount to building an algorithmic underwriting system with a predictive model at its core. Finally, we provide details of the deployed system and highlight its value, which includes saving millions of dollars in operational efficiency while driving the decisions behind tens of billions of dollars of benefits.

#4 Automated Dispatch of Helpdesk Email Tickets: Pushing the Limits with AI

Authors: Atri Mandal ; Nikhil Malhotra ; Shivali Agarwal ; Anupama Ray ; Giriprasad Sridhara

Ticket assignment/dispatch is a crucial part of the service delivery business, with a lot of scope for automation and optimization. In this paper, we present an end-to-end automated helpdesk email ticket assignment system, which is also offered as a service. The objective of the system is to determine the nature of the problem mentioned in an incoming email ticket and then automatically dispatch it to an appropriate resolver group (or team) for resolution. The proposed system uses an ensemble classifier augmented with a configurable rule engine. While designing an accurate classifier is one of the main challenges, the system must also be robust and adaptive to changing business needs. We discuss some of the main design challenges associated with email ticket assignment automation and how we solve them. The design decisions for our system are driven by high accuracy, coverage, business continuity, scalability and optimal usage of computational resources. Our system has been deployed in production for three major service providers, currently assigns over 90,000 emails per month on average with an accuracy close to 90%, and covers at least 90% of email tickets. This translates to achieving human-level accuracy and results in a net saving of more than 50,000 man-hours of effort per annum. To date, our deployed system has served more than 700,000 tickets in production.
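As a rough illustration of the described architecture (classifier ensemble plus configurable rule engine), the sketch below shows one plausible dispatch flow; the rule format, confidence threshold, and fallback behavior are assumptions, not the deployed system's design.

```python
# Hypothetical dispatch flow: rules can short-circuit the ensemble, and
# low-confidence predictions fall back to manual triage.
def dispatch(ticket_text, classifiers, rules, threshold=0.8):
    # Rule engine first, e.g. route tickets matching "password reset"
    # straight to the identity-management resolver group.
    for matches, resolver_group in rules:
        if matches(ticket_text):
            return resolver_group
    # Ensemble vote: average per-group probabilities over all classifiers.
    votes = {}
    for clf in classifiers:
        for group, prob in clf(ticket_text).items():
            votes[group] = votes.get(group, 0.0) + prob / len(classifiers)
    group, confidence = max(votes.items(), key=lambda kv: kv[1])
    return group if confidence >= threshold else "MANUAL_TRIAGE"
```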

#5 Grading Uncompilable Programs

Authors: Rohit Takhar ; Varun Aggarwal

Evaluators wish to test candidates on their ability to propose the correct algorithmic approach to solve programming problems. Recently, several automated systems for grading programs have been proposed, but none of them address uncompilable code. We present the first approach to grade uncompilable code and provide semantic feedback on it using machine learning. We propose two methods that allow us to derive informative semantic features from programs. One approach makes the program compilable by correcting errors, while the other relaxes syntax/grammar rules to help parse uncompilable code. We compare the relative efficacy of these approaches for grading. Finally, we combine them to build an algorithm that rivals the accuracy of experts in grading programs. Additionally, we show that the models learned for compilable code can be reused for uncompilable code. We present case studies where companies are able to hire more efficiently by deploying our technology.

#6 Remote Management of Boundary Protection Devices with Information Restrictions

Authors: Aaron Adler ; Peter Samouelian ; Michael Atighetchi ; Yat Fu

Boundary Protection Devices (BPDs) are used by US Government mission partners to regulate the flow of information across networks of differing security levels. BPDs provide several critical functions, including preventing unauthorized sharing, sanitizing information, and preventing cyber attacks. Their application in national security and critical infrastructure environments (e.g., military missions, nuclear power plants, clean water distribution systems) calls for a comprehensive load monitoring system that provides resilience and scalability, as well as an automated and vendor neutral configuration management system that can efficiently respond to security threats at machine speed. Their design as one-way traffic control systems, however, presents challenges for dynamic load adaptation techniques that require access to application server performance metrics across network boundaries. Moreover, the structured review and approval process that regulates their configuration and use presents two significant challenges: (1) Adaptation techniques that alter the configuration of BPDs must be predictable, understandable, and pre-approved by administrators, and (2) Software can be installed on BPDs only after completing a stringent accreditation process. These challenges often lead to manual configuration management practices, which are inefficient or ineffective in many cases. The Hammerhead prototype, developed as part of the SHARC project, addresses these challenges using knowledge representation, a rule-oriented adaptation bundle format, and an extensible, open-source constraint solver.

#7 Linking Educational Resources on Data Science

Authors: José Luis Ambite ; Jonathan Gordon ; Lily Fierro ; Gully Burns ; Joel Mathew

The availability of massive datasets in genetics, neuroimaging, mobile health, and other subfields of biology and medicine promises new insights but also poses significant challenges. To realize the potential of big data in biomedicine, the National Institutes of Health launched the Big Data to Knowledge (BD2K) initiative, funding several centers of excellence in biomedical data analysis and a Training Coordinating Center (TCC) tasked with facilitating online and in-person training of biomedical researchers in data science. A major initiative of the BD2K TCC is to automatically identify, describe, and organize data science training resources available on the Web and provide personalized training paths for users. In this paper, we describe the construction of ERuDIte, the Educational Resource Discovery Index for Data Science, and its release as linked data. ERuDIte contains over 11,000 training resources including courses, video tutorials, conference talks, and other materials. The metadata for these resources is described uniformly using Schema.org. We use machine learning techniques to tag each resource with concepts from the Data Science Education Ontology, which we developed to further describe resource content. Finally, we map references to people and organizations in learning resources to entities in DBpedia, DBLP, and ORCID, embedding our collection in the web of linked data. We hope that ERuDIte will provide a framework to foster open linked educational resources on the Web.

#8 Early-Stopping of Scattering Pattern Observation with Bayesian Modeling

Authors: Akinori Asahara ; Hidekazu Morita ; Chiharu Mitsumata ; Kanta Ono ; Masao Yano ; Tetsuya Shoji

This paper describes a new machine-learning application to speed up small-angle neutron scattering (SANS) experiments, along with its underlying method based on probabilistic modeling. SANS is a scattering experiment for observing the microstructure of materials; its measurements are two-dimensional patterns on a plane (SANS patterns). Obtaining accurate experimental results takes a long time because the SANS pattern is a histogram of detected neutrons. To shorten the measurement time, we propose an early-stopping method based on Gaussian mixture modeling with a prior generated from B-spline regression results. An experiment using actual SANS data was carried out to examine the accuracy of the method. The results confirmed that the accuracy of the proposed method converged 4 minutes after the start of the experiment (a normal SANS measurement takes about 20 minutes).
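The early-stopping idea can be sketched as follows: keep refitting the mixture model to the accumulating detector hits and stop once successive parameter estimates agree. This is a simplified stand-in for the paper's Bayesian model (the B-spline prior is omitted), and the tolerance and component count are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def early_stop_measurement(event_chunks, n_components=2, tol=1e-3):
    """event_chunks yields (n, 2) arrays of detector hit positions over time."""
    seen, prev = [], None
    for chunk in event_chunks:
        seen.append(chunk)
        data = np.vstack(seen)
        gmm = GaussianMixture(n_components=n_components, random_state=0).fit(data)
        order = np.argsort(gmm.means_[:, 0])      # stable component ordering
        means = gmm.means_[order]
        if prev is not None and np.abs(means - prev).max() < tol:
            return gmm, len(data)                  # estimates converged: stop early
        prev = means
    return gmm, len(data)                          # measurement ran to completion
```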

#9 Querying NoSQL with Deep Learning to Answer Natural Language Questions

Authors: Sebastian Blank ; Florian Wilhelm ; Hans-Peter Zorn ; Achim Rettinger

Almost all of today’s knowledge is stored in databases and thus can only be accessed with the help of domain-specific query languages, strongly limiting the number of people who can access the data. In this work, we demonstrate an end-to-end trainable question answering (QA) system that allows a user to query an external NoSQL database using natural language. A major challenge of such a system is the non-differentiability of database operations, which we overcome by applying policy-based reinforcement learning. We evaluate our approach on Facebook’s bAbI Movie Dialog dataset and achieve a competitive score of 84.2% compared to several benchmark models. We conclude that our approach excels in real-world scenarios where knowledge resides in external databases and intermediate labels are too costly to gather for non-end-to-end trainable QA systems.
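The key trick, overcoming non-differentiable database operations with policy-based reinforcement learning, can be sketched with a REINFORCE update; the tiny policy network and binary reward below are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class QueryPolicy(nn.Module):
    """Toy policy: encode the question, score a discrete set of query ops."""
    def __init__(self, vocab_size, n_ops, hidden=64):
        super().__init__()
        self.encoder = nn.EmbeddingBag(vocab_size, hidden)
        self.head = nn.Linear(hidden, n_ops)

    def forward(self, token_ids):                  # token_ids: (1, seq_len)
        return torch.log_softmax(self.head(self.encoder(token_ids)), dim=-1)

def reinforce_step(policy, optimizer, token_ids, execute_query, gold_answer):
    log_probs = policy(token_ids)
    action = torch.multinomial(log_probs.exp(), 1).item()  # sample a query op
    answer = execute_query(action)                 # non-differentiable DB call
    reward = 1.0 if answer == gold_answer else 0.0
    loss = -log_probs[0, action] * reward          # REINFORCE policy gradient
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward
```

Because the gradient flows only through the log-probability of the sampled action, the database itself never needs to be differentiable.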

#10 Probabilistic-Logic Bots for Efficient Evaluation of Business Rules Using Conversational Interfaces

Authors: Joseph Bockhorst ; Devin Conathan ; Glenn M Fung

We present an approach for designing conversational interfaces (chatbots) that users interact with to determine whether or not a business rule applies in a context possessing uncertainty (from the point of view of the chatbot) as to the value of input facts. Our approach relies on Bayesian network models that bring together a business rule’s logical, deterministic aspects with its probabilistic components in a common framework. Our probabilistic-logic bots (PL-bots) evaluate business rules by iteratively prompting users to provide the values of unknown facts. The order in which facts are solicited is dynamic, depends on known facts, and is chosen using mutual information as a heuristic so as to minimize the number of interactions with the user. We have created a web-based content creation and editing tool that enables subject matter experts to quickly create and validate PL-bots with minimal training and without requiring a deep understanding of logic or probability. To date, domain experts at a well-known insurance company have successfully created and deployed over 80 PL-bots to help insurance agents determine customer eligibility for policy discounts and endorsements.
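The fact-solicitation heuristic can be illustrated as follows: given a joint distribution over facts and the rule outcome (which a Bayesian network would supply), ask next about the unknown fact with the highest mutual information with the outcome. The enumerated-joint representation here is a simplification of the authors' Bayesian network models.

```python
import math
from collections import defaultdict

def mutual_information(joint, fact):
    """joint: dict mapping (fact_assignment_tuple, outcome) -> probability;
    fact: index of the unknown fact within the assignment tuple."""
    p_f, p_r, p_fr = defaultdict(float), defaultdict(float), defaultdict(float)
    for (assignment, outcome), p in joint.items():
        f = assignment[fact]
        p_f[f] += p
        p_r[outcome] += p
        p_fr[(f, outcome)] += p
    return sum(p * math.log(p / (p_f[f] * p_r[r]))
               for (f, r), p in p_fr.items() if p > 0)

def next_question(joint, unknown_facts):
    """Ask about the fact most informative about the rule outcome."""
    return max(unknown_facts, key=lambda f: mutual_information(joint, f))
```

After the user answers, the joint distribution is conditioned on the reply and the heuristic is re-run, so the question order adapts to the known facts exactly as the abstract describes.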

#11 Anomaly Detection Using Autoencoders in High Performance Computing Systems

Authors: Andrea Borghesi ; Andrea Bartolini ; Michele Lombardi ; Michela Milano ; Luca Benini

Anomaly detection in supercomputers is a very difficult problem due to the large scale of the systems and the high number of components. The current state of the art for automated anomaly detection employs machine learning methods or statistical regression models in a supervised fashion, meaning that the detection tool is trained to distinguish among a fixed set of behaviour classes (healthy and unhealthy states). We propose a novel approach for anomaly detection in High Performance Computing systems based on a (deep) machine learning technique, namely a type of neural network called an autoencoder. The key idea is to train a set of autoencoders to learn the normal (healthy) behaviour of the supercomputer nodes and, after training, use them to identify abnormal conditions. This differs from previous approaches, which were based on learning the abnormal condition, for which there are much smaller datasets (since it is very hard to identify them to begin with). We test our approach on a real supercomputer equipped with a fine-grained, scalable monitoring infrastructure that can provide large amounts of data to characterize the system behaviour. The results are extremely promising: after the training phase to learn the normal system behaviour, our method is capable of detecting anomalies that have never been seen before with very good accuracy (values ranging between 88% and 96%).
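A minimal sketch of the core idea, assuming a simple fully connected autoencoder and a quantile-based threshold (both assumptions; the paper's architecture and thresholding may differ):

```python
import torch
import torch.nn as nn

def train_autoencoder(healthy, n_features, epochs=50, lr=1e-3):
    """Train only on healthy node telemetry so reconstruction stays low there."""
    model = nn.Sequential(
        nn.Linear(n_features, 32), nn.ReLU(),
        nn.Linear(32, 8), nn.ReLU(),          # compressed "healthy" code
        nn.Linear(8, 32), nn.ReLU(),
        nn.Linear(32, n_features),
    )
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        loss = nn.functional.mse_loss(model(healthy), healthy)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

def detect_anomalies(model, samples, healthy, quantile=0.99):
    """Flag samples whose reconstruction error exceeds what healthy data shows."""
    with torch.no_grad():
        baseline = ((model(healthy) - healthy) ** 2).mean(dim=1)
        threshold = baseline.quantile(quantile)
        errors = ((model(samples) - samples) ** 2).mean(dim=1)
    return errors > threshold                 # True = anomalous sample
```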

#12 A Fast Machine Learning Workflow for Rapid Phenotype Prediction from Whole Shotgun Metagenomes

Authors: Anna Paola Carrieri ; Will PM Rowe ; Martyn Winn ; Edward O. Pyzer-Knapp

Research on the microbiome is an emerging and crucial science that finds many applications in healthcare, food safety, precision agriculture and environmental studies. Huge amounts of DNA from microbial communities are being sequenced and analyzed by scientists interested in extracting meaningful biological information from this big data. Analyzing massive microbiome sequencing datasets, which embed the functions and interactions of thousands of different bacterial, fungal and viral species, is a significant computational challenge. Artificial intelligence has the potential for building predictive models that can provide insights for specific cutting edge applications such as guiding diagnostics and developing personalised treatments, as well as maintaining soil health and fertility. Current machine learning workflows that predict traits of host organisms from their commensal microbiome do not take into account the whole genetic material constituting the microbiome, instead basing the analysis on specific marker genes. In this paper we introduce, to the best of our knowledge, the first machine learning workflow that efficiently performs host phenotype prediction from whole shotgun metagenomes by computing similarity-preserving compact representations of the genetic material. Our workflow enables prediction tasks, such as classification and regression, over terabytes of raw sequencing data without requiring any pre-processing through expensive bioinformatics pipelines. We compare the performance in terms of time, accuracy and uncertainty of predictions for four different classifiers. More precisely, we demonstrate that our ML workflow can efficiently classify real data with high accuracy, using examples from dog and human metagenomic studies, representing a step forward towards real-time diagnostics and a potential for cloud applications.
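One common way to build similarity-preserving compact representations of raw reads is a MinHash signature over k-mers; the sketch below illustrates that general idea, not the workflow's actual sketching method, and k, signature size, and the XOR-seeded hash family are all assumptions.

```python
import hashlib

def kmers(read, k=21):
    """Yield all length-k substrings of a read."""
    return (read[i:i + k] for i in range(len(read) - k + 1))

def minhash_signature(reads, k=21, size=128):
    """One fixed-length, similarity-preserving vector per metagenome sample."""
    mins = [1 << 64] * size
    for read in reads:
        for kmer in kmers(read, k):
            h = int.from_bytes(
                hashlib.blake2b(kmer.encode(), digest_size=8).digest(), "big")
            for s in range(size):
                # XOR-seeded hash family: one perturbed hash per signature slot.
                hv = (h ^ (s * 0x9E3779B97F4A7C15)) & 0xFFFFFFFFFFFFFFFF
                if hv < mins[s]:
                    mins[s] = hv
    return mins
```

Two samples' signatures estimate the Jaccard similarity of their k-mer sets via the fraction of matching positions, which is what makes them directly usable as classifier features without an assembly or annotation pipeline.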

#13 Expert Guided Rule Based Prioritization of Scientifically Relevant Images for Downlinking over Limited Bandwidth from Planetary Orbiters

Authors: Srija Chakraborty ; Subhasish Das ; Ayan Banerjee ; Sandeep K. S. Gupta ; Philip R. Christensen

Instruments onboard spacecraft acquire large amounts of data that must be transmitted over a very low bandwidth. Consequently, for some missions, the volume of data collected greatly exceeds the volume that can be downlinked before the next orbit. This necessitates an intelligent, autonomous decision-making module that maximizes the return of the most scientifically relevant data over the low bandwidth for experts to analyze further. We propose an iterative rule-based approach, guided by expert knowledge, to represent scientifically interesting geological landforms with respect to expert-selected attributes. The rules are used to assign each test instance a priority based on how novel it is with respect to its rule. High-priority instances from the test set are used to iteratively update the learned rules. We then evaluate the effectiveness of the proposed approach on images acquired by a Mars orbiter and observe an expert-acceptable prioritization order generated by the rules that can potentially increase the return of scientifically relevant observations.

#14 DeBGUer: A Tool for Bug Prediction and Diagnosis

Authors: Amir Elmishali ; Roni Stern ; Meir Kalech

In this paper, we present the DeBGUer tool, a web-based tool for prediction and isolation of software bugs. DeBGUer is a partial implementation of the Learn, Diagnose, and Plan (LDP) paradigm, which is a recently introduced paradigm for integrating Artificial Intelligence (AI) in the software bug detection and correction process. In LDP, a diagnosis (DX) algorithm is used to suggest possible explanations – diagnoses – for an observed bug. If needed, a test planning algorithm is subsequently used to suggest further testing. Both diagnosis and test planning algorithms consider a fault prediction model, which associates each software component (e.g., class or method) with the likelihood that it contains a bug. DeBGUer implements the first two components of LDP, bug prediction (Learn) and bug diagnosis (Diagnose). It provides an easy-to-use web interface, and has been successfully tested on 12 projects.

#15 Satellite Detection of Moving Vessels in Marine Environments

Authors: Natalie Fridman ; Doron Amir ; Yinon Douchan ; Noa Agmon

There is a growing need for coverage of large maritime areas, mainly in the exclusive economic zone (EEZ). Due to the difficulty of accessing such large areas, satellite-based sensors are the most efficient and cost-effective way to perform this task. Vessel behavior prediction is a necessary capability for detecting moving vessels with satellite imagery. In this paper we present an algorithm for selecting the best satellite observation window to detect a moving object. First, we describe a model for vessel behavior prediction and compare its performance to two baseline models. We use real marine traffic data (AIS) to compare their ability to predict vessel behavior in a time frame of 1–24 hours. Then, we present KINGFISHER, a maritime intelligence system that uses our algorithm to track suspected vessels with satellite sensors. We also present the results of the algorithm in operational scenarios of KINGFISHER.

#16 Automatic Generation of Chinese Short Product Titles for Mobile Display

Authors: Yu Gong ; Xusheng Luo ; Kenny Q. Zhu ; Wenwu Ou ; Zhao Li ; Lu Duan

This paper studies the problem of automatically extracting a short title from a manually written longer description of e-commerce products for display on mobile devices. It is a new extractive summarization problem on short text inputs, for which we propose a feature-enriched network model combining three different categories of features in parallel. Experimental results show that our framework outperforms several baselines by a substantial gain of 4.5%. Moreover, we produce an extractive summarization dataset for e-commerce short texts and will release it to the research community.

#17 Logistic Regression on Homomorphic Encrypted Data at Scale

Authors: Kyoohyung Han ; Seungwan Hong ; Jung Hee Cheon ; Daejun Park

Machine learning on homomorphically encrypted data is a cryptographic method for analyzing private and/or sensitive data while preserving privacy. In the training phase, it takes encrypted training data as input and outputs an encrypted model without ever decrypting. In the prediction phase, it uses the encrypted model to predict results on new encrypted data. In each phase, no decryption key is needed, and thus data privacy is ultimately guaranteed. Such methods have many applications in areas with sensitive private data, such as finance, education, genomics, and medicine. While several studies have been reported on the prediction phase, few have addressed the training phase. In this paper, we present an efficient algorithm for logistic regression on homomorphically encrypted data, and evaluate our algorithm on real financial data consisting of 422,108 samples over 200 features. Our experiment shows that an encrypted model with a sufficient Kolmogorov-Smirnov statistic value can be obtained in ∼17 hours on a single machine. We also evaluate our algorithm on the public MNIST dataset, where it takes ∼2 hours to learn an encrypted model with 96.4% accuracy. Considering the inefficiency of homomorphic encryption, our result is encouraging and demonstrates, to the best of our knowledge for the first time, the practical feasibility of logistic regression training on large encrypted data.
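Because homomorphic encryption schemes natively support only additions and multiplications, logistic regression training under HE typically replaces the sigmoid with a low-degree polynomial. The plaintext sketch below shows that arithmetic; the polynomial coefficients are illustrative assumptions, not the paper's approximation.

```python
import numpy as np

def poly_sigmoid(x):
    # Degree-3 stand-in for 1 / (1 + exp(-x)); coefficients are assumptions,
    # roughly fitted for inputs in a bounded range.
    return 0.5 + 0.15 * x - 0.0015 * x ** 3

def he_style_logreg(X, y, epochs=30, lr=0.1):
    """Gradient descent using only +, *, so each step maps onto HE circuits."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        preds = poly_sigmoid(X @ w)
        w += lr * X.T @ (y - preds) / n   # usual logistic gradient, approximated
    return w
```

Under HE, each of these multiplications consumes ciphertext "depth", which is why the polynomial degree and iteration count dominate the hours-long training times the paper reports.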

#18 A Machine Learning Suite for Machine Components’ Health-Monitoring

Authors: Ramin Hasani ; Guodong Wang ; Radu Grosu

This paper presents an intelligent technique for the health-monitoring and prognostics of common rotary machine components, with a particular focus on bearings. During a run-to-failure experiment, rich unsupervised features are extracted from vibration sensory data by a trained sparse autoencoder. The correlation between the initial samples (presumably healthy) and each successive sample is then calculated and passed through a moving-average filter. The normalized output, referred to as the autoencoder-correlation-based (AEC) rate, provides an informative attribute of the system that depicts its health status. AEC automatically identifies the degradation starting point in the machine component. We show that the AEC rate generalizes well across several run-to-failure tests, and we demonstrate its superiority over many other state-of-the-art approaches for the health monitoring of machine bearings.
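A sketch of the AEC computation as described, with the baseline window, smoothing width, and degradation threshold as assumptions:

```python
import numpy as np

def aec_rate(features, n_healthy=10, window=5):
    """features: (n_samples, n_codes) autoencoder codes in time order."""
    baseline = features[:n_healthy].mean(axis=0)   # presumed-healthy signature
    corr = np.array([np.corrcoef(baseline, f)[0, 1] for f in features])
    kernel = np.ones(window) / window
    smoothed = np.convolve(corr, kernel, mode="valid")  # moving-average filter
    return smoothed / smoothed[0]                  # normalize to start at 1.0

def degradation_start(rate, drop=0.9):
    """First index where the AEC rate falls below a fraction of its start."""
    below = np.nonzero(rate < drop)[0]
    return int(below[0]) if below.size else None
```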

#19 Leveraging Textual Specifications for Grammar-Based Fuzzing of Network Protocols

Authors: Samuel Jero ; Maria Leonor Pacheco ; Dan Goldwasser ; Cristina Nita-Rotaru

Grammar-based fuzzing is a technique used to find software vulnerabilities by injecting well-formed inputs generated following rules that encode application semantics. Most grammar-based fuzzers for network protocols rely on human experts to manually specify these rules. In this work we study automated learning of protocol rules from textual specifications (i.e. RFCs). We evaluate the automatically extracted protocol rules by applying them to a state-of-the-art fuzzer for transport protocols and show that it leads to a smaller number of test cases while finding the same attacks as the system that uses manually specified rules.

#20 Novelty Detection for Multispectral Images with Application to Planetary Exploration

Authors: Hannah R Kerner ; Danika F Wellington ; Kiri L Wagstaff ; James F Bell ; Chiman Kwan ; Heni Ben Amor

In this work, we present a system based on convolutional autoencoders for detecting novel features in multispectral images. We introduce SAMMIE: Selections based on Autoencoder Modeling of Multispectral Image Expectations. Previous work using autoencoders employed the scalar reconstruction error to classify new images as novel or typical. We show that a spatial-spectral error map can enable both accurate classification of novelty in multispectral images and human-comprehensible explanations of the detection. We apply our methodology to the detection of novel geologic features in multispectral images of the Martian surface collected by the Mastcam imaging system on the Mars Science Laboratory Curiosity rover.
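The shift from a scalar error to an explanatory error map can be sketched as follows; the aggregation and hotspot reporting choices are assumptions, not SAMMIE's exact output.

```python
import numpy as np

def spatial_spectral_error(image, reconstruction):
    """image, reconstruction: (height, width, bands) multispectral arrays."""
    return (image - reconstruction) ** 2          # full per-pixel, per-band cube

def novelty_report(image, reconstruction, quantile=0.999):
    err = spatial_spectral_error(image, reconstruction)
    scalar = err.mean()                           # the usual single-number score
    y, x, band = np.unravel_index(err.argmax(), err.shape)
    mask = err > np.quantile(err, quantile)       # highlight explanatory pixels
    return {"score": float(scalar),
            "peak_pixel": (int(y), int(x)),       # where the novelty is
            "peak_band": int(band),               # in which wavelength it shows
            "explanation_mask": mask}
```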

#21 Robust Multi-Object Detection Based on Data Augmentation with Realistic Image Synthesis for Point-of-Sale Automation

Authors: Saiprasad Koturwar ; Soma Shiraishi ; Kota Iwamoto

As an alternative to bar-code scanning, we are developing a real-time retail product detector for point-of-sale automation. The major challenges associated with image-based object detection arise from occlusion and the presence of other objects in close proximity. For robust product detection under such conditions, it is crucial to train the detector on a rich set of images with varying degrees of occlusion and proximity between the products, fairly representing the wide range of ways customers tend to place products together. However, generating a sufficiently large database of such images traditionally requires a large amount of human effort. On the other hand, acquiring individual object images with their corresponding masks is a relatively easy task. We propose a realistic image synthesis approach that uses individual object images and their corresponding masks to create training images with the desired properties (occlusion and congestion among the products). We train our product detector on images generated this way and achieve a consistent performance improvement across different types of test data. With the proposed approach, the detector achieves improvements of 46.2% in precision (from 0.67 to 0.98) and 40% in recall (from 0.60 to 0.84), compared to using a basic training dataset containing one product per image.
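A minimal sketch of the mask-based synthesis step, assuming a simple paste-near-existing-items overlap policy (the authors' placement strategy may differ); inputs are numpy-style image arrays of shape (H, W, 3) and binary masks of shape (H, W):

```python
import random

def paste(canvas, obj, mask, top, left):
    """Copy only the object's pixels onto the canvas; return its box label."""
    h, w = mask.shape
    region = canvas[top:top + h, left:left + w]
    region[mask > 0] = obj[mask > 0]
    return (left, top, left + w, top + h)

def synthesize(background, objects, masks, n_items=5, max_shift=40):
    """Compose a congested scene by pasting items near each other."""
    canvas = background.copy()
    boxes = []
    for obj, mask in random.sample(list(zip(objects, masks)), n_items):
        h, w = mask.shape
        if boxes:                                  # crowd items together
            x0, y0, _, _ = random.choice(boxes)
            left = x0 + random.randint(-max_shift, max_shift)
            top = y0 + random.randint(-max_shift, max_shift)
        else:
            top = random.randint(0, canvas.shape[0] - h)
            left = random.randint(0, canvas.shape[1] - w)
        top = min(max(top, 0), canvas.shape[0] - h)
        left = min(max(left, 0), canvas.shape[1] - w)
        boxes.append(paste(canvas, obj, mask, top, left))
    return canvas, boxes                           # image + bounding-box labels
```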

#22 VPDS: An AI-Based Automated Vehicle Occupancy and Violation Detection System

Authors: Abhinav Kumar ; Aishwarya Gupta ; Bishal Santra ; KS Lalitha ; Manasa Kolla ; Mayank Gupta ; Rishabh Singh

High Occupancy Vehicle/High Occupancy Tolling (HOV/HOT) lanes are operated based on voluntary HOV declarations by drivers. A majority of these declarations are false, made to illegally take advantage of faster HOV lane speeds. Manually regulating HOV lanes and identifying these violators is a herculean task. Therefore, an automated way of counting the number of people in a car is prudent for fair tolling and for violator detection. In this paper, we propose a Vehicle Passenger Detection System (VPDS) which works by capturing images through Near Infrared (NIR) cameras on the toll lanes and processing them using deep Convolutional Neural Network (CNN) models. Our system has been deployed in 3 cities over a span of two years and has served roughly 30 million vehicles with an accuracy of 97%, a remarkable improvement over manual review, which is 37% accurate. Our system can generate an accurate report of HOV lane usage, which helps policy makers pave the way towards de-congestion.

#23 Profiles, Proxies, and Assumptions: Decentralized, Communications-Resilient Planning, Allocation, and Scheduling

Authors: Ugur Kuter ; Brian Kettler ; Katherine Guo ; Martin Hofmann ; Valerie Champagne ; Kurt Lachevet ; Jennifer Lautenschlager ; Robert P. Goldman ; Luis Asencios ; Josh Hamell

Degraded communications are expected in large-scale disaster response and military operations, which nevertheless require rapid, concerted actions by distributed decision makers, each with limited visibility into the changing situation and in charge of a limited set of resources. We describe LAPLATA, a novel architecture that addresses these challenges by separating mission planning from allocation/scheduling for scalability, at the cost of some negotiation. We describe formal algorithms that achieve near-optimal performance according to mission completion percentage and subject matter expert review: assumption-based planning and replanning, profile-assisted cooperative allocation, and schedule negotiation. We validate our approach on a realistic problem specification and compare results against subject matter expert solutions.

#24 Feature Isolation for Hypothesis Testing in Retinal Imaging: An Ischemic Stroke Prediction Case Study

Authors: Gilbert Lim ; Zhan Wei Lim ; Dejiang Xu ; Daniel S.W. Ting ; Tien Yin Wong ; Mong Li Lee ; Wynne Hsu

Ischemic stroke is a leading cause of death and long-term disability that is difficult to predict reliably. Retinal fundus photography has been proposed for stroke risk assessment, due to its non-invasiveness and the similarity between retinal and cerebral microcirculations, with past studies claiming a correlation between venular caliber and stroke risk. However, it may be that other retinal features are more appropriate. In this paper, extensive experiments with deep learning on six retinal datasets are described. Feature isolation involving segmented vascular tree images is applied to establish the effectiveness of vessel caliber and shape alone for stroke classification, and dataset ablation is applied to investigate model generalizability on unseen sources. The results suggest that vessel caliber and shape could be indicative of ischemic stroke, and that source-specific features could influence model performance.

#25 Building Trust in Deep Learning System towards Automated Disease Detection

Authors: Zhan Wei Lim ; Mong Li Lee ; Wynne Hsu ; Tien Yin Wong

Though deep learning systems have achieved high accuracy in detecting diseases from medical images, few such systems have been deployed in highly automated disease screening settings, due to a lack of trust in how well these systems generalize to out-of-distribution datasets. We propose to use uncertainty estimates of the deep learning system’s predictions to decide when to accept or disregard them. We evaluate the effectiveness of using such estimates in a real-life application for the screening of diabetic retinopathy. We also generate visual explanations of the deep learning system to convey the pixels in the image that influence its decision. Together, these reveal the deep learning system’s competency and limits to the human, who in turn can know when to trust the deep learning system.
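One standard way to obtain such uncertainty estimates is Monte Carlo dropout; the sketch below uses it as a stand-in, since the paper's exact estimator is not specified here, and the abstention threshold is an assumption.

```python
import torch

def mc_dropout_predict(model, image, n_samples=20, max_std=0.1):
    """Accept the model's decision only when repeated stochastic passes agree."""
    model.train()                     # keep dropout layers stochastic at test time
    with torch.no_grad():
        probs = torch.stack(
            [torch.sigmoid(model(image)) for _ in range(n_samples)])
    mean, std = probs.mean(dim=0), probs.std(dim=0)
    if std.max().item() > max_std:
        return None, std              # too uncertain: refer to a human grader
    return (mean > 0.5), std          # confident automated decision
```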