AAAI.2021 - IAAI

Total: 46

#1 Preclinical Stage Alzheimer's Disease Detection Using Magnetic Resonance Image Scans

Authors: Fatih Altay, Guillermo Ramón Sánchez, Yanli James, Stephen V. Faraone, Senem Velipasalar, Asif Salekin

Alzheimer's disease predominantly affects older people, yet it is not a normal part of aging. The most common symptoms include problems with communication and abstract thinking, as well as disorientation. It is important to detect Alzheimer's disease in its early stages so that cognitive functioning can be improved by medication and training. In this paper, we propose two attention model networks for detecting Alzheimer's disease from MRI images to help early detection efforts at the preclinical stage. We also compare the performance of these two attention network models with a baseline model. The recently released OASIS-3 Longitudinal Neuroimaging, Clinical, and Cognitive Dataset is used to train, evaluate, and compare our models. The novelty of this research resides in the fact that we aim to detect Alzheimer's disease when all the parameters, physical assessments, and clinical data indicate that the patient is healthy and showing no symptoms.
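
As a concrete illustration of the kind of architecture the abstract describes, the sketch below applies attention pooling over per-slice MRI embeddings to score a scan; the class count, feature dimensions, and module names are our own assumptions, not the authors' exact design.

```python
# Hedged sketch of attention pooling over MRI slice embeddings (assumed
# design, not the authors' code): score each slice, pool, classify the scan.
import torch
import torch.nn as nn

class SliceAttentionClassifier(nn.Module):
    def __init__(self, feat_dim=256, hidden=64):
        super().__init__()
        self.score = nn.Sequential(              # relevance score per slice
            nn.Linear(feat_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))
        self.head = nn.Linear(feat_dim, 2)       # healthy vs. preclinical AD

    def forward(self, slice_feats):              # (batch, n_slices, feat_dim)
        weights = torch.softmax(self.score(slice_feats), dim=1)
        pooled = (weights * slice_feats).sum(dim=1)  # attention-weighted scan vector
        return self.head(pooled), weights

logits, attn = SliceAttentionClassifier()(torch.randn(4, 32, 256))
```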


#2 An End-to-End Solution for Named Entity Recognition in eCommerce Search

Authors: Xiang Cheng, Mitchell Bowden, Bhushan Ramesh Bhange, Priyanka Goyal, Thomas Packer, Faizan Javed

Named entity recognition (NER) is a critical step in modern search query understanding. In the domain of eCommerce, identifying key entities such as brand and product type can help a search engine retrieve relevant products and therefore offer an engaging shopping experience. Recent research shows promising results on shared benchmark NER tasks using deep learning methods, but there are still unique challenges in industry regarding domain knowledge, training data, and model production. This paper demonstrates an end-to-end solution to address these challenges. The core of our solution is a novel model training framework, "TripleLearn", which iteratively learns from three separate training datasets instead of one, as is traditionally done. Using this approach, the best model lifts the F1 score from 69.5 to 93.3 on the holdout test data. In our offline experiments, TripleLearn improved model performance compared to traditional training approaches that use a single set of training data. Moreover, in the online A/B test, we saw significant improvements in user engagement and revenue conversion. The model has been live on homedepot.com for more than 9 months, boosting search conversions and revenue. Beyond our application, the TripleLearn framework, as well as the end-to-end process, is model-independent and problem-independent, so it can be generalized to more industrial applications, especially in the eCommerce industry, which has similar data foundations and problems.
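
The abstract does not spell out how the three training sets interact; one plausible reading, sketched below with scikit-learn on synthetic data, is a round-robin schedule that repeatedly makes an incremental pass over each of the three sets. The dataset names and schedule are assumptions.

```python
# Hedged sketch of "learning from three separate training datasets" as a
# round-robin of incremental passes; synthetic data, assumed schedule.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
make = lambda n: (rng.normal(size=(n, 8)), rng.integers(0, 3, size=n))
datasets = {"curated": make(200), "augmented": make(400), "mined": make(300)}

model = SGDClassifier(loss="log_loss")
for round_ in range(5):                          # iterate until validation plateaus
    for name, (X, y) in datasets.items():        # one incremental pass per set
        model.partial_fit(X, y, classes=np.arange(3))
```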


#3 Automated Reasoning and Learning for Automated Payroll Management

Authors: Sebastijan Dumancic, Wannes Meert, Stijn Goethals, Tim Stuyckens, Jelle Huygen, Koen Denies

While payroll management is a crucial aspect of any business venture, anticipating the future financial impact of changes to the payroll policy is a challenging task due to the complexity of tax legislation. The goal of this work is to automatically explore potential payroll policies and find the optimal set of policies that satisfies the user's needs. To achieve this goal, we overcome two major challenges. First, we translate the tax legislative knowledge into a formal representation flexible enough to support a variety of scenarios in payroll calculations. Second, the legal knowledge is further compiled into a set of constraints from which a constraint solver can find the optimal policy. Furthermore, payroll computation is performed on an individual basis, which can be expensive for companies with a large number of employees. To make the optimisation more efficient, we integrate it with a machine learning model that learns from previous optimisation runs and speeds up the optimisation engine. The results of this work have been deployed by a social insurance fund.
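
To make the policy-search idea concrete, here is a toy version with invented numbers: enumerate candidate policy combinations, keep those satisfying the constraints, and return the cheapest. The real system compiles actual tax legislation into constraints for a solver rather than brute-force enumeration.

```python
# Illustrative toy search over payroll policies under constraints; the cost
# rule and all numbers are invented stand-ins for real tax legislation.
from itertools import product

bonus_pcts  = [0.00, 0.05, 0.10]
car_budgets = [0, 3000, 6000]           # company-car allowance per employee

def yearly_cost(base, bonus, car):      # toy stand-in for legislative rules
    taxable = base * (1 + bonus) + 0.6 * car
    return taxable * 1.35 + car         # employer cost incl. contributions

best = min(
    (p for p in product(bonus_pcts, car_budgets)
     if yearly_cost(50_000, *p) <= 75_000          # budget constraint
     and 50_000 * (1 + p[0]) + p[1] >= 53_000),    # minimum net package
    key=lambda p: yearly_cost(50_000, *p))
print("optimal policy (bonus %, car budget):", best)
```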


#4 Comparison Lift: Bandit-based Experimentation System for Online Advertising

Authors: Tong Geng, Xiliang Lin, Harikesh S. Nair, Jun Hao, Bin Xiang, Shurui Fan

Comparison Lift is an experimentation-as-a-service (EaaS) application for testing online advertising audiences and creatives at JD.com. Unlike many other EaaS tools, which focus primarily on fixed-sample A/B testing, Comparison Lift deploys a custom bandit-based experimentation algorithm. The advantages of the bandit-based approach are two-fold. First, it aligns the randomization induced in the test with the advertiser's goals from testing. Second, by adapting the experimental design to information acquired during the test, it substantially reduces the cost of experimentation to the advertiser. Since launch in May 2019, Comparison Lift has been utilized in over 1,500 experiments. We estimate that utilization of the product has helped increase click-through rates of participating advertising campaigns by 46% on average. We estimate that the adaptive design in the product has generated 27% more clicks on average during testing compared to a fixed-sample A/B design. Both suggest significant value generation and cost savings to advertisers from the product.
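
A minimal bandit sketch in the spirit of the product: Thompson sampling over two creatives with Beta priors on click-through rate. Comparison Lift's production algorithm is a custom design; the priors and CTRs below are invented.

```python
# Thompson sampling over two ad creatives (illustrative, not the product's
# algorithm): sample a CTR from each posterior, show the creative that wins.
import numpy as np

rng = np.random.default_rng(1)
true_ctr = [0.030, 0.046]                 # unknown to the experimenter
wins = np.ones(2); losses = np.ones(2)    # Beta(1, 1) priors

for _ in range(100_000):                  # each ad impression
    arm = int(np.argmax(rng.beta(wins, losses)))   # posterior draw per arm
    click = rng.random() < true_ctr[arm]
    wins[arm] += click; losses[arm] += 1 - click

print("posterior mean CTRs:", wins / (wins + losses))
```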


#5 Accurate and Interpretable Machine Learning for Transparent Pricing of Health Insurance Plans

Authors: Rohun Kshirsagar, Li-Yen Hsu, Charles H. Greenberg, Matthew McClelland, Anushadevi Mohan, Wideet Shende, Nicolas P. Tilmans, Min Guo, Ankit Chheda, Meredith Trotter, Shonket Ray, Miguel Alvarado

Health insurance companies cover half of the United States population through commercial employer-sponsored health plans and pay 1.2 trillion US dollars every year to cover medical expenses for their members. The actuary and underwriter roles at a health insurance company serve to assess which risks to take on and how to price those risks to ensure profitability of the organization. While Bayesian hierarchical models are the current industry standard for estimating risk, interest in machine learning as a way to improve upon these existing methods is increasing. Lumiata, a healthcare analytics company, ran a study with a large health insurance company in the United States. We evaluated the ability of machine learning models to predict the per-member-per-month cost of employer groups in their next renewal period, especially those groups that will cost less than 95% of what an actuarial model predicts (groups with "concession opportunities"). We developed a sequence of two models, an individual patient-level model and an employer-group-level model, to predict the annual per-member-per-month allowed amount for employer groups, based on a population of 14 million patients. Our models performed 20% better than the insurance carrier's existing pricing model and identified 84% of the concession opportunities. This study demonstrates the application of a machine learning system to compute an accurate and fair price for health insurance products and analyzes how explainable machine learning models can exceed actuarial models' predictive accuracy while maintaining interpretability.
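
The two-model sequence might look roughly like the sketch below: a patient-level cost model whose predictions are rolled up to an employer-group per-member-per-month (PMPM) estimate. The model class, features, and log-cost transform are our assumptions; the paper's second stage is a trained group-level model rather than the plain average shown here.

```python
# Hedged two-stage sketch: patient-level cost model, then group-level PMPM
# aggregation; synthetic data, assumed model choices (not Lumiata's system).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(2)
X_patient = rng.normal(size=(5000, 20))            # claims/demographic features
cost = np.exp(rng.normal(7.5, 1.0, size=5000))     # skewed annual allowed amount

stage1 = GradientBoostingRegressor().fit(X_patient, np.log1p(cost))

group_ids = rng.integers(0, 50, size=5000)         # employer group per patient
pred = np.expm1(stage1.predict(X_patient))
pmpm = {g: pred[group_ids == g].mean() / 12 for g in range(50)}  # per member per month
```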


#6 Robust PDF Document Conversion using Recurrent Neural Networks

Authors: Nikolaos Livathinos, Cesar Berrospi, Maksym Lysak, Viktor Kuropiatnyk, Ahmed Nassar, Andre Carvalho, Michele Dolfi, Christoph Auer, Kasper Dinkla, Peter Staar

The number of published PDF documents in both the academic and commercial world has increased exponentially in recent decades. There is a growing need to make their rich content discoverable to information retrieval tools. Achieving high-quality semantic search demands that a document's structural components, such as title, section headers, paragraphs, (nested) lists, tables, and figures (including their captions), are properly identified. Unfortunately, the PDF format is known not to conserve such structural information, because it simply represents a document as a stream of low-level printing commands, in which one or more characters are placed in a bounding box with a particular styling. In this paper, we present a novel approach to document structure recovery in PDF using recurrent neural networks to process the low-level PDF data representation directly, instead of relying on a visual re-interpretation of the rendered PDF page, as has been proposed in previous literature. We demonstrate how a sequence of PDF printing commands can be used as input to a neural network and how the network can learn to classify each printing command according to its structural function in the page. This approach has three advantages. First, it can distinguish among more fine-grained labels (typically 10-20 labels, as opposed to 1-5 with visual methods), which results in a more accurate and detailed document structure resolution. Second, it can take into account the text flow across pages more naturally than visual methods, because it can concatenate the printing commands of sequential pages. Last, our proposed method needs less memory and is computationally less expensive than visual methods, which allows us to deploy such models in production environments at a much lower cost. Through extensive architectural search in combination with advanced feature engineering, we were able to implement a model that yields a weighted average F1 score of 97% across 17 distinct structural labels. The best model we achieved is currently served in production environments in our Corpus Conversion Service (CCS), which was presented at KDD 2018. This model enhances the capabilities of CCS significantly, as it eliminates the need for human-annotated ground-truth labels for every unseen document layout. This proved particularly useful when applied to a huge corpus of PDF articles related to COVID-19.
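
The core idea, classifying each low-level printing command in sequence, can be sketched as a small bidirectional LSTM tagger; the per-command feature layout (x, y, font, ...) and dimensions below are assumptions, not the paper's architecture.

```python
# Sketch of sequence labeling over PDF printing commands with a BiLSTM,
# mirroring the paper's idea at toy scale; feature layout is hypothetical.
import torch
import torch.nn as nn

class PrintCmdTagger(nn.Module):
    def __init__(self, feat_dim=16, hidden=64, n_labels=17):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_labels)   # one label per command

    def forward(self, cmds):              # (batch, seq_len, feat_dim): x, y, font, ...
        h, _ = self.rnn(cmds)
        return self.out(h)                # (batch, seq_len, 17 structural labels)

tagger = PrintCmdTagger()
labels = tagger(torch.randn(2, 300, 16)).argmax(-1)   # tag 300 commands per page
```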


#7 Author Homepage Discovery in CiteSeerX

Authors: Krutarth Patel, Cornelia Caragea, Doina Caragea, C. Lee Giles

Scholarly digital libraries provide access to scientific publications and comprise useful resources for researchers. CiteSeerX is one such digital library search engine that provides access to more than 10 million academic documents. We propose a novel search-driven approach to build and maintain a large collection of homepages that can be used as seed URLs in any digital library, including CiteSeerX, to crawl scientific documents. Specifically, we integrate Web search and classification in a unified approach to discover new homepages: first, we use publicly available author names and research paper titles as queries to a Web search engine to find relevant content, and then we identify the correct homepages from the search results using a powerful deep learning classifier based on Convolutional Neural Networks. Moreover, we use self-training to reduce the labeling effort and to utilize the unlabeled data to train an efficient researcher-homepage classifier. Our experiments on a large-scale dataset highlight the effectiveness of our approach and position Web search as an effective method for acquiring authors' homepages. We describe the development and deployment of the proposed approach in CiteSeerX and its maintenance requirements.
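
A compact illustration of the self-training step: train on the small labeled set, pseudo-label the unlabeled search results where the classifier is confident, and retrain. The classifier and the 0.95 threshold here are placeholders for the paper's CNN setup.

```python
# Minimal self-training loop (illustrative stand-in for the paper's CNN):
# add only high-confidence pseudo-labels back into the training set.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X_lab, y_lab = rng.normal(size=(100, 10)), rng.integers(0, 2, 100)
X_unlab = rng.normal(size=(1000, 10))          # unlabeled search results

clf = LogisticRegression().fit(X_lab, y_lab)
for _ in range(3):                             # self-training rounds
    proba = clf.predict_proba(X_unlab).max(axis=1)
    keep = proba > 0.95                        # confident pseudo-labels only
    X_aug = np.vstack([X_lab, X_unlab[keep]])
    y_aug = np.concatenate([y_lab, clf.predict(X_unlab[keep])])
    clf = LogisticRegression().fit(X_aug, y_aug)
```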


#8 EeLISA: Combating Global Warming Through the Rapid Analysis of Eelgrass Wasting Disease

Authors: Brendan H. Rappazzo, Morgan E. Eisenlord, Olivia J. Graham, Lillian R. Aoki, Phoebe D. Dawkins, Drew Harvell, Carla Gomes

Global warming is the greatest threat facing our planet and is causing environmental disturbance at an unprecedented scale. We are strongly positioned to leverage the advancements of Artificial Intelligence (AI) and Machine Learning (ML), which provide humanity, for the first time in history, with analysis and decision-making tools that operate at massive scale. Strong evidence supports that global warming is contributing to marine ecosystem decline, including decline of eelgrass habitat. Eelgrass is affected by an opportunistic marine pathogen, and infections are likely exacerbated by rising ocean temperatures. The disease analysis required to inform conservation priorities is incredibly laborious and acts as a significant bottleneck for research. To this end, we developed EeLISA (Eelgrass Lesion Image Segmentation Application). EeLISA enables expert ecologists to train a segmentation module to perform this crucial analysis at human-level accuracy, while minimizing their labeling time and integrating into their existing workflow. EeLISA has been deployed for over 16 months and has facilitated the preparation of four manuscripts, including a critical eelgrass study ranging from Southern California to Alaska. These studies, utilizing EeLISA, have led to scientific insight and discovery in marine disease ecology.


#9 Deeplite Neutrino™: A BlackBox Framework for Constrained Deep Learning Model Optimization

Authors: Anush Sankaran, Olivier Mastropietro, Ehsan Saboori, Yasser Idris, Davis Sawyer, MohammadHossein AskariHemmat, Ghouthi Boukli Hacene

Designing deep learning-based solutions is becoming a race for training deeper models with a greater number of layers. While a large, deep model can provide competitive accuracy, it creates many logistical challenges and unreasonable resource requirements during development and deployment. This has been one of the key reasons why deep learning models are not extensively used in various production environments, especially on edge devices. There is an immediate need to optimize and compress these deep learning models to enable on-device intelligence. In this research, we introduce a black-box framework, Deeplite Neutrino™, for production-ready optimization of deep learning models. The framework provides an easy mechanism for end-users to specify constraints, such as a tolerable drop in accuracy or a target size for the optimized model, to guide the whole optimization process. The framework is easy to include in an existing production pipeline and is available as a Python package, supporting PyTorch and TensorFlow libraries. The optimization performance of the framework is shown across multiple benchmark datasets and popular deep learning models. Further, the framework is currently used in production, and results and testimonials from several clients are summarized.
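
A hedged sketch of what constraint-guided compression can look like: iteratively prune until the user's tolerable accuracy drop is exceeded. This only illustrates the constraint interface conceptually; Neutrino's actual API and optimization methods are not described in the abstract and surely differ.

```python
# Hedged sketch of accuracy-constrained compression (not Neutrino's API):
# prune in small steps and stop once the tolerable accuracy drop is exceeded.
import torch.nn as nn
import torch.nn.utils.prune as prune

def compress(model, evaluate, max_drop=0.01, step=0.1, max_steps=9):
    """evaluate(model) -> accuracy on a held-out set (user-supplied)."""
    baseline = evaluate(model)
    for _ in range(max_steps):
        for m in model.modules():                # prune 10% more of each layer
            if isinstance(m, (nn.Linear, nn.Conv2d)):
                prune.l1_unstructured(m, "weight", amount=step)
        if baseline - evaluate(model) > max_drop:
            break                                # respect the user's constraint
    return model
```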


#10 Using Unsupervised Learning for Data-driven Procurement Demand Aggregation

Authors: Eran Shaham, Adam Westerski, Rajaraman Kanagasabai, Amudha Narayanan, Samuel Ong, Jiayu Wong, Manjeet Singh

Procurement is an essential operation of every organization regardless of its size or domain. Aggregating demands can lead to better value-for-money due to: (1) lower bulk prices; (2) larger vendor tendering; (3) lower shipping and handling fees; and (4) reduced legal and administration overheads. This paper describes our experience in developing an AI solution for demand aggregation and deploying it in A*STAR, a large governmental research organization in Singapore with procurement expenditure on the scale of hundreds of millions of dollars annually. We formulate the demand aggregation problem using a bipartite graph model depicting the relationship between procured items and target vendors, and show that identifying maximal edge bicliques within that graph reveals potential demand aggregation patterns. We propose an unsupervised learning methodology for efficiently mining such bicliques using a novel Monte Carlo subspace clustering approach. Based on this, a proof-of-concept prototype was developed and tested with the end users during 2017, and later trialed and iteratively refined, before being rolled out in 2019. In the final evaluation, the engine correctly detected 71% of past cases that had been transformed into bulk tenders; of the new opportunities it pointed out, 81% were deemed useful for potential bulk tender contracts in the future. Additionally, for each valid pattern identified, the engine achieved 100% precision (all aggregated purchase orders were correct) and 79% recall (it correctly identified 79% of the orders that should have been put into the bulk tenders). Overall, the cost savings from the true-positive contracts spotted so far are estimated to be S$7 million annually.
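
The biclique view can be made concrete on a toy item-vendor graph: sample small item seeds, intersect their vendor neighborhoods, and grow the item side to get a candidate aggregation pattern. This randomized toy merely stands in for the paper's Monte Carlo subspace clustering.

```python
# Illustrative randomized search for large edge bicliques in an item-vendor
# bipartite graph; toy stand-in for the paper's mining algorithm.
import random

edges = {("glove", "v1"), ("glove", "v2"), ("beaker", "v1"),
         ("beaker", "v2"), ("pipette", "v2"), ("pipette", "v3")}
items = {i for i, _ in edges}

def biclique_from_seed(seed_items):
    vendors = set.intersection(*({v for i, v in edges if i == s} for s in seed_items))
    grown = {i for i in items if all((i, v) in edges for v in vendors)}
    return grown, vendors                 # maximal w.r.t. the chosen vendor set

best = max((biclique_from_seed(set(random.sample(sorted(items), 2)))
            for _ in range(200)),
           key=lambda b: len(b[0]) * len(b[1]) if b[1] else 0)
print("candidate aggregation pattern:", best)
```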


#11 Tool for Automated Tax Coding of Invoices

Authors: Tarun Tater, Sampath Dechu, Neelamadhav Gantayat, Meena Guptha, Sivakumar Narayanan

Accounts payable refers to the practice whereby organizations procure goods and services on credit, to be reimbursed to the vendors in due time. Once a vendor raises an invoice, it goes through a complex process before the final payment. In this process, tax code determination is one of the most challenging steps; it determines the tax to be levied and directly influences the amount payable to a vendor. This step is also very important from a regulatory compliance standpoint. However, because it is done manually, it is error-prone, labor intensive, and requires regular training of the staff. Further, an error in tax code determination can result in penalties for the organization. Automatically arriving at a tax code for a given product accurately and efficiently is a daunting task. To address this problem, we present an automated end-to-end system for tax code determination that can either be used as a standalone application or be integrated into an existing invoice processing workflow. The proposed system determines the most relevant tax code for an invoice using attributes such as item description, vendor details, and shipping and delivery location. The system has been deployed in production for a multinational consumer goods company for more than 6 months. It has already processed more than 22k items with an accuracy of more than 94% and a high-confidence prediction accuracy of around 99.54%. Using this system, approximately 73% of all invoices require no human intervention.
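
Conceptually, the determination step is a text classification over invoice attributes. The sketch below uses TF-IDF and logistic regression on invented invoices, with predicted probabilities supporting the kind of high-confidence routing the abstract reports; the production system's models, features, and tax codes are not disclosed here.

```python
# Toy tax-code determination as text classification over invoice attributes;
# invoices and codes are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

invoices = ["nitrile gloves box 100 | AcmeLab | Singapore",
            "laptop 14in 16gb | TechCo | Singapore",
            "consulting services march | AdviseCo | remote"]
tax_codes = ["GST_GOODS", "GST_GOODS", "GST_SERVICES"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(invoices, tax_codes)
proba = clf.predict_proba(["annual audit services | AdviseCo | remote"])[0]
print(dict(zip(clf.classes_, proba.round(2))))   # route low-confidence cases to humans
```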


#12 An Automated Engineering Assistant: Learning Parsers for Technical Drawings

Authors: Dries Van Daele, Nicholas Decleyre, Herman Dubois, Wannes Meert

Manufacturing companies rely on technical drawings to develop new designs or adapt designs to customer preferences. The database of historical and novel technical drawings thus represents the knowledge that is core to their operations. With current methods, however, utilizing these drawings is mostly a manual and time-consuming effort. In this work, we present a software tool that knows how to interpret various parts of a drawing and can translate this information to enable automatic reasoning and machine learning on top of such a large database of technical drawings, for example, to find erroneous designs or to learn about patterns present in successful designs. To achieve this, we propose a method that automatically learns a parser capable of interpreting technical drawings, using only limited expert interaction. The proposed method combines neural methods, to interpret visual images and recognize parts of two-dimensional drawings, with symbolic methods, to deal with the relational structure and understand the data encapsulated in the complex tables present in technical drawings. Furthermore, the output can be used, for example, to build a similarity-based search algorithm. We showcase one deployed tool that helps engineers find relevant previous designs more easily, as they can now query the database using a partial design instead of through limited and tedious keyword searches. A partial design can be a part of the two-dimensional drawing, part of a table, part of the contained textual information, or combinations thereof.


#13 Mars Image Content Classification: Three Years of NASA Deployment and Recent Advances

Authors: Kiri Wagstaff, Steven Lu, Emily Dunkel, Kevin Grimes, Brandon Zhao, Jesse Cai, Shoshanna B. Cole, Gary Doran, Raymond Francis, Jake Lee, Lukas Mandrake

The NASA Planetary Data System hosts millions of images acquired from the planet Mars. To help users quickly find images of interest, we have developed and deployed content-based classification and search capabilities for Mars orbital and surface images. The deployed systems are publicly accessible using the PDS Image Atlas. We describe the process of training, evaluating, calibrating, and deploying updates to two CNN classifiers for images collected by Mars missions. We also report on three years of deployment including usage statistics, lessons learned, and plans for the future.


#14 Enhancing E-commerce Recommender System Adaptability with Online Deep Controllable Learning-To-Rank

Authors: Anxiang Zeng, Han Yu, Hualin He, Yabo Ni, Yongliang Li, Jingren Zhou, Chunyan Miao

In the past decade, recommender systems for e-commerce have witnessed significant advancement. Recommendation scenarios can be divided into different types (e.g., pre-, during-, and post-purchase, campaign, promotion, bundle) for different user groups or different businesses. For different scenarios, the goals of recommendation differ, which is reflected by the different performance metrics employed. In addition, online promotional campaigns, which attract high traffic volumes, are a critical factor affecting e-commerce recommender systems. Typically, prior to a promotional campaign, the Add-to-Cart Rate (ACR) is the target of optimization. During the campaign, this changes to Gross Merchandise Volume (GMV). Immediately after the campaign, it becomes the Click-Through Rate (CTR). Dynamically adapting among these potentially conflicting optimization objectives is an important capability for recommender systems deployed in real-world e-commerce platforms. In this paper, we report our experience designing and deploying the Deep Controllable Learning-To-Rank (DC-LTR) recommender system to address this challenge. It enhances the feedback controller in LTR with multi-objective optimization so as to maximize different objectives under constraints. Its ability to dynamically adapt to changing business objectives has resulted in significant business advantages. Since September 2019, DC-LTR has been a core service enabling adaptive online training and real-time deployment of ranking models based on changing business objectives in AliExpress and Lazada. Under both everyday use and peak loads during large promotional campaigns, DC-LTR has achieved significant improvements in satisfying real-world business objectives.
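
One toy way to picture a feedback controller over multiple ranking objectives: steer per-objective loss weights toward a target objective mix (e.g., GMV-heavy during a campaign). The update rule, gains, and targets below are our assumptions, not DC-LTR's controller.

```python
# Toy feedback controller for multi-objective LTR (assumed form, not DC-LTR):
# nudge per-objective loss weights toward a target objective mix.
import torch

weights = {"ctr": 1.0, "acr": 1.0, "gmv": 1.0}
targets = {"ctr": 0.2, "acr": 0.2, "gmv": 0.6}   # campaign-time objective mix

def combined_loss(losses):                        # losses: dict of scalar tensors
    total = sum(losses.values()).detach()
    for k in weights:                             # integral-style feedback update
        share = (losses[k] / total).item()
        weights[k] *= 1.0 + 0.1 * (targets[k] - share)
    return sum(weights[k] * losses[k] for k in losses)

loss = combined_loss({k: torch.rand((), requires_grad=True) for k in weights})
loss.backward()
```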


#15 A Novel AI-based Methodology for Identifying Cyber Attacks in Honey Pots

Authors: Muhammed AbuOdeh, Christian Adkins, Omid Setayeshfar, Prashant Doshi, Kyu H. Lee

We present a novel AI-based methodology that identifies the phases of a host-level cyber attack simply from system call logs. System calls emanating from cyber attacks on hosts such as honey pots are often recorded in audit logs. Our methodology first involves efficiently loading, caching, processing, and querying the system events contained in audit logs in support of computer forensics. The output of these queries remains at the system call level and is difficult to interpret. The next step is to infer a sequence of abstracted actions, which we colloquially call a storyline, from the system calls given as observations to a latent-state probabilistic model. These storylines are then accurately labeled using a learned classifier. We qualitatively and quantitatively evaluate methods and models for each step of the methodology using 114 different attack phases collected by logging the attacks of a red team on a server, likely-benign sequences containing regular user activities, and traces from a recent DARPA project. The resulting end-to-end system, which we call Cyberian, identifies the attack phases with a high level of accuracy, illustrating the benefit that this machine learning-based methodology brings to security forensics.
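
The storyline-inference step can be pictured as decoding a latent-state model. Below is a self-contained Viterbi decoder over an invented three-state attack model with invented system-call emissions; the paper's model structure and probabilities differ.

```python
# Tiny Viterbi decoder: infer a latent "storyline" of abstract actions from
# observed system calls; states, calls, and probabilities are invented.
import numpy as np

states = ["recon", "escalate", "exfiltrate"]
calls  = {"stat": 0, "setuid": 1, "sendto": 2}
start  = np.log([0.8, 0.1, 0.1])
trans  = np.log([[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.1, 0.1, 0.8]])
emit   = np.log([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.1, 0.1, 0.8]])

obs = [calls[c] for c in ["stat", "stat", "setuid", "sendto", "sendto"]]
v = start + emit[:, obs[0]]
back = []
for o in obs[1:]:
    scores = v[:, None] + trans                  # best predecessor per state
    back.append(scores.argmax(0)); v = scores.max(0) + emit[:, o]
path = [int(v.argmax())]
for b in reversed(back):                         # backtrack the best path
    path.append(int(b[path[-1]]))
print("storyline:", [states[s] for s in reversed(path)])
```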


#16 Finding Needles in Heterogeneous Haystacks

Authors: Bijaya Adhikari, Liangyue Li, Nikhil Rao, Karthik Subbian

Due to intense competition and the lack of real estate on the front page of large e-commerce platforms, sellers are sometimes motivated to garner non-genuine signals (clicks, add-to-carts, purchases) on their products to make them appear more appealing to customers. This hurts customers' trust in the platform, and also hurts genuine sellers who sell their items without looking to game the system. While it is important to find the sellers and buyers who collude to garner these non-genuine signals, doing so is highly nontrivial. First, the set of bad actors in the system is a very small fraction of all the buyers/sellers on the platform. Second, bad actors "hide" among the good ones, making them hard to detect. In this paper, we develop CONGCN, a context-aware heterogeneous graph convolutional network to detect bad actors on a large heterogeneous graph. While our method is motivated by abuse detection in e-commerce, it is applicable to other areas such as computational biology and finance, where large heterogeneous graphs are pervasive and the amount of labeled data is very limited. We train CONGCN via novel sampling methods and context-aware message passing in a semi-supervised fashion to predict dishonest buyers and sellers in e-commerce. Extensive experiments show that our method is effective, beating several baselines; generalizable to an inductive setting; and highly scalable.
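
To illustrate heterogeneous message passing at toy scale, the sketch below aggregates neighbor features per relation type with relation-specific weights on an invented buyer-seller-product graph; CONGCN additionally uses context-aware attention and novel sampling, which this omits.

```python
# Minimal per-relation message passing on a heterogeneous graph (toy cousin
# of a heterogeneous GCN layer; graph and features are synthetic).
import numpy as np

rng = np.random.default_rng(4)
H = {t: rng.normal(size=(n, 8)) for t, n in [("buyer", 5), ("seller", 3), ("product", 4)]}
A = {("buyer", "clicks", "product"): rng.integers(0, 2, (5, 4)),
     ("seller", "lists", "product"): rng.integers(0, 2, (3, 4))}
W = {rel: rng.normal(size=(8, 8)) for rel in A}    # relation-specific weights

def layer(H):
    out = {t: np.zeros_like(h) for t, h in H.items()}
    for (src, rel, dst), adj in A.items():         # aggregate per relation type
        deg = adj.sum(0, keepdims=True).clip(min=1)
        msg = (adj.T @ H[src]) / deg.T             # mean of neighbor features
        out[dst] += np.tanh(msg @ W[(src, rel, dst)])
    return {t: out[t] if out[t].any() else H[t] for t in H}

H = layer(H)                                       # product nodes now see both relations
```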


#17 Path to Automating Ocean Health Monitoring

Authors: Mak Ahmad, J. Scott Penberthy, Abigail Powell

Marine ecosystems directly and indirectly impact human health, providing benefits such as essential food sources, coastal protection, and biomedical compounds. Monitoring changes in marine species is important because impacts such as overfishing, ocean acidification, and hypoxic zones can negatively affect both human and ocean health. The US west coast supports a diverse assemblage of deep-sea corals that provide habitats for fish and numerous other invertebrates. Currently, National Oceanic and Atmospheric Administration (NOAA) scientists manually track the health of coral species using extractive methods. In this paper, we test the viability of using a machine learning algorithm, the Convolutional Neural Network (CNN), to automatically classify coral species, using field-collected coral images in collaboration with NOAA. We fine-tune the hyperparameters of our model to surpass the human F-score. We also highlight a scalable opportunity to monitor ocean health automatically while preserving corals.


#18 Ontology-Enriched Query Answering on Relational Databases

Authors: Shqiponja Ahmetaj, Vasilis Efthymiou, Ronald Fagin, Phokion G. Kolaitis, Chuan Lei, Fatma Özcan, Lucian Popa

We develop a flexible, open-source framework for query answering on relational databases by adopting methods and techniques from the Semantic Web community and the data exchange community, and we apply this framework to a medical use case. We first deploy module-extraction techniques to derive a concise and relevant sub-ontology from an external reference ontology. We then use the chase procedure from the data exchange community to materialize a universal solution that can be subsequently used to answer queries on an enterprise medical database. Along the way, we identify a new class of well-behaved acyclic EL-ontologies extended with role hierarchies, suitably restricted functional roles, and domain/range restrictions, which cover our use case. We show that such ontologies are C-stratified, which implies that the chase procedure terminates in polynomial time. We provide a detailed overview of our real-life application in the medical domain and demonstrate the benefits of this approach, such as discovering additional answers and formulating new queries.
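
The chase can be illustrated on a toy instance: treat subclass axioms as tuple-generating dependencies and add implied facts until a fixpoint, so that queries over the materialized "universal solution" return the additional answers the paper mentions. The facts and rules below are invented.

```python
# Toy chase step: apply subclass-style dependencies to a fixpoint,
# materializing a universal solution. Ontology and data are invented.
facts = {("alice", "type2_diabetes"), ("bob", "hypertension")}

def chase(facts, rules):
    changed = True
    while changed:                       # terminates: finitely many atoms here
        changed = False
        for sub, sup in rules:           # Diagnosis(p, sub) -> Diagnosis(p, sup)
            new = {(p, sup) for p, d in facts if d == sub} - facts
            if new:
                facts |= new; changed = True
    return facts

rules = [("type2_diabetes", "diabetes"), ("diabetes", "chronic_condition")]
print(chase(facts, rules))               # a "chronic_condition" query now finds alice
```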


#19 Attr2Style: A Transfer Learning Approach for Inferring Fashion Styles via Apparel Attributes

Authors: Rajdeep H Banerjee, Abhinav Ravi, Ujjal Kr Dutta

Popular fashion e-commerce platforms mostly provide details about low-level attributes of an apparel item (for example, neck type, dress length, collar type, print, etc.) on their product detail pages. However, customers usually prefer to buy apparel based on its style information, or simply put, occasion (for example, party wear, sports wear, casual wear, etc.). Applying a supervised image-captioning model to generate style-based image captions is limited by the difficulty of obtaining ground-truth annotations in the form of style-based captions: annotating such captions requires a certain amount of fashion domain expertise and adds cost and manual effort. In contrast, low-level attribute-based annotations are much more easily available. To address this issue, we propose a transfer-learning-based image captioning model that is trained on a source dataset with sufficient attribute-based ground-truth captions and used to predict style-based captions on a target dataset, which has only a limited number of images with style-based ground-truth captions. The main motivation of our approach comes from the fact that there are most often correlations among the low-level attributes and the higher-level styles of an apparel item. We leverage this fact and train our model in an encoder-decoder framework using an attention mechanism. In particular, the encoder of the model is first trained on the source dataset to obtain latent representations capturing the low-level attributes. The trained model is then fine-tuned to generate style-based captions for the target dataset. To highlight the effectiveness of our method, we qualitatively and quantitatively demonstrate that the captions generated by our approach are close to the actual style information for the evaluated apparel. A proof of concept (POC) for our model is under pilot at Myntra (www.myntra.com), where it is exposed to some internal users for feedback.


#20 Topological Machine Learning Methods for Power System Responses to Contingencies

Authors: Brian Bush, Yuzhou Chen, Dorcas Ofori-Boateng, Yulia R. Gel

While deep learning tools, coupled with the emerging machinery of topological data analysis, have proven to deliver various performance gains in a broad range of applications, from image classification to biosurveillance to blockchain fraud detection, their utility in areas of high societal importance, such as power system modeling and, particularly, resilience quantification in the energy sector, remains untapped. To provide fast-acting synthetic regulation and contingency reserve services to the grid while having minimal disruptions on customer quality of service, we propose a new topology-based system that relies on a neural network architecture for impact metric classification and prediction in power systems. This topology-based system allows one to evaluate the impact of three power system contingency types, involving transmission lines, transformers, and transmission lines combined with transformers. We show that the proposed neural network architecture, equipped with local topological measures, facilitates more accurate classification of unserved load as well as prediction of the amount of unserved load. In addition, we are able to learn more about the complex relationships between electrical properties and local topological measurements through their simulated responses to contingencies for the NREL-SIIP power system.
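
One way to picture "local topological measures" as classifier inputs: compute simple graph statistics around each failed component and train a small neural network on them. The graph, features, and labels below are synthetic, and the paper's topological measures are richer than these.

```python
# Illustrative sketch: local graph measures around a failed line as features
# for contingency-impact classification; network and labels are synthetic.
import networkx as nx
import numpy as np
from sklearn.neural_network import MLPClassifier

G = nx.random_regular_graph(3, 30, seed=5)        # stand-in transmission network
rng = np.random.default_rng(5)

def local_features(G, line):                      # features of a failed line (u, v)
    u, v = line
    return [G.degree(u), G.degree(v),
            nx.clustering(G, u), nx.clustering(G, v),
            len(list(nx.common_neighbors(G, u, v)))]

lines = list(G.edges)
X = np.array([local_features(G, e) for e in lines])
y = rng.integers(0, 2, len(lines))                # toy "unserved load" labels
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000).fit(X, y)
```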


#21 Data-Driven Multimodal Patrol Planning for Anti-poaching

Authors: Weizhe Chen, Weinan Zhang, Duo Liu, Weiping Li, Xiaojun Shi, Fei Fang

Wildlife poaching is threatening key species that play important roles in the ecosystem. With historical ranger patrol records, it is possible to provide data-driven predictions of poaching threats and plan patrols to combat poaching. However, the patrollers often patrol in a multimodal way, which combines driving and walking. It is a tedious task for the domain experts to manually plan such a patrol and as a result, the planned patrol routes are often far from optimal. In this paper, we propose a data-driven approach for multimodal patrol planning. We first use machine learning models to predict the poaching threats and then use a novel mixed-integer linear programming-based algorithm to plan the patrol route. In a field test focusing on the machine learning prediction result at Jilin Huangnihe National Nature Reserve (HNHR) in December 2019, the rangers found 42 snares, which is significantly higher than the historical record. Our offline experiments show that the resulting multimodal patrol routes can improve the efficiency of patrol and thus they can serve as the basis for future deployment in the field.
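
To ground the mixed-integer programming step, the toy model below selects patrol cells to maximize predicted threat coverage under a ranger time budget, with cheaper time costs on drivable cells. The paper's actual formulation plans connected multimodal routes, which this simplified knapsack omits; all numbers are invented.

```python
# Hedged MILP sketch with PuLP: pick patrol cells under a time budget
# (no route-connectivity constraints, unlike the paper's formulation).
import pulp

threat = {("a", 0.9), ("b", 0.7), ("c", 0.5), ("d", 0.3)}
time_cost = {"a": 3.0, "b": 1.0, "c": 2.0, "d": 0.5}    # hours; road cells cheaper

prob = pulp.LpProblem("patrol", pulp.LpMaximize)
x = {c: pulp.LpVariable(c, cat="Binary") for c, _ in threat}
prob += pulp.lpSum(t * x[c] for c, t in threat)          # expected snares found
prob += pulp.lpSum(time_cost[c] * x[c] for c in x) <= 4  # shift length
prob.solve(pulp.PULP_CBC_CMD(msg=0))
print("patrol cells:", [c for c in x if x[c].value() == 1])
```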


#22 Deepening the Sense of Touch in Planetary Exploration with Geometric and Topological Deep Learning

Authors: Yuzhou Chen, Yuliya Marchetti, Yulia R. Gel

Tactile and embedded sensing is a new concept that has recently appeared in the context of rovers and planetary exploration missions. Sensors such as pressure sensors integrated directly on wheels have the potential to add a "sense of touch" to exploratory vehicles. We investigate the utility of deep learning (DL), from conventional Convolutional Neural Networks (CNNs) to emerging geometric and topological DL, for terrain classification in planetary exploration, based on a novel dataset from an experimental tactile wheel concept. The dataset includes 2D conductivity images from a pressure sensor array that is wrapped around a rover wheel and is able to read pressure signatures of the ground beneath the wheel. Neither the newer nor the traditional DL tools have previously been applied to tactile sensing data. We discuss insights into the advantages and limitations of these methods for the analysis of non-traditional pressure images and their potential use in planetary surface science.


#23 Identification of Abnormal States in Videos of Ants Undergoing Social Phase Change

Authors: Taeyeong Choi, Benjamin Pyenson, Juergen Liebig, Theodore P. Pavlic

Biology is both an important application area and a source of motivation for the development of advanced machine learning techniques. Although much attention has been paid to large and complex data sets resulting from high-throughput sequencing, advances in high-quality video recording technology have begun to generate similarly rich data sets requiring sophisticated techniques from both computer vision and time-series analysis. Moreover, just as studying gene expression patterns in one organism can reveal general principles that apply to other organisms, the study of complex social interactions in an experimentally tractable model system, such as a laboratory ant colony, can provide general principles about the dynamics of other social groups. Here, we focus on one such example from the study of reproductive regulation in small laboratory colonies of more than 50 Harpegnathos ants. These ants can be artificially induced to begin a roughly 20-day process of hierarchy reformation. Although the conclusion of this process is conspicuous to a human observer, it remains unclear which behaviors during the transient period contribute to the process. To address this issue, we explore the application of One-class Classification (OC) to the detection of abnormal states in ant colonies for which behavioral data is available only for the normal societal conditions during training. Specifically, we build upon Deep Support Vector Data Description (DSVDD) and introduce the Inner-Outlier Generator (IO-GEN), which synthesizes fake "inner outlier" observations during training that are near the center of the DSVDD data description. We show that IO-GEN increases the reliability of the final OC classifier relative to other DSVDD baselines. This method can be used to screen video frames for which additional human observation is needed. Although we focus on an application with laboratory colonies of social insects, this approach may be applied to video data from other social systems, either to better understand the causal factors behind social phase transitions or even to predict the onset of future transitions.
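
The IO-GEN idea can be sketched as follows: synthesize "inner outlier" points near the DSVDD center and train the final one-class classifier to separate real normal embeddings from them. For simplicity, the fake samples below are Gaussian noise around the center; the actual IO-GEN is a learned generator operating on DSVDD embeddings.

```python
# Sketch of training a one-class classifier on normal embeddings vs. synthetic
# "inner outliers" near the data-description center (simplified IO-GEN idea).
import torch
import torch.nn.functional as F

center = torch.zeros(32)                           # fixed DSVDD-style center

def fake_inner_outliers(n, radius=0.1):            # stand-in for the generator
    return center + radius * torch.randn(n, 32)

disc = torch.nn.Linear(32, 1)                      # final one-class classifier
normal = center + torch.randn(64, 32)              # embedded normal observations
logits = disc(torch.cat([normal, fake_inner_outliers(64)]))
labels = torch.cat([torch.ones(64, 1), torch.zeros(64, 1)])
loss = F.binary_cross_entropy_with_logits(logits, labels)
loss.backward()
```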


#24 Shape-based Feature Engineering for Solar Flare Prediction

Authors: Varad Deshmukh, Thomas Berger, James Meiss, Elizabeth Bradley

Solar flares are caused by magnetic eruptions in active regions (ARs) on the surface of the Sun. These events can have significant impacts on human activity, many of which can be mitigated with enough advance warning from good forecasts. To date, machine learning-based flare-prediction methods have employed physics-based attributes of AR images as features; more recently, there has been some work that uses features deduced automatically by deep learning methods (such as convolutional neural networks). We describe a suite of novel shape-based features extracted from magnetogram images of the Sun using the tools of computational topology and computational geometry. We evaluate these features in the context of a multi-layer perceptron (MLP) neural network and compare their performance against the traditional physics-based attributes. We show that these abstract shape-based features outperform the features chosen by human experts, and that a combination of the two feature sets improves the forecasting capability even further.


#25 JEL: Applying End-to-End Neural Entity Linking in JPMorgan Chase

Authors: Wanying Ding, Vinay K. Chaudhri, Naren Chittar, Krishna Konakanchi

Knowledge graphs have emerged as a compelling abstraction for capturing key relationships among the entities of interest to enterprises and for integrating data from heterogeneous sources. JPMorgan Chase (JPMC) is leading this trend by leveraging knowledge graphs across the organization for multiple mission-critical applications such as risk assessment, fraud detection, and investment advice. A core problem in leveraging a knowledge graph is to link mentions (e.g., company names) that are encountered in textual sources to entities in the knowledge graph. Although several techniques exist for entity linking, they are tuned for entities that exist in Wikipedia and fail to generalize to the entities that are of interest to an enterprise. In this paper, we propose a novel end-to-end neural entity linking model (JEL) that uses minimal context information and a margin loss to generate entity embeddings, and a Wide & Deep Learning model to match character-level and semantic information, respectively. We show that JEL achieves state-of-the-art performance in linking mentions of company names in financial news with entities in our knowledge graph. We report on our efforts to deploy this model in a company-wide system to generate alerts in response to financial news. The methodology used for JEL is directly applicable and usable by other enterprises that need entity linking solutions for data unique to their respective situations.
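
A hedged sketch of margin-based entity-embedding training: score a mention against its gold entity and a random negative, and require the gold to win by a margin. The feature dimensions and encoders below are invented stand-ins for JEL's character and semantic channels.

```python
# Toy margin-loss training for entity linking (assumed setup, not JEL's):
# the gold entity must outscore a negative by a fixed margin.
import torch
import torch.nn.functional as F

emb_mention = torch.nn.Linear(500, 64)             # toy mention encoder
emb_entity = torch.nn.Embedding(10_000, 64)        # knowledge-graph entities

mention = torch.randn(32, 500)                     # hashed char/word features
pos = torch.randint(0, 10_000, (32,))              # gold entity ids
neg = torch.randint(0, 10_000, (32,))              # random negatives

m = F.normalize(emb_mention(mention), dim=1)
loss = F.margin_ranking_loss(
    (m * F.normalize(emb_entity(pos), dim=1)).sum(1),   # score of gold entity
    (m * F.normalize(emb_entity(neg), dim=1)).sum(1),   # score of negative
    torch.ones(32), margin=0.5)
loss.backward()
```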