With the rise of Intelligent Virtual Assistants (IVAs), there is a necessary rise in human effort to identify conversations containing misunderstood user inputs. These conversations uncover errors in natural language understanding and help prioritize and expedite improvements to the IVA. As human reviewer time is valuable and manual analysis is time consuming, prioritizing the conversations where misunderstanding has likely occurred reduces costs and speeds improvement. In addition, the fewer conversations humans review, the less user data is exposed, increasing privacy. We present a scalable system for automated conversation review that can identify potential miscommunications. Our system provides IVA designers with suggested actions to fix errors in IVA understanding, prioritizes areas of language model repair, and automates the review of conversations where desired. Verint - Next IT builds IVAs on behalf of other companies and organizations, and therefore analyzes large volumes of conversational data. Our review system has been in production for over three years; it saves our company roughly $1.5 million in annotation costs yearly and has shortened the refinement cycle of production IVAs. In this paper, the system design is discussed and performance in identifying errors in IVA understanding is compared to that of human reviewers.
We present a commercially deployed machine learning system that automates the day-ahead nomination of the expected grid loss for a Norwegian utility company. It handles several practical constraints and issues related to, among other things, delayed, missing, and incorrect data, as well as a small data set. The system incorporates a total of 24 different models that perform forecasts for three sub-grids. Each day, one model is selected to make the hourly day-ahead forecasts for each sub-grid. The deployed system reduced the MAE by 41%, from 3.68 MW to 2.17 MW per hour, from mid-July to mid-October. It is robust and reduces manual work.
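The abstract does not state how the daily model is chosen; the following is a minimal sketch assuming selection by lowest MAE on a recent backtest window, with all names hypothetical:

```python
import numpy as np

def select_model(candidates, X_recent, y_recent):
    """Pick the candidate whose forecasts had the lowest MAE on recent hours."""
    maes = [np.mean(np.abs(m.predict(X_recent) - y_recent)) for m in candidates]
    return candidates[int(np.argmin(maes))]

# Hypothetical daily routine, run once per sub-grid:
# best = select_model(models_for_subgrid, X_last_days, y_last_days)
# day_ahead_mw = best.predict(X_tomorrow)   # 24 hourly grid-loss forecasts
```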
Stickers are popularly used in messaging apps such as Hike to visually express a nuanced range of thoughts and utterances and to convey exaggerated emotions. However, discovering the right sticker from a large and ever-expanding pool of stickers while chatting can be cumbersome. In this paper, we describe a system for recommending stickers in real time as the user is typing, based on the context of the conversation. We decompose the sticker recommendation (SR) problem into two steps. First, we predict the message that the user is likely to send in the chat. Second, we substitute the predicted message with an appropriate sticker. The majority of Hike's messages are text transliterated from users' native languages into the Roman script, which leads to numerous orthographic variations of the same message and makes accurate message prediction challenging. To address this issue, we learn dense representations of chat messages using a character-level convolutional network in an unsupervised manner and use them to cluster messages that have the same meaning. In subsequent steps, we predict the message cluster instead of the message. Our approach does not depend on human-labelled data (except for validation), leading to a fully automatic update and tuning pipeline for the underlying models. We also propose a novel hybrid message prediction model that can run with low latency on low-end phones with severe computational limitations. The described system has been deployed for more than 6 months and is being used by millions of users, along with hundreds of thousands of expressive stickers.
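As a rough illustration of grouping orthographic variants before prediction, here is a minimal sketch; it substitutes character n-gram TF-IDF for the paper's unsupervised character-level CNN embeddings, and the example messages are invented:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Orthographic variants of the same transliterated message.
messages = ["kya kar raha hai", "kya kr rha h", "kya kar rha hain",
            "good morning", "gud mrng", "gd morning"]

# Character n-grams are robust to spelling variation; the deployed system
# instead learns dense representations with a character-level CNN.
vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
X = vec.fit_transform(messages)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
# Variants of the same utterance should land in the same cluster, which
# then becomes the prediction target instead of the raw message.
```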
In collaboration with Frontec, which produces parts such as bolts and nuts for the automobile industry, Kyung Hee University and Benple Inc. developed and deployed an AI system for the automatic quality inspection of weld nuts. Adopting AI on the factory floor involves various constraints, such as response time and the limited computing resources available. Our convolutional neural network (CNN) system, which operates on large-scale images, must classify weld nuts within 0.2 seconds with an accuracy above 95%. We designed Circular Hough Transform-based preprocessing and an adjusted VGG (Visual Geometry Group) model. The system achieved an accuracy above 99% and a response time of about 0.14 seconds. We use the TCP/IP protocol to connect the embedded classification system to an existing LabVIEW-based vision inspector. We suggest ways to develop and embed a deep learning framework in an existing manufacturing environment without hardware changes.
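A minimal sketch of the Circular Hough Transform preprocessing step using OpenCV; all thresholds and radii below are illustrative, not the deployed values:

```python
import cv2
import numpy as np

def crop_nut(gray_img):
    """Locate the circular weld nut and crop a centered patch for the CNN."""
    blurred = cv2.medianBlur(gray_img, 5)
    circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=1.2,
                               minDist=200, param1=100, param2=40,
                               minRadius=80, maxRadius=200)
    if circles is None:
        return None  # in practice, fall back to the full frame
    x, y, r = np.round(circles[0, 0]).astype(int)
    # Crop a square patch around the detected circle (rows are y, cols are x).
    return gray_img[max(y - r, 0):y + r, max(x - r, 0):x + r]
```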
Visual object detection is a computer vision-based artificial intelligence (AI) technique which has many practical applications (e.g., fire hazard monitoring). However, due to privacy concerns and the high cost of transmitting video data, it is highly challenging to build object detection models on centrally stored large training datasets following the current approach. Federated learning (FL) is a promising approach to resolve this challenge. Nevertheless, there is currently no easy-to-use tool that enables computer vision application developers who are not experts in federated learning to conveniently leverage this technology in their systems. In this paper, we report FedVision, a machine learning engineering platform to support the development of federated learning powered computer vision applications. The platform has been deployed through a collaboration between WeBank and Extreme Vision to help customers develop computer vision-based safety monitoring solutions in smart city applications. Over four months of usage, it has achieved significant efficiency improvements and cost reductions while removing the need to transmit sensitive data for three major corporate customers. To the best of our knowledge, this is the first real application of FL to computer vision-based tasks.
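The abstract does not detail FedVision's aggregation scheme; below is a minimal sketch of federated averaging (FedAvg), the canonical FL aggregation step, in plain NumPy:

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """FedAvg: weight each client's model parameters by its local data size.

    client_weights: list (per client) of lists of per-layer numpy arrays.
    client_sizes: number of local training samples per client.
    """
    total = sum(client_sizes)
    return [
        sum(w[layer] * (n / total) for w, n in zip(client_weights, client_sizes))
        for layer in range(len(client_weights[0]))
    ]

# Each round: clients train locally on private video frames and upload only
# parameters; the server aggregates, so raw footage never leaves the site.
```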
Today, most large-scale conversational AI agents such as Alexa, Siri, or Google Assistant are built using manually annotated data to train the different components of the system, including Automatic Speech Recognition (ASR), Natural Language Understanding (NLU), and Entity Resolution (ER). Typically, the accuracy of the machine learning models in these components is improved by manually transcribing and annotating data. As the scope of these systems increases to cover more scenarios and domains, manual annotation to improve the accuracy of these components becomes prohibitively costly and time consuming. In this paper, we propose a system that leverages customer/system interaction feedback signals to automate learning without any manual annotation. Users of these systems tend to modify a previous query in hopes of fixing an error in the previous turn to get the right results. These reformulations often follow defective experiences caused by errors in ASR, NLU, ER, or the application itself. In some cases, users may not properly formulate their requests (e.g., providing a partial title of a song), but aggregating across a wider pool of users and sessions reveals the underlying recurrent patterns. Our proposed self-learning system automatically detects errors, generates reformulations, and deploys fixes to the runtime system to correct different types of errors occurring in different components of the system. In particular, we propose leveraging an absorbing Markov Chain model as a collaborative filtering mechanism in a novel attempt to mine these patterns. We show that our approach is highly scalable and able to learn reformulations that reduce Alexa-user errors by pooling anonymized data across millions of customers. The proposed self-learning system achieves a win-loss ratio of 11.8 and effectively reduces the defect rate by more than 30% on utterance-level reformulations in our production A/B tests. To the best of our knowledge, this is the first self-learning large-scale conversational AI system in production.
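To make the absorbing Markov Chain idea concrete, here is a toy sketch with invented numbers: transient states are utterance forms, absorbing states are session outcomes, and absorption probabilities follow from the standard fundamental matrix:

```python
import numpy as np

# Transient states: observed utterance forms; absorbing states: session
# outcomes ("success", "abandon"). Transitions would be estimated from
# anonymized reformulation sessions; these values are illustrative only.
Q = np.array([[0.0, 0.6],     # e.g., "play dark night" -> "play dark knight"
              [0.0, 0.0]])    # transient-to-transient transitions
R = np.array([[0.1, 0.3],     # transient-to-absorbing transitions
              [0.9, 0.1]])    # columns: success, abandon

N = np.linalg.inv(np.eye(Q.shape[0]) - Q)  # fundamental matrix
B = N @ R  # B[i, j]: probability transient state i is absorbed in outcome j
# A form whose probability mass flows through a rewrite into "success"
# suggests deploying that rewrite as a runtime fix.
```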
The U.S. Navy is successfully using natural language processing (NLP) and common machine-learning (ML) algorithms to categorize and automatically route plain-text support requests at a Navy fleet support center. The algorithms enhance routine IT support tasks with automation and reduce the workload of service desk agents. The ML pipeline works in a five-step process. First, an archive of documents is created from various sources, including standard operating procedure (SOP) memos, frequently asked questions (FAQs), knowledge articles, Wikipedia articles, encyclopedia articles, previously closed support requests, and other relevant documents. Next, a library of words and phrases is generated from the archive. Then, this library is used to vectorize an incoming support request, producing a term frequency-inverse document frequency (TF-IDF) vector. After that, the TF-IDF vector is used to compute similarity scores between the support request and the documents in the previously created archive. Finally, the similarity scores are processed by support vector machine (SVM) classifiers to categorize and route the incoming support request to the correct support provider. This algorithm was deployed at a U.S. Navy customer support center as part of a pilot study, where it decreased the amount of time agents spend on tickets by 35%, the time required to assign tickets by 74%, and the time to close tickets by 60%. Our internal tests show that, with an error rate of 2%, a 35% reduction in ticket volume could be achieved by fully deploying these algorithms.
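A minimal scikit-learn sketch of the five-step pipeline; the archive snippets and provider labels are invented:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.svm import LinearSVC

# Steps 1-2: build the archive (SOPs, FAQs, closed tickets) and its vocabulary.
archive = [
    "reset expired password account locked",
    "vpn client connection failure remote access",
    "printer driver install queue error",
]
providers = ["identity", "network", "desktop"]  # owning support groups

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(archive)

# Steps 3-4: TF-IDF vector for the incoming request, then similarity scores
# against every archived document.
request = vectorizer.transform(["cannot connect to vpn from home office"])
scores = cosine_similarity(request, doc_vectors)

# Step 5: SVM classifiers consume the similarity-score features to route.
train_scores = cosine_similarity(doc_vectors, doc_vectors)
clf = LinearSVC().fit(train_scores, providers)
print(clf.predict(scores))  # expected to route to 'network'
```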
The technical support domain involves solving problems from user queries arriving through various channels (voice, web, and chat) and is both time-consuming and labour-intensive. The textual queries in web or chat mode are unstructured and often incomplete. This hampers information retrieval and makes the queries harder for agents to resolve. Such cases require multiple rounds of interaction between the user and the agent/chatbot in order to better understand the user query. This paper presents a deployed system called Question Quality Improvement (QQI) that aims to improve the quality of user utterances by understanding and extracting the important parts of an utterance and gamifying the user interface, prompting users to enter the remaining relevant information. QQI is guided by an ontology designed for the technical support domain and uses co-reference resolution and deep parsing to understand the sentences. Using the syntactic and semantic information in the deep parse tree, various attributes in the ontology are extracted. The system has been in production for over two years, supporting around 800 products and reducing the time to resolve cases by around 29%, leading to huge cost savings. As a core natural language understanding and metadata extraction technology, QQI directly affects more than 8K tickets every day; these cases are submitted after around 50K edits made in response to QQI feedback. QQI outputs are used by other technologies such as search and retrieval, case routing for automated dispatch, case-difficulty prediction, and the chatbots supported on each product page.
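A rough sketch of dependency-parse-driven attribute extraction with spaCy; QQI's deep parser and ontology are proprietary, so the example sentence, attribute names, and rules here are purely illustrative:

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # stand-in; the deployed deep parser differs

doc = nlp("My ThinkPad T490 won't boot after the BIOS update.")
# Walk the dependency tree to pull ontology attributes such as the affected
# component and the symptom; missing attributes would trigger a user prompt.
attrs = {}
for tok in doc:
    if tok.dep_ == "nsubj":
        attrs["component"] = " ".join(t.text for t in tok.subtree)
    if tok.pos_ == "VERB" and tok.dep_ == "ROOT":
        attrs["symptom"] = tok.lemma_
# e.g., attrs -> {"component": "My ThinkPad T490", "symptom": "boot"}
```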
Competitive analysis is a critical part of any business. Product managers, sellers, and marketers spend time and resources scouring through an immense amount of online and offline content, aiming to discover what their competitors are doing in the marketplace and what type of threat they pose to their business's financial well-being. Currently, this process is time- and labor-intensive, slow, and costly. This paper presents Clarity, a data-driven unsupervised system for the assessment of products, which is currently deployed at IBM, a large IT company. Clarity has been running for more than a year and is used by over 1,500 people to perform over 160 competitive analyses involving over 800 products. The system considers multiple factors drawn from a collection of online content: numeric ratings by online users, sentiments of reviews along key product performance dimensions, content volume, and recency of content. The results, and explanations of the factors leading to them, are visualized in an interactive dashboard that allows users to track their product's performance as well as understand the main contributing factors. Its efficacy has been tested in a series of cases across IBM's portfolio, which spans software, hardware, and services.
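Clarity's actual factor weighting is not disclosed; a toy composite score over the four stated factor families might look like the following, where every weight and normalization choice is an assumption:

```python
import numpy as np

def clarity_score(ratings, sentiment, volume, recency_days,
                  w=(0.4, 0.3, 0.2, 0.1)):
    """Toy composite product score; Clarity's real weighting is not public."""
    recency = np.exp(-np.asarray(recency_days) / 90.0)  # decay over ~3 months
    factors = [np.mean(ratings) / 5.0,                  # normalized star ratings
               (np.mean(sentiment) + 1) / 2.0,          # sentiment [-1,1] -> [0,1]
               min(volume / 1000.0, 1.0),               # capped content volume
               float(np.mean(recency))]                 # content freshness
    return float(np.dot(w, factors))

# clarity_score(ratings=[4, 5, 3], sentiment=[0.4, -0.1], volume=420,
#               recency_days=[3, 40, 200]) -> a single comparable score
```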
In large-scale search systems, the quality of the ranking results is continually improved with the introduction of more factors from complex procedures. Meanwhile, the increase in factors demands more computation resources and increases system response latency. It has been observed that, in certain contexts, a search instance may require only a small set of useful factors rather than all factors in order to return high-quality results; removing ineffective factors accordingly can therefore significantly improve system efficiency. In this paper, we report our experience incorporating our Contextual Factor Selection (CFS) approach into the Taobao e-commerce platform to optimize the selection of factors based on the context of each search query, simultaneously achieving high-quality search results while significantly reducing latency. The problem is treated as a combinatorial optimization problem that can be tackled through a sequential decision-making procedure, and is efficiently solved by CFS through a deep reinforcement learning method with reward shaping to address the reward signal scarcity and wide reward signal distribution found in real-world search engines. Through extensive offline experiments based on data from the Taobao.com platform, CFS is shown to significantly outperform state-of-the-art approaches. Online deployment on Taobao.com demonstrated that CFS is able to reduce average search latency by more than 40% compared to the previous approach, with a negligible reduction in search result quality. Under peak usage during the Singles' Day Shopping Festival (November 11th) in 2017, CFS reduced peak-load search latency by 33% compared to the previous approach, helping Taobao.com achieve 40% higher revenue than the same period in 2016.
Corrigendum: The spelling of coauthor Yusen Zhan in the paper "Accelerating Ranking in E-Commerce Search Engines through Contextual Factor Selection" has been corrected from Zan to Zhan; the original spelling was a typographical error.
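The deployed CFS method is a sequential deep RL procedure; the one-step REINFORCE sketch below only illustrates the shaped quality-versus-latency reward idea, with all dimensions and environment functions hypothetical:

```python
import torch
import torch.nn as nn

# A policy scores each ranking factor given the query context and samples a
# subset; the shaped reward trades off result quality against latency cost.
n_factors, ctx_dim = 20, 32
policy = nn.Sequential(nn.Linear(ctx_dim, 64), nn.ReLU(),
                       nn.Linear(64, n_factors))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def reinforce_step(context, quality_fn, latency_fn, lam=0.1):
    """quality_fn / latency_fn are black-box measurements returning floats."""
    probs = torch.sigmoid(policy(context))
    mask = torch.bernoulli(probs)                        # factors to compute
    reward = quality_fn(mask) - lam * latency_fn(mask)   # shaped reward
    log_p = (mask * probs.clamp_min(1e-6).log()
             + (1 - mask) * (1 - probs).clamp_min(1e-6).log()).sum()
    loss = -reward * log_p                               # REINFORCE gradient
    opt.zero_grad(); loss.backward(); opt.step()
```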
Electricity information tracking systems are increasingly being adopted across China. Such systems collect real-time power consumption data from users and provide opportunities for artificial intelligence (AI) to help power companies and authorities make optimal demand-side management decisions. In this paper, we discuss power utilization improvement in Shandong Province, China with a deployed AI application, the Power Intelligent Decision Support (PIDS) platform. Based on improved short-term power consumption gap prediction, PIDS generates an optimal power adjustment plan that enables fine-grained Demand Response (DR) and Orderly Power Utilization (OPU) recommendations, ensuring stable operation while minimizing power disruptions and improving fair treatment of participating companies. Deployed in August 2018, the platform is helping over 400 companies optimize their power consumption through DR while dynamically managing the OPU process for around 10,000 companies. Compared to the previous system, power outages under PIDS due to planned shutdowns have been reduced from 16% to 0.56%, resulting in significant gains in economic activity.
Autonomous biofeedback tools in support of rehabilitation patients are commonly built as multi-tier pipelines, where a segmentation algorithm is first responsible for isolating motion primitives, and classification is then performed on each primitive. In this paper, we present a novel segmentation technique that integrates on-the-fly qualitative classification of physical movements into the process. We adopt Long Short-Term Memory (LSTM) networks to model the temporal patterns of a streaming multivariate time series, obtained by sampling the acceleration and angular velocity of the limb in motion, and then aggregate the pointwise predictions over each isolated movement using different boosting methods. We tested our technique on a dataset of four common lower-limb rehabilitation exercises, collected from heterogeneous populations (clinical and healthy). Experimental results are promising and show that combining segmentation and classification of orthopaedic movements is a valid method with many potential real-world applications.
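A minimal PyTorch sketch of the pointwise-then-aggregate pattern; the layer sizes are illustrative, and the majority vote stands in for the boosting-style aggregators the paper compares:

```python
import torch
import torch.nn as nn

class PointwiseClassifier(nn.Module):
    """LSTM emitting a class score at every time step of the IMU stream."""
    def __init__(self, n_channels=6, hidden=64, n_classes=5):
        super().__init__()
        self.lstm = nn.LSTM(n_channels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):           # x: (batch, time, 6) accel + gyro samples
        out, _ = self.lstm(x)
        return self.head(out)       # (batch, time, n_classes)

def aggregate(logits):
    """Fuse pointwise predictions over one isolated movement by majority vote."""
    votes = logits.argmax(dim=-1)   # logits: (time, n_classes)
    return torch.mode(votes).values.item()
```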
Mistakes made by humans or machines commonly arise when managing ballots cast in an election. In the 2013 Australian Federal Election, for example, 1,370 West Australian Senate ballots were lost, eventually leading to a costly re-run of the election. Other mistakes include ballots that are misrecorded by electronic voting systems, and voters who cast invalid ballots or vote multiple times at different polling locations. We present a method for assessing whether such problems could have made a difference to the outcome of a Single Transferable Vote (STV) election, a complex system of preferential voting for multi-seat elections that is used widely in Australia, in Ireland, and in a range of local government elections in the United Kingdom and United States.
Earth and planetary sciences often rely upon the detailed examination of spectroscopic data for rock and mineral identification, which typically requires the collection of high-resolution spectroscopic measurements. However, such measurements tend to be scarce compared to low-resolution remote spectra. This work addresses the problem of inferring high-resolution mineral spectroscopic measurements from low-resolution observations using probability models. We present the Deep Gaussian Conditional Model, a neural network that performs probabilistic super-resolution via maximum likelihood estimation. It also provides insight into the learned correlations between measurements and spectroscopic features, allowing for the tractability and interpretability that scientists often require for mineral identification. Experiments using remote spectroscopic data demonstrate that our method compares favorably to other analogous probabilistic methods. Finally, we show and discuss how our method provides human-interpretable results, making it a compelling analysis tool for scientists.
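The model's exact architecture is not given in the abstract; here is a minimal sketch of probabilistic super-resolution by maximum likelihood, with a network predicting a diagonal Gaussian over the high-resolution channels (all sizes assumed):

```python
import torch
import torch.nn as nn

class GaussianHead(nn.Module):
    """Map a low-resolution spectrum to a Gaussian over high-res channels."""
    def __init__(self, lo_dim=32, hi_dim=256, hidden=128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(lo_dim, hidden), nn.ReLU())
        self.mean = nn.Linear(hidden, hi_dim)
        self.log_var = nn.Linear(hidden, hi_dim)  # diagonal covariance

    def forward(self, x_lo):
        h = self.body(x_lo)
        return self.mean(h), self.log_var(h)

def nll(mean, log_var, x_hi):
    """Gaussian negative log-likelihood; minimizing it is exactly MLE."""
    return 0.5 * (log_var + (x_hi - mean) ** 2 / log_var.exp()).sum(-1).mean()
```

The predicted variance per channel is what gives the method its probabilistic, uncertainty-aware character, and the learned weights expose which low-resolution bands inform which high-resolution features.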
Developing algorithms that identify potentially illegal trade shipments is a non-trivial task, exacerbated by the size of shipment data as well as the unavailability of positive training data. In collaboration with conservation organizations, we develop a framework that incorporates machine learning and domain knowledge to tackle this challenge. Modeling the task as anomaly detection, we propose a simple and effective embedding-based anomaly detection approach for categorical data that provides better performance and scalability than the current state-of-the-art, along with a negative sampling approach that can efficiently train the proposed model. Additionally, we show how our model aids the interpretability of results, which is crucial for this task. Domain knowledge, though sparse and scattered across multiple open data sources, is ingested with input from domain experts to create rules that highlight actionable results. The application framework demonstrates the applicability of our proposed approach on real-world trade data, and an accompanying interface presents a complete system that can ingest shipment data, detect anomalies, and aid in the analysis of suspicious timber trades.
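A rough sketch of embedding-based anomaly scoring for categorical shipment records with negative sampling; the pairwise-agreement score and the single-field corruption scheme are assumptions, not the paper's exact model:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShipmentScorer(nn.Module):
    """Score how plausible a combination of categorical shipment fields is."""
    def __init__(self, cardinalities, dim=16):
        super().__init__()
        self.embeds = nn.ModuleList(nn.Embedding(c, dim) for c in cardinalities)

    def forward(self, rows):            # rows: (batch, n_fields) category ids
        v = torch.stack([e(rows[:, i]) for i, e in enumerate(self.embeds)], 1)
        return (v @ v.transpose(1, 2)).mean(dim=(1, 2))  # field agreement

def ns_loss(model, real_rows, cardinalities):
    """Negative sampling: corrupt one random field to manufacture a fake row."""
    fake = real_rows.clone()
    col = torch.randint(len(cardinalities), (1,)).item()
    fake[:, col] = torch.randint(cardinalities[col], (len(fake),))
    return -(F.logsigmoid(model(real_rows))
             + F.logsigmoid(-model(fake))).mean()

# At inference time, low scores flag anomalous (potentially illegal) shipments,
# and per-field embedding disagreement hints at *why* a record looks odd.
```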
In a world where autonomous cars are becoming increasingly common, creating an adequate infrastructure for this new technology is essential. This includes building and labeling high-definition (HD) maps accurately and efficiently. Today, the process of creating HD maps requires substantial human input, which is time-consuming and prone to errors. In this paper, we propose a novel method capable of generating labelled HD maps from raw sensor data. We implemented and tested our method on several urban scenarios using data collected from our test vehicle. The results show that the proposed deep learning-based method can produce highly accurate HD maps. This approach speeds up the process of building and labeling HD maps, which can make a meaningful contribution to the deployment of autonomous vehicles.
We propose a new method for solving Initial Value Problems (IVPs). Our method is based on analog computing and has the potential to almost eliminate the switching time inherent in traditional digital computing. The approach can be used to simulate large systems longer, faster, and with higher accuracy. Many algorithms for Model-Based Diagnosis use numerical integration to simulate physical systems, and the numerical integration process is often either computationally expensive or imprecise. We propose a new method, based on Field-Programmable Analog Arrays (FPAAs), that has the potential to overcome many of these practical problems. We envision a software/hardware framework for solving systems of simultaneous Ordinary Differential Equations (ODEs) in a fraction of the time of traditional numerical algorithms. In this paper, we describe the solution of an IVP with the help of an Analog Computing Unit (ACU). To do this, we build a special calculus based on operational amplifiers (op-amps) with local feedback. We discuss the implementation of the ACU on an Integrated Circuit (IC), analyze the working of the IC, and simulate the dynamic Lotka-Volterra system with the de-facto standard tool for electrical simulation: SPICE.
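For reference, here is the digital baseline: the Lotka-Volterra system integrated numerically with SciPy (coefficients illustrative). This is exactly the numerical-integration workload the op-amp ACU is meant to off-load:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Lotka-Volterra predator-prey system; a, b, c, d are illustrative values.
a, b, c, d = 1.1, 0.4, 0.1, 0.4

def lotka_volterra(t, z):
    x, y = z                                   # prey, predator populations
    return [a * x - b * x * y, c * x * y - d * y]

sol = solve_ivp(lotka_volterra, t_span=(0, 50), y0=[10.0, 5.0],
                rtol=1e-8, atol=1e-8, dense_output=True)
# Numerical integration like this is what the analog approach replaces:
# integrators become capacitors and multiplications become analog multipliers,
# so the whole trajectory evolves in continuous time on the IC.
```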
Although deep learning for Diabetic Retinopathy (DR) screening has shown great success in achieving clinically acceptable accuracy for referable versus non-referable DR, there remains a need to provide more fine-grained grading of the DR severity level as well as automated segmentation of lesions (if any) in the retina images. We observe that the DR severity level of an image depends on the presence of different types of lesions and their prevalence. In this work, we adopt a multi-task learning approach to perform the DR grading and lesion segmentation tasks. In light of the lack of lesion segmentation mask ground truths, we further propose a semi-supervised learning process to obtain the segmentation masks for the various datasets. Experimental results on publicly available datasets and a real-world dataset obtained from population screening demonstrate the effectiveness of the multi-task solution over state-of-the-art networks.
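A minimal sketch of the joint objective, assuming a shared backbone with a grading head and a lesion-segmentation head; the masking over images lacking ground-truth masks reflects the semi-supervised setup, and the weighting alpha is an assumption:

```python
import torch
import torch.nn.functional as F

def multitask_loss(grade_logits, grade_labels, seg_logits, seg_masks,
                   has_mask, alpha=1.0):
    """Joint DR-grading + lesion-segmentation objective.

    has_mask flags which images carry segmentation ground truth; the
    semi-supervised step fills in pseudo-masks for the rest over time.
    """
    cls_loss = F.cross_entropy(grade_logits, grade_labels)
    if has_mask.any():
        seg_loss = F.binary_cross_entropy_with_logits(
            seg_logits[has_mask], seg_masks[has_mask])
    else:
        seg_loss = torch.zeros((), device=grade_logits.device)
    return cls_loss + alpha * seg_loss
```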
Firms implementing digital advertising campaigns face a complex problem in determining the right match between their advertising creatives and target audiences. Typical solutions to the problem have leveraged non-experimental methods, or used “split-testing” strategies that do not explicitly address the complexities induced by targeted audiences that can potentially overlap with one another. This paper presents an adaptive algorithm that addresses the problem via online experimentation. The algorithm is set up as a contextual bandit and addresses the overlap issue by partitioning the target audiences into disjoint, non-overlapping sub-populations. It learns an optimal creative display policy in the disjoint space, while assessing in parallel which creative has the best match in the space of possibly overlapping target audiences. Experiments show that the proposed method is more efficient than naive “split-testing” or non-adaptive “A/B/n” testing methods. We also describe a testing product we built that uses the algorithm. The product is currently deployed on the advertising platform of JD.com, an eCommerce company and a publisher of digital ads in China.
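The abstract does not pin down the bandit policy; below is a minimal sketch of one workable choice, Beta-Bernoulli Thompson sampling run independently per disjoint sub-population:

```python
import numpy as np

class DisjointThompson:
    """One Beta-Bernoulli Thompson-sampling bandit per disjoint segment.

    Overlapping target audiences are first partitioned into disjoint segments;
    each segment then independently learns which creative converts best.
    """
    def __init__(self, n_segments, n_creatives):
        self.wins = np.ones((n_segments, n_creatives))    # Beta(1, 1) priors
        self.losses = np.ones((n_segments, n_creatives))

    def choose(self, segment):
        samples = np.random.beta(self.wins[segment], self.losses[segment])
        return int(np.argmax(samples))                    # creative to show

    def update(self, segment, creative, clicked):
        if clicked:
            self.wins[segment, creative] += 1
        else:
            self.losses[segment, creative] += 1
```

Segment-level estimates can then be re-aggregated to score each creative over the original, possibly overlapping audiences.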
The Electrocardiogram (ECG) is performed routinely by medical personnel to identify structural, functional, and electrical cardiac events. Many attempts have been made to automate this task using machine learning algorithms, including numerous supervised learning algorithms that require manual feature extraction. More recently, deep neural networks have been proposed for this task, reaching state-of-the-art results. The ECG signal conveys the specific electrical cardiac activity of each subject; thus, extreme variations are observed between patients. These variations, together with the small amount of training data available for each arrhythmia, are challenging for deep learning algorithms and impede generalization. In this work, we study the use of generative adversarial networks for the synthesis of ECG signals, which can then be used as additional training data to improve classifier performance. Empirical results show that the generated signals significantly improve ECG classification.
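A minimal 1D GAN training sketch in PyTorch; beat length, noise dimension, and architectures are invented, and a production setup would condition on the arrhythmia class so rare classes can be over-sampled:

```python
import torch
import torch.nn as nn

# Generator maps noise to a fixed-length normalized heartbeat; discriminator
# judges real vs. synthetic beats.
G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 400), nn.Tanh())
D = nn.Sequential(nn.Linear(400, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_beats):                 # (batch, 400) normalized ECG beats
    batch = len(real_beats)
    fake = G(torch.randn(batch, 64))
    # Discriminator: push real toward 1, synthetic toward 0.
    d_loss = (bce(D(real_beats), torch.ones(batch, 1))
              + bce(D(fake.detach()), torch.zeros(batch, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: make the (freshly updated) discriminator call the fake real.
    g_loss = bce(D(fake), torch.ones(batch, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```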
This paper presents a job recommender system to match resumes to job descriptions (JDs), both of which are non-standard and unstructured or semi-structured in form. First, the paper proposes a combination of natural language processing (NLP) techniques for the task of skill extraction; on an industrial-scale dataset, the combined techniques yielded a precision of 0.78 and a recall of 0.88. The paper then introduces the concept of extracting implicit skills, i.e., skills which are not explicitly mentioned in a JD but may be implicit in the context of geography, industry, or role. To mine and infer the implicit skills for a JD, we find other JDs similar to it. This similarity match is done in a semantic space: a Doc2Vec model is trained on 1.1 million JDs crawled from the web and covering several domains, and all JDs are projected onto this semantic space. The skills absent from the JD but present in similar JDs are collected and weighted using several techniques to obtain the set of final implicit skills. Finally, several similarity measures are explored to match the skills extracted from a candidate's resume to the explicit and implicit skills of JDs. Empirical results for matching resumes and JDs demonstrate that the proposed approach achieves a mean reciprocal rank of 0.88, a 29.4% improvement over a baseline method that uses only explicit skills.
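A minimal gensim sketch of the semantic-space step; the toy JDs and hyperparameters are illustrative, not the production configuration trained on 1.1 million crawled JDs:

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Corpus: tokenized job descriptions, each tagged with its JD id.
corpus = [TaggedDocument(words=jd.lower().split(), tags=[i])
          for i, jd in enumerate(["python backend engineer aws docker",
                                  "java spring microservices kubernetes",
                                  "python data engineer spark airflow"])]

model = Doc2Vec(vector_size=100, min_count=1, epochs=40)
model.build_vocab(corpus)
model.train(corpus, total_examples=model.corpus_count, epochs=model.epochs)

# Implicit skills: look up a JD's nearest neighbours in the semantic space
# and collect skills present in the neighbours but absent from the JD itself.
neighbours = model.dv.most_similar(0, topn=2)
```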
Farmer suicides have become an urgent social problem which governments around the world are trying hard to solve. Most farmers are driven to suicide by an inability to sell their produce at desired profit levels, caused by widespread uncertainty and fluctuation in produce prices resulting from varying market conditions. To help prevent farmer suicides, this paper takes a first step towards resolving produce price uncertainty by presenting PECAD, a deep learning algorithm for accurate prediction of future produce prices based on past pricing and volume patterns. While previous work presents machine learning algorithms for the prediction of produce prices, it suffers from two limitations: (i) it does not explicitly consider the spatio-temporal dependence of future prices on past data; and as a result, (ii) it relies on classical ML prediction models which often perform poorly when applied to spatio-temporal datasets. PECAD addresses these limitations via three major contributions: (i) we gather real-world daily price and (produced) volume data for different crops over a period of 11 years from an official Indian government-administered website; (ii) we pre-process this raw dataset via state-of-the-art imputation techniques to account for missing data entries; and (iii) PECAD proposes a novel wide and deep neural network architecture which consists of two separate convolutional neural network models (trained on pricing and volume data, respectively). Our simulation results show that PECAD outperforms existing state-of-the-art baseline methods by achieving a significantly lower root mean squared error (RMSE); PECAD achieves a ∼25% lower coefficient of variation than state-of-the-art baselines. Our work is done in collaboration with a non-profit agency that works on preventing farmer suicides in the Indian state of Jharkhand, and PECAD is currently being reviewed by them for potential deployment.
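The abstract describes PECAD as a pair of CNNs over price and volume histories; a minimal sketch under assumed window and horizon sizes (the real architecture's depths and widths are not given here):

```python
import torch
import torch.nn as nn

class PECADSketch(nn.Module):
    """Two parallel 1D CNNs over past price and volume series, fused into
    a joint price forecast. All sizes are illustrative assumptions."""
    def __init__(self, window=60, horizon=7):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
                nn.Conv1d(16, 16, kernel_size=5, padding=2), nn.ReLU(),
                nn.AdaptiveAvgPool1d(1), nn.Flatten())
        self.price_cnn, self.volume_cnn = branch(), branch()
        self.head = nn.Linear(32, horizon)

    def forward(self, price, volume):   # each: (batch, 1, window) daily series
        h = torch.cat([self.price_cnn(price), self.volume_cnn(volume)], dim=1)
        return self.head(h)             # next `horizon` daily prices
```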
Over a century separates the initial installation of lead service laterals from the federal regulation of lead in drinking water. As such, municipalities often lack adequate information describing installations of lead plumbing, and face challenges such as reducing exposure to lead in drinking water, spreading scarce resources for gathering information, adopting short-term protection measures (e.g., providing filters), and developing longer-term prevention strategies (e.g., replacing lead laterals). Given the spatial and temporal patterns across properties, machine learning is seen as a useful tool for reducing uncertainty in the decisions authorities make when addressing lead in water. The Pittsburgh Water and Sewer Authority (PWSA) is currently addressing these challenges in Pittsburgh, and this paper describes the development and application of a model predicting high tap water lead concentrations (>15 ppb) for PWSA customers. The model was developed using spatial cross validation to support PWSA's interest in applying predictions in areas without training data. The model's AUROC is 71.6%, and it relies primarily on publicly available property tax assessment data and indicators of lateral material collected by PWSA as they meet regulatory requirements.
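A minimal sketch of spatial cross validation with scikit-learn, using k-means zones as spatial groups so test parcels never share a zone with training parcels; the data here is synthetic and the classifier choice is an assumption:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)
coords = rng.uniform(size=(1000, 2))      # parcel lat/lon (toy)
X = rng.normal(size=(1000, 8))            # tax-assessment features (toy)
y = rng.integers(0, 2, size=1000)         # > 15 ppb lead at the tap?

# Group parcels into spatial zones; GroupKFold keeps each zone entirely in
# train or test, mimicking prediction in areas with no training data.
zones = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(coords)
aucs = []
for tr, te in GroupKFold(n_splits=5).split(X, y, groups=zones):
    clf = GradientBoostingClassifier().fit(X[tr], y[tr])
    aucs.append(roc_auc_score(y[te], clf.predict_proba(X[te])[:, 1]))
print(np.mean(aucs))
```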
Cooking recipes play an important role in promoting a healthy lifestyle, and a vast number of user-generated recipes are currently available on the Internet. Allied to this growth in the amount of information is an increase in the number of studies on the use of such data for recipe analysis, recipe generation, and recipe search. However, there have been few attempts to estimate the number of calories per serving in a recipe. This study considers this task and introduces two challenging subtasks: ingredient normalization and serving estimation. The ingredient normalization task aims to convert ingredients as written in a recipe (e.g., a Japanese phrase meaning "sesame oil (for finishing)") into their canonical forms (e.g., "sesame oil") so that their calorific content can be looked up in an ingredient dictionary. The serving estimation task aims to convert the amount written in the recipe (e.g., "N pieces") into a number of servings (e.g., "serves M"), thus enabling the calories per serving to be calculated. We apply machine learning-based methods to these tasks and describe their practical deployment in Cookpad, the largest recipe service in the world. A series of experiments demonstrates that the performance of our methods is sufficient for use in real-world services.
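A toy rule-based sketch of the two subtasks; the deployed Cookpad system uses learned models, and the dictionary and patterns here are invented:

```python
import re

# Toy normalization dictionary: cleaned surface form -> canonical ingredient.
CANONICAL = {"sesame oil": "sesame oil", "soy sauce": "soy sauce"}

def normalize(ingredient):
    """Strip qualifiers like '(for finishing)', then look up the canonical form."""
    base = re.sub(r"\(.*?\)", "", ingredient).strip().lower()
    return CANONICAL.get(base)   # None -> hand off to the learned normalizer

def servings(amount_text):
    """Map an amount expression to a serving count, e.g. 'serves 4' -> 4."""
    m = re.search(r"(\d+)", amount_text)
    return int(m.group(1)) if m else None

# normalize("Sesame oil (for finishing)") -> "sesame oil"
# servings("serves 4") -> 4; calories per serving = total calories / 4
```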
A wealth of medical knowledge has been encoded in terminologies like SNOMED CT, NCI, and FMA. However, these resources usually lack information such as relations between diseases, symptoms, and risk factors, which prevents their use in diagnostic and other decision-making applications. In this paper, we present a pipeline for extracting such information from unstructured text and enriching medical knowledge bases. Our approach uses Semantic Role Labelling (SRL) and is unsupervised. We show how we dealt with several deficiencies of SRL-based extraction, such as copula verbs, relations expressed through nouns, and assigning scores to extracted triples. The system has so far extracted about 120K relations, and in-house doctors have verified about 5K of them. We compared the output of the system with a manually constructed network of diseases, symptoms, and risk factors built by doctors over the course of a year. Our results show that our pipeline extracts good-quality, precise relations and speeds up the knowledge acquisition process considerably.
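A toy sketch of turning an SRL frame into a knowledge-base triple, including the noun-mediated "risk of" case mentioned above; the frame format and relation names are hypothetical:

```python
# Hypothetical SRL output for: "Obesity increases the risk of diabetes."
frame = {"verb": "increase",
         "ARG0": "obesity", "ARG1": "the risk of diabetes"}

def frame_to_triple(frame):
    """Map a predicate-argument frame to a (subject, relation, object) triple,
    unpacking noun-mediated relations such as 'risk of'."""
    obj = frame["ARG1"]
    if obj.startswith("the risk of "):
        return (frame["ARG0"], "risk_factor_of", obj[len("the risk of "):])
    return (frame["ARG0"], frame["verb"], obj)

# frame_to_triple(frame) -> ('obesity', 'risk_factor_of', 'diabetes')
```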