The heavy burdens of computation and off-chip traffic impede the deployment of large-scale convolutional neural networks on embedded platforms. Since CNNs exhibit strong tolerance to computation errors, employing block floating point (BFP) arithmetic in CNN accelerators can efficiently reduce hardware cost and data traffic while maintaining classification accuracy. In this paper, we verify the effects of BFP word-width choices on CNN performance without retraining. Several typical CNN models, including VGG16, ResNet-18, ResNet-50, and GoogLeNet, were tested. Experiments revealed that an 8-bit mantissa, including the sign bit, in the BFP representation induced less than 0.3% accuracy loss. In addition, we analyze the computational errors in theory and derive an upper bound on the noise-to-signal ratio (NSR), which provides practical guidance for the design of BFP-based CNN engines.
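To make the BFP idea concrete, here is a minimal numpy sketch of block floating point quantization: all values in a block share one exponent, and each value keeps a short signed mantissa. The function name, block size, and rounding scheme are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def bfp_quantize(block, mantissa_bits=8):
    """Quantize a block of values to block floating point: one shared exponent,
    one signed mantissa of `mantissa_bits` (sign bit included) per value."""
    max_abs = np.max(np.abs(block))
    if max_abs == 0:
        return np.zeros_like(block)
    shared_exp = int(np.ceil(np.log2(max_abs)))       # exponent shared by the block
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    half = 2 ** (mantissa_bits - 1)
    mantissas = np.clip(np.round(block / scale), -half + 1, half - 1)
    return mantissas * scale

x = np.random.randn(64).astype(np.float32)
x_q = bfp_quantize(x, mantissa_bits=8)
print("NSR:", np.sum((x - x_q) ** 2) / np.sum(x ** 2))  # noise-to-signal ratio
```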
Multi-Entity Dependence Learning (MEDL) explores conditional correlations among multiple entities. The availability of rich contextual information calls for a nimble learning scheme that tightly integrates with deep neural networks and can capture correlation structures among exponentially many outcomes. We propose MEDL_CVAE, which encodes a conditional multivariate distribution as a generating process. As a result, the variational lower bound of the joint likelihood can be optimized via a conditional variational auto-encoder and trained end-to-end on GPUs. Our MEDL_CVAE was motivated by two real-world applications in computational sustainability: one studies the spatial correlation among multiple bird species using the eBird data, and the other models multi-dimensional landscape composition and human footprint in the Amazon rainforest with satellite images. We show that MEDL_CVAE captures rich dependency structures, scales better than previous methods, and further improves the joint likelihood by taking advantage of very large datasets that are beyond the capacity of previous methods.
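As a rough illustration of the objective MEDL_CVAE optimizes, the following numpy sketch computes a single-sample conditional ELBO for a toy linear encoder/decoder with a Bernoulli likelihood. All dimensions, weights, and the linear parameterization are invented for illustration; the paper uses deep networks.

```python
import numpy as np

rng = np.random.default_rng(0)
d_x, d_y, d_z = 16, 8, 4   # context, outcome, latent dims (illustrative)
W_mu = rng.normal(scale=0.1, size=(d_z, d_x + d_y))
W_logvar = rng.normal(scale=0.1, size=(d_z, d_x + d_y))
W_dec = rng.normal(scale=0.1, size=(d_y, d_x + d_z))

def elbo(x, y):
    """Single-sample conditional ELBO: E_q[log p(y|z,x)] - KL(q(z|x,y) || N(0,I))."""
    h = np.concatenate([x, y])
    mu, logvar = W_mu @ h, W_logvar @ h                       # encoder q(z | x, y)
    z = mu + np.exp(0.5 * logvar) * rng.normal(size=d_z)      # reparameterization
    logits = W_dec @ np.concatenate([x, z])                   # decoder p(y | z, x)
    log_lik = np.sum(y * logits - np.logaddexp(0.0, logits))  # Bernoulli log-lik
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)  # closed-form KL
    return log_lik - kl

x = rng.normal(size=d_x)
y = rng.integers(0, 2, size=d_y).astype(float)   # multivariate binary outcome
print("ELBO:", elbo(x, y))
```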
Peer grading, in which students evaluate one another's assignments, offers a promising solution for scaling evaluation and learning to large educational systems. A key challenge in peer grading is motivating peers to grade diligently. While existing spot-checking (SC) mechanisms can prevent collusion, where peers coordinate to report uninformative grades, they unrealistically assume that all peers have the same grading reliability and cost. This paper studies the general Optimal Spot-Checking (OptSC) problem of determining the probability with which each assignment is checked so as to maximize the accuracy of the evaluations aggregated from peer grades, taking into consideration 1) peers' heterogeneous characteristics, and 2) peers' strategic grading behaviors as they maximize their own utility. We prove that the bilevel OptSC problem is NP-hard. By exploiting peers' grading behaviors, we first formulate a single-level relaxation that approximates OptSC. By further exploiting structural properties of the relaxed problem, we propose an efficient algorithm for it, which in turn gives a good approximation of the original OptSC. Extensive experiments on both synthetic and real datasets show significant advantages of the proposed algorithm over existing approaches.
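The paper's single-level relaxation is not reproduced here; as a loudly hypothetical stand-in, the sketch below greedily allocates spot-check probability under a budget, assuming each assignment's accuracy gain is a known concave function of its check probability.

```python
import numpy as np

def allocate_checks(gains, budget, step=0.01):
    """Greedily spread spot-check probability under sum(p_i) <= budget by always
    adding probability where the marginal accuracy gain is largest. Valid when
    each gains[i] is concave; an illustrative stand-in for the paper's method."""
    p = np.zeros(len(gains))
    spent = 0.0
    while spent + step <= budget:
        candidates = [i for i in range(len(gains)) if p[i] + step <= 1.0]
        if not candidates:
            break
        marg = {i: gains[i](p[i] + step) - gains[i](p[i]) for i in candidates}
        best = max(marg, key=marg.get)
        if marg[best] <= 0:
            break
        p[best] += step
        spent += step
    return p

# Toy heterogeneity: reliable peers (larger k) need less checking to saturate.
gains = [lambda p, k=k: 1.0 - np.exp(-k * p) for k in (1.0, 3.0, 0.5)]
print(allocate_checks(gains, budget=1.0).round(2))
```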
To alleviate traffic congestion in urban areas, electronic toll collection (ETC) systems have been deployed all over the world. Despite their merits, tolls are usually pre-determined and fixed from day to day, so they fail to account for traffic dynamics and have limited regulation effect when traffic conditions are abnormal. In this paper, we propose a novel dynamic ETC (DyETC) scheme that adjusts tolls to traffic conditions in real time. The DyETC problem is formulated as a Markov decision process (MDP), which is very challenging to solve due to its 1) multi-dimensional state space, 2) multi-dimensional, continuous, and bounded action space, and 3) time-dependent state and action values. Because of this complexity, existing methods cannot be applied to our problem. We therefore develop a novel algorithm, PG-beta, which makes three improvements to the traditional policy gradient method: 1) time-dependent value and policy functions, 2) a Beta-distribution policy function, and 3) state abstraction. Experimental results show that, compared with existing ETC schemes, DyETC increases traffic volume by around 8% and reduces travel time by around 14.6% during rush hour. Considering the total traffic volume in a traffic network, this amounts to a substantial increase in social welfare.
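Of the three improvements, the Beta-distribution policy is the easiest to illustrate: a Beta variate lies in [0, 1], so it can be rescaled to any bounded toll range. Below is a minimal REINFORCE-style sketch with a toy reward; the reward shape, learning rate, and baseline are invented, and PG-beta itself is more elaborate.

```python
import numpy as np
from scipy.special import digamma

def grad_log_pi(a, b, u):
    """Score function: d log Beta(u; a, b) / d(a, b), used by REINFORCE."""
    common = digamma(a + b)
    return np.array([np.log(u) - digamma(a) + common,
                     np.log(1.0 - u) - digamma(b) + common])

rng = np.random.default_rng(0)
a, b, lr, baseline = 2.0, 2.0, 0.05, 0.0
low, high = 0.0, 5.0                      # bounded toll range
for _ in range(2000):
    u = rng.beta(a, b)                    # Beta sample in (0, 1)
    toll = low + u * (high - low)         # rescale to the action bounds
    reward = -(toll - 3.0) ** 2           # toy reward, peaks at toll = 3
    a, b = np.maximum([a, b] + lr * (reward - baseline) * grad_log_pi(a, b, u), 1.01)
    baseline += 0.05 * (reward - baseline)
print("mean toll after training:", low + a / (a + b) * (high - low))
```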
Big human mobility data are continuously generated from a variety of sources, some of which can be treated as streaming data for understanding and predicting urban dynamics. With such streaming mobility data, online prediction of short-term human mobility at the city level can be of great significance for transportation scheduling, urban regulation, and emergency management. In particular, when rare events or disasters such as large earthquakes or severe traffic accidents happen, people deviate from their routine activities, and their movements become almost uncorrelated with their past movements. Therefore, in this study, we build an online system called DeepUrbanMomentum that performs short-term mobility prediction using only the limited steps of currently observed human mobility data. A deep-learning architecture built with recurrent neural networks is designed to effectively model these highly complex sequential data over a huge urban area. Experimental results demonstrate the superior performance of our proposed model compared to existing approaches. Finally, we apply our system to a real emergency scenario and demonstrate that it is applicable in the real world.
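A minimal sketch of the architectural idea, assuming discretized city regions and a vanilla (untrained) recurrent cell; DeepUrbanMomentum's actual network is deeper and trained on real trajectories.

```python
import numpy as np

rng = np.random.default_rng(0)
n_regions, d_h = 10, 32          # discretized city regions, hidden size (illustrative)
Wxh = rng.normal(scale=0.1, size=(d_h, n_regions))
Whh = rng.normal(scale=0.1, size=(d_h, d_h))
Why = rng.normal(scale=0.1, size=(n_regions, d_h))

def predict_next(observed_regions):
    """Run a vanilla RNN over the (short) observed trajectory and return a
    distribution over the next region. Weights here are untrained."""
    h = np.zeros(d_h)
    for r in observed_regions:
        x = np.eye(n_regions)[r]                 # one-hot current region
        h = np.tanh(Wxh @ x + Whh @ h)
    logits = Why @ h
    return np.exp(logits) / np.sum(np.exp(logits))  # softmax over next region

print(predict_next([3, 3, 4, 5]))  # limited steps of observed movement
```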
Modern mobile networks are facing unprecedented growth in demand due to a new class of traffic from Internet of Things (IoT) devices such as smart wearables and autonomous cars. Future networks must schedule delay-tolerant software updates, data backups, and other transfers from IoT devices while maintaining strict service guarantees for conventional real-time applications such as voice calling and video. This problem is extremely challenging because conventional traffic is highly dynamic across space and time, so its performance is significantly impacted if all IoT traffic is scheduled immediately when it originates. In this paper, we present a reinforcement learning (RL) based scheduler that can dynamically adapt to traffic variation, and to various reward functions set by network operators, to optimally schedule IoT traffic. Using 4 weeks of real network data from downtown Melbourne, Australia, spanning diverse traffic patterns, we demonstrate that our RL scheduler enables mobile networks to carry 14.7% more data with minimal impact on existing traffic, and outperforms heuristic schedulers by more than 2x. Our work is a valuable step towards designing autonomous, "self-driving" networks that learn to manage themselves from past data.
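As a toy illustration of RL-based scheduling, the sketch below uses tabular Q-learning to decide when to send versus defer IoT traffic given a discretized cell-load level. The reward function, load dynamics, and discretization are invented stand-ins for the operator-set rewards and real traffic in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_load_levels, n_actions = 5, 2        # discretized cell load; 0 = defer, 1 = send
Q = np.zeros((n_load_levels, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1

def reward(load, action):
    # Toy operator-set reward: sending earns throughput but is penalized
    # in proportion to how loaded the cell currently is.
    return (1.0 - 2.0 * load / (n_load_levels - 1)) if action == 1 else -0.05

load = 0
for _ in range(20000):
    a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[load]))
    r = reward(load, a)
    next_load = rng.integers(n_load_levels)    # toy: load varies exogenously
    Q[load, a] += alpha * (r + gamma * np.max(Q[next_load]) - Q[load, a])
    load = next_load

print("send at load level?", [int(np.argmax(Q[l])) for l in range(n_load_levels)])
```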
Cascades represent rapid changes in networks. A cascading phenomenon of ecological and economic impact is the spread of invasive species in geographic landscapes. The most promising management strategy is often biocontrol: introducing a natural predator able to control the invading population, a setting that can be treated as two interacting cascades of predator and prey populations. We formulate and study a nonlinear problem of optimal biocontrol: seeding the predator cascade over time so as to minimize the harmful prey population. The recurring budgets that conservation organizations typically face naturally lead to sparsity constraints, which make the problem amenable to approximation algorithms. Available methods based on continuous relaxations scale poorly; to remedy this, we develop a novel and scalable randomized algorithm based on a width relaxation, applicable to a broad class of combinatorial optimization problems. We evaluate our contributions in the context of biocontrol for the insect pest Hemlock Woolly Adelgid (HWA) in eastern North America. Our algorithm outperforms competing methods in terms of scalability and solution quality and finds near-optimal strategies for the control of the HWA on fine-grained networks -- an important problem in computational sustainability.
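The width-relaxation algorithm itself is not reproduced here; as an illustrative stand-in for its final rounding step, the sketch below randomly rounds a fractional predator-seeding plan while enforcing the per-timestep budget.

```python
import numpy as np

def round_seeding(frac_plan, budgets, rng):
    """Round a fractional predator-seeding plan (time x node matrix, entries in
    [0, 1]) to a 0/1 plan: sample each node with its fractional probability,
    then trim to the per-timestep budget. Illustrative randomized rounding only."""
    T, n = frac_plan.shape
    plan = np.zeros((T, n), dtype=int)
    for t in range(T):
        picks = np.flatnonzero(rng.random(n) < frac_plan[t])
        if len(picks) > budgets[t]:                  # enforce the sparse budget
            picks = rng.choice(picks, size=budgets[t], replace=False)
        plan[t, picks] = 1
    return plan

rng = np.random.default_rng(0)
frac = rng.random((3, 8)) * 0.5   # stand-in fractional solution from a relaxation
print(round_seeding(frac, budgets=[2, 2, 2], rng=rng))
```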
Treatable infectious diseases are a critical challenge for public health. Outreach campaigns can encourage undiagnosed patients to seek treatment, but they must be carefully targeted to make the most efficient use of limited resources. We present an algorithm to optimally allocate limited outreach resources among demographic groups in the population. The algorithm uses a novel multiagent model of disease spread that both captures the underlying population dynamics and is amenable to optimization. Our algorithm extends, with provable guarantees, to a stochastic setting where we have only a distribution over parameters such as the contact pattern between agents. We evaluate our algorithm on two instances where this distribution is inferred from real-world data: tuberculosis in India and gonorrhea in the United States. Our algorithm produces a policy that is predicted to avert an average of at least 8,000 person-years of tuberculosis and 20,000 person-years of gonorrhea annually compared to current policy.
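To suggest what a population model that is "amenable to optimization" can look like, here is a toy group-level SIS-style simulation in which outreach raises a group's treatment rate; the dynamics, parameters, and two-group setup are all invented, not the paper's model.

```python
import numpy as np

def simulate(contact, recovery, outreach, steps=200):
    """Toy group-level SIS dynamics: infection pressure on each demographic
    group is contact-weighted prevalence; outreach raises that group's
    treatment rate."""
    I = np.full(len(recovery), 0.01)            # initial infected fraction
    for _ in range(steps):
        force = contact @ I                     # contact-weighted exposure
        new_inf = (1.0 - I) * (1.0 - np.exp(-force))
        cured = I * (recovery + outreach)       # outreach speeds up treatment
        I = np.clip(I + new_inf - cured, 0.0, 1.0)
    return I

contact = np.array([[0.6, 0.1], [0.1, 0.4]])    # within/between-group contact
recovery = np.array([0.3, 0.3])                 # baseline treatment rates
for plan in ([0.0, 0.0], [0.2, 0.0], [0.0, 0.2]):
    print(plan, "-> endemic prevalence:",
          simulate(contact, recovery, np.array(plan)).round(3))
```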
The learning problem for Factorial Hidden Markov Models with discrete and multivariate latent variables remains a challenge. Inference of the latent variables, required for the E-step of Expectation Maximization algorithms, is usually computationally intractable. In this paper we propose a variational learning algorithm mimicking the Baum-Welch algorithm. By approximating the filtering distribution with a variational distribution parameterized by a recurrent neural network, the computational complexity of the learning problem, as a function of the number of hidden states, can be reduced from the quadratic time required by traditional algorithms such as Baum-Welch to quasilinear, while making minimal independence assumptions. We evaluate the performance of the resulting algorithm, which we call Variational BOLT, in the context of unsupervised end-to-end energy disaggregation. We conduct experiments on the publicly available REDD dataset and show results competitive with a supervised inference approach, as well as state-of-the-art results in an unsupervised setting.
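The flavor of the variational objective can be sketched compactly for a single-chain HMM: the ELBO below combines expected log-joint terms with the entropy of the variational marginals. In Variational BOLT those marginals would come from an RNN over the observations; here they are free parameters, and the factorial structure is omitted for brevity.

```python
import numpy as np

def elbo(q, log_pi, log_A, log_emit):
    """Mean-field ELBO for an HMM: E_q[log p(x, z)] + H(q), where q[t] is the
    variational posterior over the hidden state at time t."""
    T = q.shape[0]
    val = q[0] @ log_pi + np.sum(q * log_emit)   # prior + expected emissions
    for t in range(1, T):
        val += q[t - 1] @ log_A @ q[t]           # expected transition terms
    val -= np.sum(q * np.log(q + 1e-12))         # entropy of q
    return val

rng = np.random.default_rng(0)
K, T = 3, 5
A = rng.dirichlet(np.ones(K), size=K)       # transition matrix, rows sum to 1
pi = rng.dirichlet(np.ones(K))              # initial distribution
log_emit = rng.normal(size=(T, K))          # log p(x_t | z_t = k), any emission model
q = rng.dirichlet(np.ones(K), size=T)       # variational marginals (free here)
print("ELBO:", elbo(q, np.log(pi), np.log(A), log_emit))
```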
Homes account for roughly one-third of total energy usage worldwide. Providing an energy breakdown (energy consumption per appliance) can help save up to 15% of energy. Given the vast differences in energy consumption patterns across regions, existing energy breakdown solutions require instrumentation and model training for each geographical region, which is prohibitively expensive and limits scalability. In this paper, we propose a novel region-independent energy breakdown model via statistical transfer learning. Our key intuition is that heterogeneity in homes and weather is what most significantly drives differences in energy consumption across regions; if we can factor out such heterogeneity, we can learn region-independent models, i.e., the homogeneous energy breakdown components for each individual appliance. A model learnt in one region can thus be transferred to another region. We evaluate our approach on two U.S. cities with distinct weather, drawn from a publicly available dataset. We find that our approach gives better energy breakdown estimates while requiring the fewest instrumented homes in the target region, compared to the state-of-the-art.
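A compressed numpy sketch of the transfer idea, under the simplifying assumption that energy data form a homes-by-appliances matrix: learn appliance components in the source region, then fit only the home factors in the target region. The synthetic data and rank choice are invented; the paper's model is richer.

```python
import numpy as np

rng = np.random.default_rng(0)
n_src, n_tgt, n_app, k = 100, 10, 6, 3  # source homes, target homes, appliances, rank

# Synthetic homes-x-appliances energy matrices sharing appliance structure.
A_true = np.abs(rng.normal(size=(k, n_app)))
E_src = np.abs(rng.normal(size=(n_src, k))) @ A_true
E_tgt = np.abs(rng.normal(size=(n_tgt, k))) @ A_true

# Step 1: learn region-independent appliance components in the source region.
_, s, Vt = np.linalg.svd(E_src, full_matrices=False)
A_shared = np.diag(s[:k]) @ Vt[:k]

# Step 2: in the target region, fit only the home factors against A_shared.
H_tgt = E_tgt @ np.linalg.pinv(A_shared)   # least-squares home factors
rel_err = np.linalg.norm(E_tgt - H_tgt @ A_shared) / np.linalg.norm(E_tgt)
print("target reconstruction error:", rel_err)  # ~0: appliance structure transfers
```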
Predicting epidemic dynamics is of great value for understanding and controlling diffusion processes such as infectious disease spread and information propagation. The task is intractable, especially when surveillance resources are very limited. To address this challenge, we study the problem of active surveillance, i.e., how to identify a small portion of system components as sentinels for monitoring, such that the epidemic dynamics of the entire system can be readily predicted from the partial data collected by these sentinels. We propose a novel measure, the gamma value, to identify the sentinels by modeling a sentinel network with a row-sparsity structure. We design a flexible group sparse Bayesian learning algorithm to mine the sentinel network, suitable for both linear and non-linear dynamical systems, using the expectation maximization method and variational approximation. The efficacy of the proposed algorithm is theoretically analyzed and empirically validated on both synthetic and real-world data.
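Group sparse Bayesian learning is not reproduced here, but its row-sparsity effect can be imitated with a convex surrogate: scikit-learn's MultiTaskLasso zeroes whole columns of the coefficient matrix, dropping candidate sentinels for all prediction targets at once. The toy data and the loose "gamma value" analogy (a per-node group norm) are our assumptions.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(0)
n_steps, n_nodes = 200, 12

# Toy dynamics: every node's state is driven by a few ground-truth sentinels.
X = rng.random((n_steps, n_nodes))
drivers = [2, 7]
W = np.zeros((n_nodes, n_nodes))
W[:, drivers] = rng.random((n_nodes, len(drivers)))
Y = X @ W.T + 0.01 * rng.normal(size=(n_steps, n_nodes))

# The L2,1 penalty zeroes whole columns of coef_, i.e., discards non-sentinel
# nodes jointly for all targets -- a convex stand-in for the paper's method.
model = MultiTaskLasso(alpha=0.05).fit(X, Y)
scores = np.linalg.norm(model.coef_, axis=0)   # per-node group norm
print("selected sentinels:", np.flatnonzero(scores > 1e-6))
```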
Plant-pollinator interaction networks are bipartite networks representing the mutualistic interactions between a set of plant species and a set of pollinator species. Data on these networks are collected by field biologists, who count visits from pollinators to flowers. Ecologists study the structure and function of these networks for scientific, conservation, and agricultural purposes. However, little research has been done to understand the underlying mechanisms that determine pairwise interactions or to predict new links from networks describing the species community. This paper explores the use of latent factor models to predict interactions that will occur in new contexts (e.g. a different distribution of the set of plant species) based on an observed network. The analysis draws on algorithms and evaluation strategies developed for recommendation systems and introduces them to this new domain. The matrix factorization methods compare favorably against several baselines on a pollination dataset collected in montane meadows over several years. Incorporating both positive and negative implicit feedback into the matrix factorization methods is particularly promising.
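A minimal sketch of the implicit-feedback idea just described: weighted matrix factorization in which observed visits are confident positives and unobserved plant-pollinator pairs are treated as weak negatives. The weights, dimensions, and SGD schedule are illustrative choices, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_plants, n_pols, k = 20, 15, 4
visits = (rng.random((n_plants, n_pols)) < 0.15).astype(float)  # toy 0/1 network

P = 0.1 * rng.normal(size=(n_plants, k))   # plant latent factors
Q = 0.1 * rng.normal(size=(n_pols, k))     # pollinator latent factors
lr, reg = 0.05, 0.01
# Observed visits are confident positives (weight 1.0); unobserved pairs are
# weak negatives (weight 0.1), since a missing visit may just mean the pair
# was never surveyed together.
weight = np.where(visits > 0, 1.0, 0.1)

for epoch in range(100):
    for i in range(n_plants):
        for j in range(n_pols):
            err = visits[i, j] - P[i] @ Q[j]
            g = weight[i, j] * err
            P[i] += lr * (g * Q[j] - reg * P[i])
            Q[j] += lr * (g * P[i] - reg * Q[j])

print("predicted interaction scores:\n", (P @ Q.T).round(2)[:3, :5])
```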
Real-world problems are often not fully characterized by a single optimal solution, as they frequently involve multiple competing objectives; it is therefore important to identify the so-called Pareto frontier, which captures solution trade-offs. We propose a fully polynomial-time approximation scheme based on Dynamic Programming (DP) for computing a polynomially succinct curve that approximates the Pareto frontier to within an arbitrarily small epsilon > 0 on tree-structured networks. Given a set of objectives, our approximation scheme runs in time polynomial in the size of the instance and 1/epsilon. We also propose a Mixed Integer Programming (MIP) scheme to approximate the Pareto frontier. The DP and MIP Pareto frontier approaches have complementary strengths and are surprisingly effective. We provide empirical results showing that our methods outperform other approaches in efficiency and accuracy. Our work is motivated by a problem in computational sustainability concerning the proliferation of hydropower dams throughout the Amazon basin. Our goal is to support decision-makers in evaluating impacted ecosystem services on the full scale of the Amazon basin. Our work is general and can be applied to approximate the Pareto frontier of a variety of multiobjective problems on tree-structured networks.
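The key trick behind such a fully polynomial-time approximation scheme can be sketched in isolation: when merging partial Pareto sets up the tree, round each objective onto a (1+epsilon)-geometric grid and keep one representative per cell, so the stored frontier stays polynomially small. The toy leaves and two objectives below are invented; the actual DP over the river network is more involved.

```python
import numpy as np
from itertools import product

def eps_prune(points, eps):
    """Keep one representative per cell of a (1+eps)-geometric grid -- the
    rounding that keeps the DP table polynomially succinct."""
    cells = {}
    for p in points:
        key = tuple(int(np.floor(np.log(v + 1.0) / np.log(1.0 + eps))) for v in p)
        if key not in cells or sum(p) > sum(cells[key]):
            cells[key] = p
    return list(cells.values())

def merge(front_a, front_b, eps):
    """Combine two child frontiers (all pairwise sums), then prune."""
    return eps_prune([tuple(x + y for x, y in zip(a, b))
                      for a, b in product(front_a, front_b)], eps)

# Toy: three leaves, each offering two (objective1, objective2) choices,
# e.g. (energy produced, connectivity preserved) for build / don't-build.
leaves = [[(4.0, 0.0), (0.0, 3.0)], [(2.0, 0.0), (0.0, 5.0)], [(3.0, 0.0), (0.0, 1.0)]]
frontier = leaves[0]
for leaf in leaves[1:]:
    frontier = merge(frontier, leaf, eps=0.1)
print(sorted(frontier))
```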