AAAI.2018 - Human-Computation and Crowd Sourcing

| Total: 10

#1 Deep Learning from Crowds [PDF] [Copy] [Kimi] [REL]

Authors: Filipe Rodrigues, Francisco Pereira

Over the last few years, deep learning has revolutionized the field of machine learning by dramatically improving the state-of-the-art in various domains. However, as the size of supervised artificial neural networks grows, typically so does the need for larger labeled datasets. Recently, crowdsourcing has established itself as an efficient and cost-effective solution for labeling large sets of data in a scalable manner, but it often requires aggregating labels from multiple noisy contributors with different levels of expertise. In this paper, we address the problem of learning deep neural networks from crowds. We begin by describing an EM algorithm for jointly learning the parameters of the network and the reliabilities of the annotators. Then, a novel general-purpose crowd layer is proposed, which allows us to train deep neural networks end-to-end, directly from the noisy labels of multiple annotators, using only backpropagation. We empirically show that the proposed approach is able to internally capture the reliability and biases of different annotators and achieve new state-of-the-art results for various crowdsourced datasets across different settings, namely classification, regression and sequence labeling.


#2 Adversarial Learning for Chinese NER From Crowd Annotations [PDF] [Copy] [Kimi] [REL]

Authors: YaoSheng Yang, Meishan Zhang, Wenliang Chen, Wei Zhang, Haofen Wang, Min Zhang

To quickly obtain new labeled data, we can choose crowdsourcing as an alternative way at lower cost in a short time. But as an exchange, crowd annotations from non-experts may be of lower quality than those from experts. In this paper, we propose an approach to performing crowd annotation learning for Chinese Named Entity Recognition (NER) to make full use of the noisy sequence labels from multiple annotators. Inspired by adversarial learning, our approach uses a common Bi-LSTM and a private Bi-LSTM for representing annotator-generic and -specific information. The annotator-generic information is the common knowledge for entities easily mastered by the crowd. Finally, we build our Chinese NE tagger based on the LSTM-CRF model. In our experiments, we create two data sets for Chinese NER tasks from two domains. The experimental results show that our system achieves better scores than strong baseline systems.


#3 Understanding Over Participation in Simple Contests [PDF] [Copy] [Kimi] [REL]

Authors: Priel Levy, David Sarne

One key motivation for using contests in real-life is the substantial evidence reported in empirical contest-design literature for people's tendency to act more competitively in contests than predicted by the Nash Equilibrium. This phenomenon has been traditionally explained by people's eagerness to win and maximize their relative (rather than absolute) payoffs. In this paper we make use of "simple contests," where contestants only need to strategize on whether to participate in the contest or not, as an infrastructure for studying whether indeed more effort is exerted in contests due to competitiveness, or perhaps this can be attributed to other factors that hold also in non-competitive settings. The experimental methodology we use compares contestants' participation decisions in eight contest settings differing in the nature of the contest used, the number of contestants used and the theoretical participation predictions to those obtained (whenever applicable) by subjects facing equivalent non-competitive decision situations in the form of a lottery. We show that indeed people tend to over-participate in contests compared to the theoretical predictions, yet the same phenomenon holds (to a similar extent) also in the equivalent non-competitive settings. Meaning that many of the contests used nowadays as a means for inducing extra human effort, that are often complex to organize and manage, can be replaced by a simpler non-competitive mechanism that uses probabilistic prizes.


#4 AdaFlock: Adaptive Feature Discovery for Human-in-the-loop Predictive Modeling [PDF] [Copy] [Kimi] [REL]

Authors: Ryusuke Takahama, Yukino Baba, Nobuyuki Shimizu, Sumio Fujita, Hisashi Kashima

Feature engineering is the key to successful application of machine learning algorithms to real-world data. The discovery of informative features often requires domain knowledge or human inspiration, and data scientists expend a certain amount of effort into exploring feature spaces. Crowdsourcing is considered a promising approach for allowing many people to be involved in feature engineering; however, there is a demand for a sophisticated strategy that enables us to acquire good features at a reasonable crowdsourcing cost. In this paper, we present a novel algorithm called AdaFlock to efficiently obtain informative features through crowdsourcing. AdaFlock is inspired by AdaBoost, which iteratively trains classifiers by increasing the weights of samples misclassified by previous classifiers. AdaFlock iteratively generates informative features; at each iteration of AdaFlock, crowdsourcing workers are shown samples selected according to the classification errors of the current classifiers and are asked to generate new features that are helpful for correctly classifying the given examples. The results of our experiments conducted using real datasets indicate that AdaFlock successfully discovers informative features with fewer iterations and achieves high classification accuracy.


#5 Information Gathering With Peers: Submodular Optimization With Peer-Prediction Constraints [PDF] [Copy] [Kimi] [REL]

Authors: Goran Radanovic, Adish Singla, Andreas Krause, Boi Faltings

We study a problem of optimal information gathering from multiple data providers that need to be incentivized to provide accurate information. This problem arises in many real world applications that rely on crowdsourced data sets, but where the process of obtaining data is costly. A notable example of such a scenario is crowd sensing. To this end, we formulate the problem of optimal information gathering as maximization of a submodular function under a budget constraint, where the budget represents the total expected payment to data providers. Contrary to the existing approaches, we base our payments on incentives for accuracy and truthfulness, in particular, peer prediction methods that score each of the selected data providers against its best peer, while ensuring that the minimum expected payment is above a given threshold. We first show that the problem at hand is hard to approximate within a constant factor that is not dependent on the properties of the payment function. However, for given topological and analytical properties of the instance, we construct two greedy algorithms, respectively called PPCGreedy and PPCGreedyIter, and establish theoretical bounds on their performance w.r.t. the optimal solution. Finally, we evaluate our methods using a realistic crowd sensing testbed.


#6 Partial Truthfulness in Minimal Peer Prediction Mechanisms With Limited Knowledge [PDF] [Copy] [Kimi] [REL]

Authors: Goran Radanovic, Boi Faltings

We study minimal single-task peer prediction mechanisms that have limited knowledge about agents' beliefs. Without knowing what agents' beliefs are or eliciting additional information, it is not possible to design a truthful mechanism in a Bayesian-Nash sense. We go beyond truthfulness and explore equilibrium strategy profiles that are only partially truthful. Using the results from the multi-armed bandit literature, we give a characterization of how inefficient these equilibria are comparing to truthful reporting. We measure the inefficiency of such strategies by counting the number of dishonest reports that any minimal knowledge-bounded mechanism must have. We show that the order of this number is θ(log n), where n is the number of agents, and we provide a peer prediction mechanism that achieves this bound in expectation.


#7 A Voting-Based System for Ethical Decision Making [PDF] [Copy] [Kimi] [REL]

Authors: Ritesh Noothigattu, Snehalkumar Gaikwad, Edmond Awad, Sohan Dsouza, Iyad Rahwan, Pradeep Ravikumar, Ariel Procaccia

We present a general approach to automating ethical decisions, drawing on machine learning and computational social choice. In a nutshell, we propose to learn a model of societal preferences, and, when faced with a specific ethical dilemma at runtime, efficiently aggregate those preferences to identify a desirable choice. We provide a concrete algorithm that instantiates our approach; some of its crucial steps are informed by a new theory of swap-dominance efficient voting rules. Finally, we implement and evaluate a system for ethical decision making in the autonomous vehicle domain, using preference data collected from 1.3 million people through the Moral Machine website.


#8 Semi-Supervised Learning From Crowds Using Deep Generative Models [PDF] [Copy] [Kimi] [REL]

Authors: Kyohei Atarashi, Satoshi Oyama, Masahito Kurihara

Although supervised learning requires a labeled dataset, obtaining labels from experts is generally expensive. For this reason, crowdsourcing services are attracting attention in the field of machine learning as a way to collect labels at relatively low cost. However, the labels obtained by crowdsourcing, i.e., from non-expert workers, are often noisy. A number of methods have thus been devised for inferring true labels, and several methods have been proposed for learning classifiers directly from crowdsourced labels, referred to as "learning from crowds." A more practical problem is learning from crowdsourced labeled data and unlabeled data, i.e., "semi-supervised learning from crowds." This paper presents a novel generative model of the labeling process in crowdsourcing. It leverages unlabeled data effectively by introducing latent features and a data distribution. Because the data distribution can be complicated, we use a deep neural network for the data distribution. Therefore, our model can be regarded as a kind of deep generative model. The problems caused by the intractability of latent variable posteriors is solved by introducing an inference model. The experiments show that it outperforms four existing models, including a baseline model, on the MNIST dataset with simulated workers and the Rotten Tomatoes movie review dataset with Amazon Mechanical Turk workers.


#9 Understanding Social Interpersonal Interaction via Synchronization Templates of Facial Events [PDF] [Copy] [Kimi] [REL]

Authors: Rui Li, Jared Curhan, Mohammed Hoque

Automatic facial expression analysis in inter-personal communication is challenging. Not only because conversation partners' facial expressions mutually influence each other, but also because no correct interpretation of facial expressions is possible without taking social context into account. In this paper, we propose a probabilistic framework to model interactional synchronization between conversation partners based on their facial expressions. Interactional synchronization manifests temporal dynamics of conversation partners' mutual influence. In particular, the model allows us to discover a set of common and unique facial synchronization templates directly from natural interpersonal interaction without recourse to any predefined labeling schemes. The facial synchronization templates represent periodical facial event coordinations shared by multiple conversation pairs in a specific social context. We test our model on two different dyadic conversations of negotiation and job-interview. Based on the discovered facial event coordination, we are able to predict their conversation outcomes with higher accuracy than HMMs and GMMs.


#10 Sentiment Analysis via Deep Hybrid Textual-Crowd Learning Model [PDF] [Copy] [Kimi] [REL]

Authors: Kamran Ghasedi Dizaji, Heng Huang

Crowdsourcing technique provides an efficient platform to employ human skills in sentiment analysis, which is a difficult task for automatic language models due to the large variations in context, writing style, view point and so on. However, the standard crowdsourcing aggregation models are incompetent when the number of crowd labels per worker is not sufficient to train parameters, or when it is not feasible to collect labels for each sample in a large dataset. In this paper, we propose a novel hybrid model to exploit both crowd and text data for sentiment analysis, consisting of a generative crowdsourcing aggregation model and a deep sentimental autoencoder. Combination of these two sub-models is obtained based on a probabilistic framework rather than a heuristic way. We introduce a unified objective function to incorporate the objectives of both sub-models, and derive an efficient optimization algorithm to jointly solve the corresponding problem. Experimental results indicate that our model achieves superior results in comparison with the state-of-the-art models, especially when the crowd labels are scarce.