AAAI.2020

Total: 1865

#1 Balancing Spreads of Influence in a Social Network [PDF] [Copy] [Kimi]

Authors: Ruben Becker ; Federico Corò ; Gianlorenzo D'Angelo ; Hugo Gilbert

The personalization of our news consumption on social media has a tendency to reinforce our pre-existing beliefs instead of balancing our opinions. To tackle this issue, Garimella et al. (NIPS'17) modeled the spread of these viewpoints, also called campaigns, using the independent cascade model introduced by Kempe, Kleinberg and Tardos (KDD'03) and studied an optimization problem that aims to balance information exposure when two opposing campaigns propagate in a network. This paper investigates a natural generalization of this optimization problem in which μ different campaigns propagate in the network and we aim to maximize the expected number of nodes that are reached by at least ν or none of the campaigns, where μ ≥ ν ≥ 2. Following Garimella et al., despite this general setting, we also investigate a simplified one, in which campaigns propagate in a correlated manner. While for the simplified setting, we show that the problem can be approximated within a constant factor for any constant μ and ν, for the general setting, we give reductions leading to several approximation hardness results when ν ≥ 3. For instance, assuming the gap exponential time hypothesis to hold, we obtain that the problem cannot be approximated within a factor of n−g(n) for any g(n) = o(1) where n is the number of nodes in the network. We complement our hardness results with an Ω(n−1/2)-approximation algorithm for the general setting when ν = 3 and μ is arbitrary.

#2 MultiSumm: Towards a Unified Model for Multi-Lingual Abstractive Summarization [PDF] [Copy] [Kimi]

Authors: Yue Cao ; Xiaojun Wan ; Jinge Yao ; Dian Yu

Automatic text summarization aims at producing a shorter version of the input text that conveys the most important information. However, multi-lingual text summarization, where the goal is to process texts in multiple languages and output summaries in the corresponding languages with a single model, has been rarely studied. In this paper, we present MultiSumm, a novel multi-lingual model for abstractive summarization. The MultiSumm model uses the following training regime: (I) multi-lingual learning that contains language model training, auto-encoder training, translation and back-translation training, and (II) joint summary generation training. We conduct experiments on summarization datasets for five rich-resource languages: English, Chinese, French, Spanish, and German, as well as two low-resource languages: Bosnian and Croatian. Experimental results show that our proposed model significantly outperforms a multi-lingual baseline model. Specifically, our model achieves comparable or even better performance than models trained separately on each language. As an additional contribution, we construct the first summarization dataset for Bosnian and Croatian, containing 177,406 and 204,748 samples, respectively.

#3 Efficient Heterogeneous Collaborative Filtering without Negative Sampling for Recommendation [PDF] [Copy] [Kimi]

Authors: Chong Chen ; Min Zhang ; Yongfeng Zhang ; Weizhi Ma ; Yiqun Liu ; Shaoping Ma

Recent studies on recommendation have largely focused on exploring state-of-the-art neural networks to improve the expressiveness of models, while typically apply the Negative Sampling (NS) strategy for efficient learning. Despite effectiveness, two important issues have not been well-considered in existing methods: 1) NS suffers from dramatic fluctuation, making sampling-based methods difficult to achieve the optimal ranking performance in practical applications; 2) although heterogeneous feedback (e.g., view, click, and purchase) is widespread in many online systems, most existing methods leverage only one primary type of user feedback such as purchase. In this work, we propose a novel non-sampling transfer learning solution, named Efficient Heterogeneous Collaborative Filtering (EHCF) for Top-N recommendation. It can not only model fine-grained user-item relations, but also efficiently learn model parameters from the whole heterogeneous data (including all unlabeled data) with a rather low time complexity. Extensive experiments on three real-world datasets show that EHCF significantly outperforms state-of-the-art recommendation methods in both traditional (single-behavior) and heterogeneous scenarios. Moreover, EHCF shows significant improvements in training efficiency, making it more applicable to real-world large-scale systems. Our implementation has been released 1 to facilitate further developments on efficient whole-data based neural methods.

#4 Revisiting Graph Based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach [PDF] [Copy] [Kimi]

Authors: Lei Chen ; Le Wu ; Richang Hong ; Kun Zhang ; Meng Wang

Graph Convolutional Networks~(GCNs) are state-of-the-art graph based representation learning models by iteratively stacking multiple layers of convolution aggregation operations and non-linear activation operations. Recently, in Collaborative Filtering~(CF) based Recommender Systems~(RS), by treating the user-item interaction behavior as a bipartite graph, some researchers model higher-layer collaborative signals with GCNs. These GCN based recommender models show superior performance compared to traditional works. However, these models suffer from training difficulty with non-linear activations for large user-item graphs. Besides, most GCN based models could not model deeper layers due to the over smoothing effect with the graph convolution operation. In this paper, we revisit GCN based CF models from two aspects. First, we empirically show that removing non-linearities would enhance recommendation performance, which is consistent with the theories in simple graph convolutional networks. Second, we propose a residual network structure that is specifically designed for CF with user-item interaction modeling, which alleviates the over smoothing problem in graph convolution aggregation operation with sparse user-item interaction data. The proposed model is a linear model and it is easy to train, scale to large datasets, and yield better efficiency and effectiveness on two real datasets. We publish the source code at https://github.com/newlei/LR-GCCF.

#5 Question-Driven Purchasing Propensity Analysis for Recommendation [PDF] [Copy] [Kimi]

Authors: Long Chen ; Ziyu Guan ; Qibin Xu ; Qiong Zhang ; Huan Sun ; Guangyue Lu ; Deng Cai

Merchants of e-commerce Websites expect recommender systems to entice more consumption which is highly correlated with the customers' purchasing propensity. However, most existing recommender systems focus on customers' general preference rather than purchasing propensity often governed by instant demands which we deem to be well conveyed by the questions asked by customers. A typical recommendation scenario is: Bob wants to buy a cell phone which can play the game PUBG. He is interested in HUAWEI P20 and asks “can PUBG run smoothly on this phone?” under it. Then our system will be triggered to recommend the most eligible cell phones to him. Intuitively, diverse user questions could probably be addressed in reviews written by other users who have similar concerns. To address this recommendation problem, we propose a novel Question-Driven Attentive Neural Network (QDANN) to assess the instant demands of questioners and the eligibility of products based on user generated reviews, and do recommendation accordingly. Without supervision, QDANN can well exploit reviews to achieve this goal. The attention mechanisms can be used to provide explanations for recommendations. We evaluate QDANN in three domains of Taobao. The results show the efficacy of our method and its superiority over baseline methods.

#6 Gradient Method for Continuous Influence Maximization with Budget-Saving Considerations [PDF] [Copy] [Kimi]

Authors: Wei Chen ; Weizhong Zhang ; Haoyu Zhao

Continuous influence maximization (CIM) generalizes the original influence maximization by incorporating general marketing strategies: a marketing strategy mix is a vector x = (x1, …, xd) such that for each node v in a social network, v could be activated as a seed of diffusion with probability hv(x), where hv is a strategy activation function satisfying DR-submodularity. CIM is the task of selecting a strategy mix x with constraint ∑ixi ≤ k where k is a budget constraint, such that the total number of activated nodes after the diffusion process, called influence spread and denoted as g(x), is maximized. In this paper, we extend CIM to consider budget saving, that is, each strategy mix x has a cost c(x) where c is a convex cost function, and we want to maximize the balanced sum g(x) + λ(k − c(x)) where λ is a balance parameter, subject to the constraint of c(x) ≤ k. We denote this problem as CIM-BS. The objective function of CIM-BS is neither monotone, nor DR-submodular or concave, and thus neither the greedy algorithm nor the standard result on gradient method could be directly applied. Our key innovation is the combination of the gradient method with reverse influence sampling to design algorithms that solve CIM-BS: For the general case, we give an algorithm that achieves (½ − ε)-approximation, and for the case of independent strategy activations, we present an algorithm that achieves (1 − 1/e − ε) approximation.

#7 Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search [PDF] [Copy] [Kimi]

Authors: Xinyan Dai ; Xiao Yan ; Kelvin K. W. Ng ; Jiu Liu ; James Cheng

Vector quantization (VQ) techniques are widely used in similarity search for data compression, computation acceleration and etc. Originally designed for Euclidean distance, existing VQ techniques (e.g., PQ, AQ) explicitly or implicitly minimize the quantization error. In this paper, we present a new angle to analyze the quantization error, which decomposes the quantization error into norm error and direction error. We show that quantization errors in norm have much higher influence on inner products than quantization errors in direction, and small quantization error does not necessarily lead to good performance in maximum inner product search (MIPS). Based on this observation, we propose norm-explicit quantization (NEQ) — a general paradigm that improves existing VQ techniques for MIPS. NEQ quantizes the norms of items in a dataset explicitly to reduce errors in norm, which is crucial for MIPS. For the direction vectors, NEQ can simply reuse an existing VQ technique to quantize them without modification. We conducted extensive experiments on a variety of datasets and parameter configurations. The experimental results show that NEQ improves the performance of various VQ techniques for MIPS, including PQ, OPQ, RQ and AQ.

#8 Modeling Fluency and Faithfulness for Diverse Neural Machine Translation [PDF] [Copy] [Kimi]

Authors: Yang Feng ; Wanying Xie ; Shuhao Gu ; Chenze Shao ; Wen Zhang ; Zhengxin Yang ; Dong Yu

Neural machine translation models usually adopt the teacher forcing strategy for training which requires the predicted sequence matches ground truth word by word and forces the probability of each prediction to approach a 0-1 distribution. However, the strategy casts all the portion of the distribution to the ground truth word and ignores other words in the target vocabulary even when the ground truth word cannot dominate the distribution. To address the problem of teacher forcing, we propose a method to introduce an evaluation module to guide the distribution of the prediction. The evaluation module accesses each prediction from the perspectives of fluency and faithfulness to encourage the model to generate the word which has a fluent connection with its past and future translation and meanwhile tends to form a translation equivalent in meaning to the source. The experiments on multiple translation tasks show that our method can achieve significant improvements over strong baselines.

#9 Leveraging Title-Abstract Attentive Semantics for Paper Recommendation [PDF] [Copy] [Kimi]

Authors: Guibing Guo ; Bowei Chen ; Xiaoyan Zhang ; Zhirong Liu ; Zhenhua Dong ; Xiuqiang He

Paper recommendation is a research topic to provide users with personalized papers of interest. However, most existing approaches equally treat title and abstract as the input to learn the representation of a paper, ignoring their semantic relationship. In this paper, we regard the abstract as a sequence of sentences, and propose a two-level attentive neural network to capture: (1) the ability of each word within a sentence to reflect if it is semantically close to the words within the title. (2) the extent of each sentence in the abstract relative to the title, which is often a good summarization of the abstract document. Specifically, we propose a Long-Short Term Memory (LSTM) network with attention to learn the representation of sentences, and integrate a Gated Recurrent Unit (GRU) network with a memory network to learn the long-term sequential sentence patterns of interacted papers for both user and item (paper) modeling. We conduct extensive experiments on two real datasets, and show that our approach outperforms other state-of-the-art approaches in terms of accuracy.

#10 Preserving Ordinal Consensus: Towards Feature Selection for Unlabeled Data [PDF] [Copy] [Kimi]

Authors: Jun Guo ; Heng Chang ; Wenwu Zhu

To better pre-process unlabeled data, most existing feature selection methods remove redundant and noisy information by exploring some intrinsic structures embedded in samples. However, these unsupervised studies focus too much on the relations among samples, totally neglecting the feature-level geometric information. This paper proposes an unsupervised triplet-induced graph to explore a new type of potential structure at feature level, and incorporates it into simultaneous feature selection and clustering. In the feature selection part, we design an ordinal consensus preserving term based on a triplet-induced graph. This term enforces the projection vectors to preserve the relative proximity of original features, which contributes to selecting more relevant features. In the clustering part, Self-Paced Learning (SPL) is introduced to gradually learn from ‘easy’ to ‘complex’ samples. SPL alleviates the dilemma of falling into the bad local minima incurred by noise and outliers. Specifically, we propose a compelling regularizer for SPL to obtain a robust loss. Finally, an alternating minimization algorithm is developed to efficiently optimize the proposed model. Extensive experiments on different benchmark datasets consistently demonstrate the superiority of our proposed method.

#11 An Attentional Recurrent Neural Network for Personalized Next Location Recommendation [PDF] [Copy] [Kimi]

Authors: Qing Guo ; Zhu Sun ; Jie Zhang ; Yin-Leng Theng

Most existing studies on next location recommendation propose to model the sequential regularity of check-in sequences, but suffer from the severe data sparsity issue where most locations have fewer than five following locations. To this end, we propose an Attentional Recurrent Neural Network (ARNN) to jointly model both the sequential regularity and transition regularities of similar locations (neighbors). In particular, we first design a meta-path based random walk over a novel knowledge graph to discover location neighbors based on heterogeneous factors. A recurrent neural network is then adopted to model the sequential regularity by capturing various contexts that govern user mobility. Meanwhile, the transition regularities of the discovered neighbors are integrated via the attention mechanism, which seamlessly cooperates with the sequential regularity as a unified recurrent framework. Experimental results on multiple real-world datasets demonstrate that ARNN outperforms state-of-the-art methods.

#12 Re-Attention for Visual Question Answering [PDF] [Copy] [Kimi]

Authors: Wenya Guo ; Ying Zhang ; Xiaoping Wu ; Jufeng Yang ; Xiangrui Cai ; Xiaojie Yuan

Visual Question Answering~(VQA) requires a simultaneous understanding of images and questions. Existing methods achieve well performance by focusing on both key objects in images and key words in questions. However, the answer also contains rich information which can help to better describe the image and generate more accurate attention maps. In this paper, to utilize the information in answer, we propose a re-attention framework for the VQA task. We first associate image and question by calculating the similarity of each object-word pairs in the feature space. Then, based on the answer, the learned model re-attends the corresponding visual objects in images and reconstructs the initial attention map to produce consistent results. Benefiting from the re-attention procedure, the question can be better understood, and the satisfactory answer is generated. Extensive experiments on the benchmark dataset demonstrate the proposed method performs favorably against the state-of-the-art approaches.

#13 Semi-Supervised Multi-Modal Learning with Balanced Spectral Decomposition [PDF] [Copy] [Kimi]

Authors: Peng Hu ; Hongyuan Zhu ; Xi Peng ; Jie Lin

Cross-modal retrieval aims to retrieve the relevant samples across different modalities, of which the key problem is how to model the correlations among different modalities while narrowing the large heterogeneous gap. In this paper, we propose a Semi-supervised Multimodal Learning Network method (SMLN) which correlates different modalities by capturing the intrinsic structure and discriminative correlation of the multimedia data. To be specific, the labeled and unlabeled data are used to construct a similarity matrix which integrates the cross-modal correlation, discrimination, and intra-modal graph information existing in the multimedia data. What is more important is that we propose a novel optimization approach to optimize our loss within a neural network which involves a spectral decomposition problem derived from a ratio trace criterion. Our optimization enjoys two advantages given below. On the one hand, the proposed approach is not limited to our loss, which could be applied to any case that is a neural network with the ratio trace criterion. On the other hand, the proposed optimization is different from existing ones which alternatively maximize the minor eigenvalues, thus overemphasizing the minor eigenvalues and ignore the dominant ones. In contrast, our method will exactly balance all eigenvalues, thus being more competitive to existing methods. Thanks to our loss and optimization strategy, our method could well preserve the discriminative and instinct information into the common space and embrace the scalability in handling large-scale multimedia data. To verify the effectiveness of the proposed method, extensive experiments are carried out on three widely-used multimodal datasets comparing with 13 state-of-the-art approaches.

#14 MuMod: A Micro-Unit Connection Approach for Hybrid-Order Community Detection [PDF] [Copy] [Kimi]

Authors: Ling Huang ; Hong-Yang Chao ; Quangqiang Xie

In the past few years, higher-order community detection has drawn an increasing amount of attention. Compared with the lower-order approaches that rely on the connectivity pattern of individual nodes and edges, the higher-order approaches discover communities by leveraging the higher-order connectivity pattern via constructing a motif-based hypergraph. Despite success in capturing the building blocks of complex networks, recent study has shown that the higher-order approaches unavoidably suffer from the hypergraph fragmentation issue. Although an edge enhancement strategy has been designed previously to address this issue, adding additional edges may corrupt the original lower-order connectivity pattern. To this end, this paper defines a new problem of community detection, namely hybrid-order community detection, which aims to discover communities by simultaneously leveraging the lower-order connectivity pattern and the higherorder connectivity pattern. For addressing this new problem, a new Micro-unit Modularity (MuMod) approach is designed. The basic idea lies in constructing a micro-unit connection network, where both of the lower-order connectivity pattern and the higher-order connectivity pattern are utilized. And then a new micro-unit modularity model is proposed for generating the micro-unit groups, from which the overlapping community structure of the original network can be derived. Extensive experiments are conducted on five real-world networks. Comparison results with twelve existing approaches confirm the effectiveness of the proposed method.

#15 Cross-Lingual Pre-Training Based Transfer for Zero-Shot Neural Machine Translation [PDF] [Copy] [Kimi]

Authors: Baijun Ji ; Zhirui Zhang ; Xiangyu Duan ; Min Zhang ; Boxing Chen ; Weihua Luo

Transfer learning between different language pairs has shown its effectiveness for Neural Machine Translation (NMT) in low-resource scenario. However, existing transfer methods involving a common target language are far from success in the extreme scenario of zero-shot translation, due to the language space mismatch problem between transferor (the parent model) and transferee (the child model) on the source side. To address this challenge, we propose an effective transfer learning approach based on cross-lingual pre-training. Our key idea is to make all source languages share the same feature space and thus enable a smooth transition for zero-shot translation. To this end, we introduce one monolingual pre-training method and two bilingual pre-training methods to obtain a universal encoder for different languages. Once the universal encoder is constructed, the parent model built on such encoder is trained with large-scale annotated data and then directly applied in zero-shot translation scenario. Experiments on two public datasets show that our approach significantly outperforms strong pivot-based baseline and various multilingual NMT approaches.

#16 Functionality Discovery and Prediction of Physical Objects [PDF] [Copy] [Kimi]

Authors: Lei Ji ; Botian Shi ; Xianglin Guo ; Xilin Chen

Functionality is a fundamental attribute of an object which indicates the capability to be used to perform specific actions. It is critical to empower robots the functionality knowledge in discovering appropriate objects for a task e.g. cut cake using knife. Existing research works have focused on understanding object functionality through human-object-interaction from extensively annotated image or video data and are hard to scale up. In this paper, we (1) mine object-functionality knowledge through pattern-based and model-based methods from text, (2) introduce a novel task on physical object-functionality prediction, which consumes an image and an action query to predict whether the object in the image can perform the action, and (3) propose a method to leverage the mined functionality knowledge for the new task. Our experimental results show the effectiveness of our methods.

#17 True Nonlinear Dynamics from Incomplete Networks [PDF] [Copy] [Kimi]

Authors: Chunheng Jiang ; Jianxi Gao ; Malik Magdon-Ismail

We study nonlinear dynamics on complex networks. Each vertex i has a state xi which evolves according to a networked dynamics to a steady-state xi*. We develop fundamental tools to learn the true steady-state of a small part of the network, without knowing the full network. A naive approach and the current state-of-the-art is to follow the dynamics of the observed partial network to local equilibrium. This dramatically fails to extract the true steady state. We use a mean-field approach to map the dynamics of the unseen part of the network to a single node, which allows us to recover accurate estimates of steady-state on as few as 5 observed vertices in domains ranging from ecology to social networks to gene regulation. Incomplete networks are the norm in practice, and we offer new ways to think about nonlinear dynamics when only sparse information is available.

#18 Understanding and Improving Proximity Graph Based Maximum Inner Product Search [PDF] [Copy] [Kimi]

Authors: Jie Liu ; Xiao Yan ; Xinyan Dai ; Zhirong Li ; James Cheng ; Ming-Chang Yang

The inner-product navigable small world graph (ip-NSW) represents the state-of-the-art method for approximate maximum inner product search (MIPS) and it can achieve an order of magnitude speedup over the fastest baseline. However, to date it is still unclear where its exceptional performance comes from. In this paper, we show that there is a strong norm bias in the MIPS problem, which means that the large norm items are very likely to become the result of MIPS. Then we explain the good performance of ip-NSW as matching the norm bias of the MIPS problem — large norm items have big in-degrees in the ip-NSW proximity graph and a walk on the graph spends the majority of computation on these items, thus effectively avoids unnecessary computation on small norm items. Furthermore, we propose the ip-NSW+ algorithm, which improves ip-NSW by introducing an additional angular proximity graph. Search is first conducted on the angular graph to find the angular neighbors of a query and then the MIPS neighbors of these angular neighbors are used to initialize the candidate pool for search on the inner-product proximity graph. Experiment results show that ip-NSW+ consistently and significantly outperforms ip-NSW and provides more robust performance under different data distributions.

#19 Type-Aware Anchor Link Prediction across Heterogeneous Networks Based on Graph Attention Network [PDF] [Copy] [Kimi]

Authors: Xiaoxue Li ; Yanmin Shang ; Yanan Cao ; Yangxi Li ; Jianlong Tan ; Yanbing Liu

Anchor Link Prediction (ALP) across heterogeneous networks plays a pivotal role in inter-network applications. The difficulty of anchor link prediction in heterogeneous networks lies in how to consider the factors affecting nodes alignment comprehensively. In recent years, predicting anchor links based on network embedding has become the main trend. For heterogeneous networks, previous anchor link prediction methods first integrate various types of nodes associated with a user node to obtain a fusion embedding vector from global perspective, and then predict anchor links based on the similarity between fusion vectors corresponding with different user nodes. However, the fusion vector ignores effects of the local type information on user nodes alignment. To address the challenge, we propose a novel type-aware anchor link prediction across heterogeneous networks (TALP), which models the effect of type information and fusion information on user nodes alignment from local and global perspective simultaneously. TALP can solve the network embedding and type-aware alignment under a unified optimization framework based on a two-layer graph attention architecture. Through extensive experiments on real heterogeneous network datasets, we demonstrate that TALP significantly outperforms the state-of-the-art methods.

#20 Deep Match to Rank Model for Personalized Click-Through Rate Prediction [PDF] [Copy] [Kimi]

Authors: Ze Lyu ; Yu Dong ; Chengfu Huo ; Weijun Ren

Click-through rate (CTR) prediction is a core task in the field of recommender system and many other applications. For CTR prediction model, personalization is the key to improve the performance and enhance the user experience. Recently, several models are proposed to extract user interest from user behavior data which reflects user's personalized preference implicitly. However, existing works in the field of CTR prediction mainly focus on user representation and pay less attention on representing the relevance between user and item, which directly measures the intensity of user's preference on target item. Motivated by this, we propose a novel model named Deep Match to Rank (DMR) which combines the thought of collaborative filtering in matching methods for the ranking task in CTR prediction. In DMR, we design User-to-Item Network and Item-to-Item Network to represent the relevance in two forms. In User-to-Item Network, we represent the relevance between user and item by inner product of the corresponding representation in the embedding space. Meanwhile, an auxiliary match network is presented to supervise the training and push larger inner product to represent higher relevance. In Item-to-Item Network, we first calculate the item-to-item similarities between user interacted items and target item by attention mechanism, and then sum up the similarities to obtain another form of user-to-item relevance. We conduct extensive experiments on both public and industrial datasets to validate the effectiveness of our model, which outperforms the state-of-art models significantly.

#21 Modality to Modality Translation: An Adversarial Representation Learning and Graph Fusion Network for Multimodal Fusion [PDF] [Copy] [Kimi]

Authors: Sijie Mai ; Haifeng Hu ; Songlong Xing

Learning joint embedding space for various modalities is of vital importance for multimodal fusion. Mainstream modality fusion approaches fail to achieve this goal, leaving a modality gap which heavily affects cross-modal fusion. In this paper, we propose a novel adversarial encoder-decoder-classifier framework to learn a modality-invariant embedding space. Since the distributions of various modalities vary in nature, to reduce the modality gap, we translate the distributions of source modalities into that of target modality via their respective encoders using adversarial training. Furthermore, we exert additional constraints on embedding space by introducing reconstruction loss and classification loss. Then we fuse the encoded representations using hierarchical graph neural network which explicitly explores unimodal, bimodal and trimodal interactions in multi-stage. Our method achieves state-of-the-art performance on multiple datasets. Visualization of the learned embeddings suggests that the joint embedding space learned by our method is discriminative.

#22 A Variational Point Process Model for Social Event Sequences [PDF] [Copy] [Kimi]

Authors: Zhen Pan ; Zhenya Huang ; Defu Lian ; Enhong Chen

Many events occur in real-world and social networks. Events are related to the past and there are patterns in the evolution of event sequences. Understanding the patterns can help us better predict the type and arriving time of the next event. In the literature, both feature-based approaches and generative approaches are utilized to model the event sequence. Feature-based approaches extract a variety of features, and train a regression or classification model to make a prediction. Yet, their performance is dependent on the experience-based feature exaction. Generative approaches usually assume the evolution of events follow a stochastic point process (e.g., Poisson process or its complexer variants). However, the true distribution of events is never known and the performance depends on the design of stochastic process in practice. To solve the above challenges, in this paper, we present a novel probabilistic generative model for event sequences. The model is termed Variational Event Point Process (VEPP). Our model introduces variational auto-encoder to event sequence modeling that can better use the latent information and capture the distribution over inter-arrival time and types of event sequences. Experiments on real-world datasets prove effectiveness of our proposed model.

#23 Incremental Fairness in Two-Sided Market Platforms: On Smoothly Updating Recommendations [PDF] [Copy] [Kimi]

Authors: Gourab K. Patro ; Abhijnan Chakraborty ; Niloy Ganguly ; Krishna Gummadi

Major online platforms today can be thought of as two-sided markets with producers and customers of goods and services. There have been concerns that over-emphasis on customer satisfaction by the platforms may affect the well-being of the producers. To counter such issues, few recent works have attempted to incorporate fairness for the producers. However, these studies have overlooked an important issue in such platforms -- to supposedly improve customer utility, the underlying algorithms are frequently updated, causing abrupt changes in the exposure of producers. In this work, we focus on the fairness issues arising out of such frequent updates, and argue for incremental updates of the platform algorithms so that the producers have enough time to adjust (both logistically and mentally) to the change. However, naive incremental updates may become unfair to the customers. Thus focusing on recommendations deployed on two-sided platforms, we formulate an ILP based online optimization to deploy changes incrementally in η steps, where we can ensure smooth transition of the exposure of items while guaranteeing a minimum utility for every customer. Evaluations over multiple real world datasets show that our proposed mechanism for platform updates can be efficient and fair to both the producers and the customers in two-sided platforms.

#24 Towards Comprehensive Recommender Systems: Time-Aware Unified Recommendations Based on Listwise Ranking of Implicit Cross-Network Data [PDF] [Copy] [Kimi]

Authors: Dilruk Perera ; Roger Zimmermann

The abundance of information in web applications make recommendation essential for users as well as applications. Despite the effectiveness of existing recommender systems, we find two major limitations that reduce their overall performance: (1) inability to provide timely recommendations for both new and existing users by considering the dynamic nature of user preferences, and (2) not fully optimized for the ranking task when using implicit feedback. Therefore, we propose a novel deep learning based unified cross-network solution to mitigate cold-start and data sparsity issues and provide timely recommendations for new and existing users. Furthermore, we consider the ranking problem under implicit feedback as a classification task, and propose a generic personalized listwise optimization criterion for implicit data to effectively rank a list of items. We illustrate our cross-network model using Twitter auxiliary information for recommendations on YouTube target network. Extensive comparisons against multiple time aware and cross-network baselines show that the proposed solution is superior in terms of accuracy, novelty and diversity. Furthermore, experiments conducted on the popular MovieLens dataset suggest that the proposed listwise ranking method outperforms existing state-of-the-art ranking techniques.

#25 Minimizing the Bag-of-Ngrams Difference for Non-Autoregressive Neural Machine Translation [PDF] [Copy] [Kimi]

Authors: Chenze Shao ; Jinchao Zhang ; Yang Feng ; Fandong Meng ; Jie Zhou

Non-Autoregressive Neural Machine Translation (NAT) achieves significant decoding speedup through generating target words independently and simultaneously. However, in the context of non-autoregressive translation, the word-level cross-entropy loss cannot model the target-side sequential dependency properly, leading to its weak correlation with the translation quality. As a result, NAT tends to generate influent translations with over-translation and under-translation errors. In this paper, we propose to train NAT to minimize the Bag-of-Ngrams (BoN) difference between the model output and the reference sentence. The bag-of-ngrams training objective is differentiable and can be efficiently calculated, which encourages NAT to capture the target-side sequential dependency and correlates well with the translation quality. We validate our approach on three translation tasks and show that our approach largely outperforms the NAT baseline by about 5.0 BLEU scores on WMT14 En↔De and about 2.5 BLEU scores on WMT16 En↔Ro.