IJCAI.2022 - Multidisciplinary Topics and Applications

| Total: 32

#1 Subsequence-based Graph Routing Network for Capturing Multiple Risk Propagation Processes

Authors: Rui Cheng, Qing Li

In finance, the risk of an entity depends not only on its historical information but also on the risk propagated by its related peers. Pilot studies rely on Graph Neural Networks (GNNs) to model this risk propagation, where each entity is treated as a node and represented by its time-series information. However, conventional GNNs are constrained by their unified messaging mechanism with an assumption that the risk of a given entity only propagates to its related peers with the same time lag and has the same effect, which is against the ground truth. In this study, we propose the subsequence-based graph routing network (S-GRN) for capturing the variant risk propagation processes among different time-series represented entities. In S-GRN, the messaging mechanism between each node pair is dynamically and independently selected from multiple messaging mechanisms based on the dependencies of variant subsequence patterns. The S-GRN is extensively evaluated on two synthetic tasks and three real-world datasets and demonstrates state-of-the-art performance.


#2 3E-Solver: An Effortless, Easy-to-Update, and End-to-End Solver with Semi-Supervised Learning for Breaking Text-Based Captchas

Authors: Xianwen Deng, Ruijie Zhao, Yanhao Wang, Libo Chen, Yijun Wang, Zhi Xue

Text-based captchas are currently the most widely used security mechanism. Due to the limitations and specificity of segmentation algorithms, early segmentation-based attack methods cannot deal with current captchas that have newly introduced security features (e.g., occluding lines and overlapping). Recently, some works have designed captcha solvers based on deep learning methods with powerful feature extraction capabilities, which have greater generality and higher accuracy. However, these works still suffer from two main intrinsic limitations: (1) labeling the training data requires substantial labor, and (2) the solver cannot be updated with unlabeled data to recognize captchas more accurately. In this paper, we present a novel solver using improved FixMatch for semi-supervised captcha recognition to tackle these problems. Specifically, we first build an end-to-end baseline model to effectively break text-based captchas by leveraging an encoder-decoder architecture and an attention mechanism. Then we construct our solver with a few labeled samples and many unlabeled samples using improved FixMatch, which introduces teacher forcing, adaptive batch normalization, and consistency loss to achieve more effective training. Experimental results show that our solver outperforms state-of-the-art methods by a large margin on current captcha schemes. We hope that our work can help security experts to revisit the design and usability of text-based captchas. The source code of this work is available at https://github.com/SJTU-dxw/3E-Solver-CAPTCHA.


#3 Placing Green Bridges Optimally, with Habitats Inducing Cycles

Authors: Maike Herkenrath, Till Fluschnik, Francesco Grothe, Leon Kellerhals

Choosing the placement of wildlife crossings (i.e., green bridges) to reconnect animal species' fragmented habitats is among the 17 goals towards sustainable development by the UN. We consider the following established model: Given a graph whose vertices represent the fragmented habitat areas and whose weighted edges represent possible green bridge locations, as well as the habitable vertex set for each species, find the cheapest set of edges such that each species' habitat is connected. We study this problem from a theoretical (algorithms and complexity) and an experimental perspective, while focusing on the case where habitats induce cycles. We prove that the NP-hardness persists in this case even if the graph structure is restricted. If the habitats additionally induce faces in plane graphs, however, the problem becomes efficiently solvable. In our empirical evaluation, we compare this algorithm, ILP formulations for more general variants, and an approximation algorithm with one another. Our evaluation underlines that each specialization is beneficial in terms of running time, whereas the approximation provides highly competitive solutions in practice.
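To make the model in the abstract above concrete, here is a minimal brute-force sketch. It is not the authors' algorithm: it simply enumerates edge subsets on a tiny hypothetical instance and keeps the cheapest one under which every habitat is connected. The function names, the example graph, and the choice to test connectivity using only chosen edges with both endpoints inside the habitat are all assumptions made for illustration, and the exponential enumeration only works for a handful of edges.

```python
from itertools import combinations


def habitat_connected(habitat, chosen_edges):
    """Return True if `habitat` is connected using only chosen edges inside it."""
    habitat = set(habitat)
    if len(habitat) <= 1:
        return True
    adjacency = {v: set() for v in habitat}
    for u, v in chosen_edges:
        # Assumption: only edges with both endpoints in the habitat count.
        if u in habitat and v in habitat:
            adjacency[u].add(v)
            adjacency[v].add(u)
    start = next(iter(habitat))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen == habitat


def cheapest_green_bridges(weighted_edges, habitats):
    """Brute-force the cheapest edge set that connects every habitat.

    weighted_edges: dict mapping (u, v) -> cost of building that bridge
    habitats: list of vertex collections, one per species
    """
    edges = list(weighted_edges)
    best_cost, best_set = None, None
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            if all(habitat_connected(h, subset) for h in habitats):
                cost = sum(weighted_edges[e] for e in subset)
                if best_cost is None or cost < best_cost:
                    best_cost, best_set = cost, set(subset)
    return best_cost, best_set


# Hypothetical instance: a 4-cycle a-b-c-d plus an expensive chord a-c.
EXAMPLE_EDGES = {
    ("a", "b"): 2,
    ("b", "c"): 2,
    ("c", "d"): 1,
    ("d", "a"): 1,
    ("a", "c"): 5,
}
EXAMPLE_HABITATS = [{"a", "b", "c"}, {"a", "c", "d"}]

cost, bridges = cheapest_green_bridges(EXAMPLE_EDGES, EXAMPLE_HABITATS)
```

In this toy instance the two habitats share the chord a-c, but it is expensive enough that connecting each habitat through its own cheap edges is the better choice.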


#4 Membership Inference via Backdooring

Authors: Hongsheng Hu, Zoran Salčić, Gillian Dobbie, Jinjun Chen, Lichao Sun, Xuyun Zhang

Recently issued data privacy regulations like GDPR (General Data Protection Regulation) grant individuals the right to be forgotten. In the context of machine learning, this requires a model to forget about a training data sample if requested by the data owner (i.e., machine unlearning). As an essential step prior to machine unlearning, it is still a challenge for a data owner to tell whether or not her data have been used by an unauthorized party to train a machine learning model. Membership inference is a recently emerging technique to identify whether a data sample was used to train a target model, and seems to be a promising solution to this challenge. However, straightforward adoption of existing membership inference approaches fails to address the challenge effectively, because they were originally designed for attacking membership privacy and suffer from several severe limitations such as low inference accuracy on well-generalized models. In this paper, we propose a novel membership inference approach inspired by the backdoor technology to address this challenge. Specifically, our approach of Membership Inference via Backdooring (MIB) leverages the key observation that a backdoored model behaves very differently from a clean model when predicting on deliberately marked samples created by a data owner. Appealingly, MIB requires data owners to mark only a small number of samples for membership inference and needs only black-box access to the target model, with theoretical guarantees for inference results. We perform extensive experiments on various datasets and deep neural network architectures, and the results validate the efficacy of our approach, e.g., marking only 0.1% of the training dataset is practically sufficient for effective membership inference.


#5 A Universal PINNs Method for Solving Partial Differential Equations with a Point Source

Authors: Xiang Huang, Hongsheng Liu, Beiji Shi, Zidong Wang, Kang Yang, Yang Li, Min Wang, Haotian Chu, Jing Zhou, Fan Yu, Bei Hua, Bin Dong, Lei Chen

In recent years, deep learning technology has been used to solve partial differential equations (PDEs), among which the physics-informed neural networks (PINNs) method has emerged as a promising approach for solving both forward and inverse PDE problems. PDEs with a point source that is expressed as a Dirac delta function in the governing equations are mathematical models of many physical processes. However, they cannot be solved directly by the conventional PINNs method due to the singularity brought by the Dirac delta function. In this paper, we propose a universal solution to tackle this problem based on three novel techniques. First, the Dirac delta function is modeled as a continuous probability density function to eliminate the singularity at the point source; second, a lower bound constrained uncertainty weighting algorithm is proposed to balance the physics-informed loss terms of the point source area and the remaining areas; and third, a multi-scale deep neural network with a periodic activation function is used to improve the accuracy and convergence speed. We evaluate the proposed method with three representative PDEs, and the experimental results show that our method outperforms existing deep learning based methods with respect to accuracy, efficiency, and versatility.
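The first technique in the abstract above, replacing the Dirac delta with a continuous probability density, can be illustrated with a short sketch. The abstract does not say which density is used, so the Gaussian below, along with the names `gaussian_point_source` and `integrate` and the chosen widths, is an assumption for illustration only. The sketch checks the two properties that make such a density a reasonable stand-in: it integrates to roughly one, and its peak grows as the width shrinks while staying finite.

```python
import math


def gaussian_point_source(x, x0=0.0, sigma=0.05):
    """Gaussian density centred at x0; approaches a Dirac delta as sigma -> 0."""
    z = (x - x0) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def integrate(f, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


# The smoothed source keeps unit mass, like the delta it replaces.
total_mass = integrate(gaussian_point_source, -1.0, 1.0)

# Narrower widths give taller but still finite peaks (no singularity).
peak_wide = gaussian_point_source(0.0, sigma=0.05)
peak_narrow = gaussian_point_source(0.0, sigma=0.01)
```

The width `sigma` trades off fidelity to the true point source against how steep a function the network has to fit.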


#6 A Polynomial-time Decentralised Algorithm for Coordinated Management of Multiple Intersections

Authors: Tatsuya Iwase, Sebastian Stein, Enrico H. Gerding, Archie Chapman

Autonomous intersection management has the potential to reduce road traffic congestion and energy consumption. To realize this potential, efficient algorithms are needed. However, most existing studies locally optimize one intersection at a time, and this can cause negative externalities on the traffic network as a whole. Here, we focus on coordinating multiple intersections, and formulate the problem as a distributed constraint optimisation problem (DCOP). We consider three utility design approaches that trade off efficiency and fairness. Our polynomial-time algorithm for coordinating multiple intersections reduces the traffic delay by about 41 percentage points compared to independent single intersection management approaches.


#7 Multi-Agent Reinforcement Learning for Traffic Signal Control through Universal Communication Method

Authors: Qize Jiang, Minhao Qin, Shengmin Shi, Weiwei Sun, Baihua Zheng

Coordinating the communication among intersections effectively in real, complex traffic scenarios with multiple intersections is challenging. Existing approaches only enable the communication in a heuristic manner without considering the content/importance of information to be shared. In this paper, we propose a universal communication form UniComm between intersections. UniComm embeds massive observations collected at one agent into crucial predictions of their impact on its neighbors, which improves the communication efficiency and is universal across existing methods. We also propose a concise network UniLight to make full use of communications enabled by UniComm. Experimental results on real datasets demonstrate that UniComm universally improves the performance of existing state-of-the-art methods, and UniLight significantly outperforms existing methods on a wide range of traffic situations. Source codes are available at https://github.com/zyr17/UniLight.


#8 Cumulative Stay-time Representation for Electronic Health Records in Medical Event Time Prediction

Authors: Takayuki Katsuki, Kohei Miyaguchi, Akira Koseki, Toshiya Iwamori, Ryosuke Yanagiya, Atsushi Suzuki

We address the problem of predicting when a disease will develop, i.e., medical event time (MET), from a patient's electronic health record (EHR). The MET of non-communicable diseases like diabetes is highly correlated to cumulative health conditions, more specifically, how much time the patient spent with specific health conditions in the past. The common time-series representation is indirect in extracting such information from EHR because it focuses on detailed dependencies between values in successive observations, not cumulative information. We propose a novel data representation for EHR called cumulative stay-time representation (CTR), which directly models such cumulative health conditions. We derive a trainable construction of CTR based on neural networks that has the flexibility to fit the target data and scalability to handle high-dimensional EHR. Numerical experiments using synthetic and real-world datasets demonstrate that CTR alone achieves a high prediction performance, and it enhances the performance of existing models when combined with them.
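As a rough illustration of what "how much time the patient spent with specific health conditions" could mean, the sketch below accumulates the time a single measurement spent in each of several fixed value ranges. This is a simplified, hand-binned version written under stated assumptions: the abstract describes a trainable neural construction, which is not reproduced here, and the function name, bin edges, and readings are hypothetical.

```python
def cumulative_stay_time(times, values, bin_edges):
    """Total time spent in each value bin.

    Each observed value is assumed to hold until the next observation.
    times: increasing observation times
    values: observed value at each time
    bin_edges: increasing edges; bin k covers [bin_edges[k], bin_edges[k + 1])
    """
    stay = [0.0] * (len(bin_edges) - 1)
    for i in range(len(times) - 1):
        duration = times[i + 1] - times[i]
        for k in range(len(stay)):
            if bin_edges[k] <= values[i] < bin_edges[k + 1]:
                stay[k] += duration
                break
    return stay


# Hypothetical readings (arbitrary units) taken at the given days.
obs_times = [0, 10, 30, 60, 100]
obs_values = [95, 130, 110, 150, 140]
edges = [0, 100, 126, 1000]

stay_times = cumulative_stay_time(obs_times, obs_values, edges)
# stay_times is [10.0, 30.0, 60.0]: days spent in the low, middle, and high range.
```

Unlike a raw time series, this vector ignores the order of observations and keeps only the accumulated exposure to each range.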


#9 Self-Supervised Learning with Attention-based Latent Signal Augmentation for Sleep Staging with Limited Labeled Data

Authors: Harim Lee, Eunseon Seong, Dong-Kyu Chae

Sleep staging is an important task that enables sleep quality assessment and disorder diagnosis. Due to the dependency on manually labeled data, many studies have turned from supervised approaches to self-supervised learning (SSL) for sleep staging. While existing SSL methods have made significant progress and achieve performance comparable to supervised methods, there are still some limitations. Contrastive learning could potentially lead to false negative pair assignments in sleep signal data. Moreover, existing data augmentation techniques directly modify the original signal data, making it likely to lose important information. To mitigate these issues, we propose Self-Supervised Learning with Attention-aided Positive Pairs (SSLAPP). Instead of contrastive learning, SSLAPP carefully draws high-quality positive pairs and exploits them in representation learning. Here, we propose attention-based latent signal augmentation, which plays a key role by capturing important features without losing valuable signal information. Experimental results show that our proposed method achieves state-of-the-art performance in sleep stage classification with limited labeled data. The code is available at: https://github.com/DILAB-HYU/SSLAPP


#10 Learning Curricula for Humans: An Empirical Study with Puzzles from The Witness

Authors: Levi H.S. Lelis, João G.G.V. Nova, Eugene Chen, Nathan R. Sturtevant, Carrie Demmans Epp, Michael Bowling

The combination of tree search and neural networks has achieved super-human performance in challenging domains. We are interested in transferring to humans the knowledge these learning systems generate. We hypothesize the process in which neural-guided tree search algorithms learn how to solve a set of problems can be used to generate curricula for helping human learners. In this paper we show how the Bootstrap learning system can be modified to learn curricula for humans in a puzzle domain. We evaluate our system in two curriculum learning settings. First, given a small set of problem instances, our system orders the instances to ease the learning process of human learners. Second, given a large set of problem instances, our system returns a small ordered subset of the initial set that can be presented to human learners. We evaluate our curricula with a user study where participants learn how to solve a class of puzzles from the game 'The Witness'. The user-study results suggest one of the curricula our system generates compares favorably with simple baselines and is competitive with the curriculum from the original 'The Witness' game in terms of user retention and effort.


#11 Transformer-based Objective-reinforced Generative Adversarial Network to Generate Desired Molecules

Authors: Chen Li, Chikashige Yamanaka, Kazuma Kaitoh, Yoshihiro Yamanishi

Deep generative models of sequence-structure data have attracted widespread attention in drug discovery. However, such models cannot fully extract the semantic features of molecules from sequential representations. Moreover, mode collapse reduces the diversity of the generated molecules. This paper proposes a transformer-based objective-reinforced generative adversarial network (TransORGAN) to generate molecules. TransORGAN leverages a transformer architecture as a generator and uses a stochastic policy gradient for reinforcement learning to generate plausible molecules with rich semantic features. The discriminator grants rewards that guide the policy update of the generator, while an objective-reinforced penalty encourages the generation of diverse molecules. Experiments were performed using the ZINC chemical dataset, and the results demonstrated the usefulness of TransORGAN in terms of uniqueness, novelty, and diversity of the generated molecules.


#12 Towards Controlling the Transmission of Diseases: Continuous Exposure Discovery over Massive-Scale Moving Objects

Authors: Ke Li, Lisi Chen, Shuo Shang, Haiyan Wang, Yang Liu, Panos Kalnis, Bin Yao

Infectious diseases have been recognized as major public health concerns for decades. Close contact discovery is playing an indispensable role in preventing epidemic transmission. In this light, we study the continuous exposure search problem: Given a collection of moving objects and a collection of moving queries, we continuously discover all objects that have been directly and indirectly exposed to at least one query over a period of time. Our problem targets a variety of applications, including but not limited to disease control, epidemic pre-warning, information spreading, and co-movement mining. To answer this problem, we develop an exact group processing algorithm with optimization strategies. Further, we propose an approximate algorithm that substantially improves the efficiency without false dismissal. Extensive experiments offer insight into effectiveness and efficiency of our proposed algorithms.


#13 Distilling Governing Laws and Source Input for Dynamical Systems from Videos

Authors: Lele Luan, Yang Liu, Hao Sun

Distilling interpretable physical laws from videos has recently attracted growing interest in the computer vision community thanks to the advances in deep learning, but it still remains a great challenge. This paper introduces an end-to-end unsupervised deep learning framework to uncover the explicit governing equations of dynamics presented by moving object(s), based on recorded videos. Instead of in the pixel (spatial) coordinate system of image space, the physical law is modeled in a regressed underlying physical coordinate system where the physical states follow potential explicit governing equations. A numerical integrator-based sparse regression module is designed; it serves as a physical constraint on the autoencoder and coordinate system regression and, at the same time, uncovers the parsimonious closed-form governing equations from the learned physical states. Experiments on simulated dynamical scenes show that the proposed method is able to distill closed-form governing equations and simultaneously identify unknown excitation input for several dynamical systems recorded by videos, which fills a gap in the literature, where no existing methods are available and applicable for solving this type of problem.


#14 Monolith to Microservices: Representing Application Software through Heterogeneous Graph Neural Network

Authors: Alex Mathai, Sambaran Bandyopadhyay, Utkarsh Desai, Srikanth Tamilselvam

Monolithic software encapsulates all functional capabilities into a single deployable unit. But managing it becomes harder as the demand for new functionalities grows. Microservice architecture is seen as an alternative, as it advocates building an application through a set of loosely coupled small services wherein each service owns a single functional responsibility. But the challenges associated with the separation of functional modules slow down the migration of monolithic code into microservices. In this work, we propose a representation learning based solution to tackle this problem. We use a heterogeneous graph to jointly represent software artifacts (like programs and resources) and the different relationships they share (function calls, inheritance, etc.), and perform a constraint-based clustering through a novel heterogeneous graph neural network. Experimental studies show that our approach is effective on monoliths of different types.


#15 Learn Continuously, Act Discretely: Hybrid Action-Space Reinforcement Learning For Optimal Execution

Authors: Feiyang Pan, Tongzhe Zhang, Ling Luo, Jia He, Shuoling Liu

Optimal execution is a sequential decision-making problem for cost-saving in algorithmic trading. Studies have found that reinforcement learning (RL) can help decide the order-splitting sizes. However, a problem remains unsolved: how to place limit orders at appropriate limit prices? The key challenge lies in the "continuous-discrete duality" of the action space. On the one hand, the continuous action space using percentage changes in prices is preferred for generalization. On the other hand, the trader eventually needs to choose limit prices discretely due to the existence of the tick size, which requires specialization for every single stock with different characteristics (e.g., the liquidity and the price range). So we need continuous control for generalization and discrete control for specialization. To this end, we propose a hybrid RL method to combine the advantages of both of them. We first use a continuous control agent to scope an action subset, then deploy a fine-grained agent to choose a specific limit price. Extensive experiments show that our method has higher sample efficiency and better training stability than existing RL algorithms and significantly outperforms previous learning-based methods for order execution.


#16 Communicative Subgraph Representation Learning for Multi-Relational Inductive Drug-Gene Interaction Prediction

Authors: Jiahua Rao, Shuangjia Zheng, Sijie Mai, Yuedong Yang

Illuminating the interconnections between drugs and genes is an important topic in drug development and precision medicine. Currently, computational predictions of drug-gene interactions mainly focus on the binding interactions without considering other relation types like agonist, antagonist, etc. In addition, existing methods either heavily rely on high-quality domain features or are intrinsically transductive, which limits the capacity of models to generalize to drugs/genes that lack external information or are unseen during the training process. To address these problems, we propose a novel Communicative Subgraph representation learning for Multi-relational Inductive drug-Gene interactions prediction (CoSMIG), where the predictions of drug-gene relations are made through subgraph patterns, and thus are naturally inductive for unseen drugs/genes without retraining or utilizing external domain features. Moreover, the model strengthened the relations on the drug-gene graph through a communicative message passing mechanism. To evaluate our method, we compiled two new benchmark datasets from DrugBank and DGIdb. The comprehensive experiments on the two datasets showed that our method outperformed state-of-the-art baselines in the transductive scenarios and achieved superior performance in the inductive ones. Further experimental analysis including LINCS experimental validation and literature verification also demonstrated the value of our model.


#17 FOGS: First-Order Gradient Supervision with Learning-based Graph for Traffic Flow Forecasting

Authors: Xuan Rao, Hao Wang, Liang Zhang, Jing Li, Shuo Shang, Peng Han

Traffic flow forecasting plays a vital role in the transportation domain. Existing studies usually manually construct correlation graphs and design sophisticated models for learning spatial and temporal features to predict future traffic states. However, manually constructed correlation graphs cannot accurately extract the complex patterns hidden in the traffic data. In addition, it is challenging for the prediction model to fit traffic data due to its irregularly-shaped distribution. To solve these problems, we propose a novel learning-based method to learn a spatial-temporal correlation graph, which makes good use of the traffic flow data. Moreover, we propose First-Order Gradient Supervision (FOGS), a novel method for traffic flow forecasting. FOGS utilizes first-order gradients, rather than specific flows, to train the prediction model, which effectively avoids the problem of fitting irregularly-shaped distributions. Comprehensive numerical evaluations on four real-world datasets reveal that the proposed methods achieve state-of-the-art performance and significantly outperform the benchmarks.
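The idea of supervising on first-order gradients instead of raw flows can be sketched as follows. The abstract does not define the gradient precisely, so the plain consecutive difference used here is an assumption, as are the function names and the sample readings. The sketch shows how training targets would be derived from a flow series and how predicted differences can be turned back into flows from the last observed value.

```python
def first_order_differences(flows):
    """Consecutive differences of a flow series (assumed form of the 'gradient')."""
    return [flows[i + 1] - flows[i] for i in range(len(flows) - 1)]


def reconstruct_flows(last_observed, predicted_differences):
    """Recover flow values by accumulating predicted differences from a known flow."""
    flows = []
    current = last_observed
    for delta in predicted_differences:
        current += delta
        flows.append(current)
    return flows


# Hypothetical flow readings (vehicles per interval) at one sensor.
history = [120, 135, 128, 150, 149]

# Training targets would be these differences rather than the raw flows.
targets = first_order_differences(history)

# With perfect difference predictions, accumulation recovers the original flows.
rebuilt = reconstruct_flows(history[0], targets)
```

A practical consequence of this design is that errors in predicted differences accumulate over the forecast horizon, so the starting value should be an observed flow.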


#18 Offline Vehicle Routing Problem with Online Bookings: A Novel Problem Formulation with Applications to Paratransit

Authors: Amutheezan Sivagnanam, Salah Uddin Kadir, Ayan Mukhopadhyay, Philip Pugliese, Abhishek Dubey, Samitha Samaranayake, Aron Laszka

Vehicle routing problems (VRPs) can be divided into two major categories: offline VRPs, which consider a given set of trip requests to be served, and online VRPs, which consider requests as they arrive in real-time. Based on discussions with public transit agencies, we identify a real-world problem that is not addressed by existing formulations: booking trips with flexible pickup windows (e.g., 3 hours) in advance (e.g., the day before) and confirming tight pickup windows (e.g., 30 minutes) at the time of booking. Such a service model is often required in paratransit service settings, where passengers typically book trips for the next day over the phone. To address this gap between offline and online problems, we introduce a novel formulation, the offline vehicle routing problem with online bookings. This problem is very challenging computationally since it faces the complexity of considering large sets of requests—similar to offline VRPs—but must abide by strict constraints on running time—similar to online VRPs. To solve this problem, we propose a novel computational approach, which combines an anytime algorithm with a learning-based policy for real-time decisions. Based on a paratransit dataset obtained from the public transit agency of Chattanooga, TN, we demonstrate that our novel formulation and computational approach lead to significantly better outcomes in this setting than existing algorithms.


#19 Local Differential Privacy Meets Computational Social Choice - Resilience under Voter Deletion

Authors: Liangde Tao, Lin Chen, Lei Xu, Weidong Shi

The resilience of a voting system has been a central topic in computational social choice. Many voting rules, like plurality, are shown to be vulnerable as the attacker can target specific voters to manipulate the result. What if a local differential privacy (LDP) mechanism is adopted such that the true preference of a voter is never revealed in pre-election polls? In this case, the attacker can only infer stochastic information about a voter's true preference, and this may make manipulating the electoral result significantly harder. The goal of this paper is to provide a quantitative study on the effect of adopting LDP mechanisms on a voting system. We introduce the metric PoLDP (power of LDP) that quantitatively measures the difference between the attacker's manipulation cost under LDP mechanisms and that without LDP mechanisms. The larger PoLDP is, the more robustness LDP mechanisms can add to a voting system. We give a full characterization of PoLDP for the voting system with plurality rule and provide general guidance towards the application of LDP mechanisms.
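To show how an LDP mechanism leaves an attacker with only stochastic information about a voter's preference, here is a minimal sketch of k-ary randomized response, a standard LDP mechanism. The abstract does not name the mechanism the paper analyzes, so this choice is an assumption, and the function names and candidate labels are hypothetical. Each voter reports the true choice with probability e^ε / (e^ε + k − 1) and otherwise reports one of the other k − 1 candidates uniformly at random.

```python
import math
import random


def keep_probability(epsilon, num_candidates):
    """Probability of reporting the true preference under k-ary randomized response."""
    return math.exp(epsilon) / (math.exp(epsilon) + num_candidates - 1)


def randomized_response(true_choice, candidates, epsilon, rng=random):
    """Report the true choice with probability p, else a uniformly random other candidate."""
    p = keep_probability(epsilon, len(candidates))
    if rng.random() < p:
        return true_choice
    others = [c for c in candidates if c != true_choice]
    return rng.choice(others)


# Hypothetical poll with three candidates and privacy budget epsilon = ln(3).
candidates = ["A", "B", "C"]
epsilon = math.log(3)

p_true = keep_probability(epsilon, len(candidates))
p_other = (1.0 - p_true) / (len(candidates) - 1)

# One noisy report from a voter whose true preference is "A".
report = randomized_response("A", candidates, epsilon, rng=random.Random(0))
```

The ratio `p_true / p_other` equals e^ε, which is the bound that ε-LDP places on how much any single report can reveal about the true preference.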


#20 Private Stochastic Convex Optimization and Sparse Learning with Heavy-tailed Data Revisited

Authors: Youming Tao, Yulian Wu, Xiuzhen Cheng, Di Wang

In this paper, we revisit the problem of Differentially Private Stochastic Convex Optimization (DP-SCO) with heavy-tailed data, where the gradient of the loss function has bounded moments. Instead of the previously studied cases where the loss function is Lipschitz or each coordinate of the gradient has a bounded second moment, we consider a relaxed scenario where each coordinate of the gradient only has a bounded (1+v)-th moment with some v∈(0, 1]. First, we start from one-dimensional private mean estimation for heavy-tailed distributions. We propose a novel robust and private mean estimator which is optimal. Based on its idea, we then extend to the general d-dimensional space and study DP-SCO with general convex and strongly convex loss functions. We also provide lower bounds for these two classes of loss under our setting and show that our upper bounds are optimal up to a factor of O(Poly(d)). To address the high dimensionality issue, we also study DP-SCO with heavy-tailed gradient under some sparsity constraint (DP sparse learning). We propose a new method and show it is also optimal up to a factor of O(s*), where s* is the underlying sparsity of the constraint.


#21 Exploring the Vulnerability of Deep Reinforcement Learning-based Emergency Control for Low Carbon Power Systems

Authors: Xu Wan, Lanting Zeng, Mingyang Sun

Decarbonization of global power systems significantly increases the operational uncertainty and modeling complexity, which drives the need to widely exploit cutting-edge Deep Reinforcement Learning (DRL) technologies to realize adaptive and real-time emergency control, the last resort for system stability and resiliency. The vulnerability of the DRL-based emergency control scheme may lead to severe real-world security issues if it is not fully explored before practical implementation. To this end, this is the first work that comprehensively investigates adversarial attacks and defense mechanisms for DRL-based power system emergency control. In particular, recovery-targeted (RT) adversarial attacks are designed for gradient-based approaches, aiming to dramatically degrade the effectiveness of the conducted emergency control actions to prevent the system from restoring to a stable state. Furthermore, the corresponding robust defense (RD) mechanisms are proposed to actively modify the observations based on the distances of sequential states. Experiments are conducted based on the standard IEEE reliability test system, and the results show that security risks indeed exist in the state-of-the-art DRL-based power system emergency control models. The effectiveness, stealthiness, instantaneity, and transferability of the proposed attacks and defense mechanisms are demonstrated with both white-box and black-box settings.


#22 Heterogeneous Interactive Snapshot Network for Review-Enhanced Stock Profiling and Recommendation

Authors: Heyuan Wang, Tengjiao Wang, Shun Li, Shijie Guan, Jiayi Zheng, Wei Chen

Stock recommendation plays a critical role in modern quantitative trading. The large volumes of social media information, such as investment reviews that reflect emotion-driven factors, together with price technical indicators form a “snapshot” of the evolving stock market profile. However, previous studies usually model the temporal trajectories of price and media modalities separately while losing their interrelated influences. Moreover, they mainly extract review semantics via sequential or attentive models, whereas the rich text-associated knowledge is largely neglected. In this paper, we propose a novel heterogeneous interactive snapshot network for stock profiling and recommendation. We model investment reviews in each snapshot as a heterogeneous document graph, and develop a flexible hierarchical attentive propagation framework to capture fine-grained proximity features. Further, to learn stock embedding for ranking, we introduce a novel twins-GRU method, which tightly couples the media and price parallel sequences in a cross-interactive fashion to catch dynamic dependencies between successive snapshots. Our approach outperforms state-of-the-art methods by over 7.6% in terms of cumulative and risk-adjusted returns in trading simulations on both English and Chinese benchmarks.


#23 Adaptive Long-Short Pattern Transformer for Stock Investment Selection

Authors: Heyuan Wang, Tengjiao Wang, Shun Li, Jiayi Zheng, Shijie Guan, Wei Chen

Stock investment selection is a hard problem in the Fintech field due to non-stationary dynamics and complex market interdependencies. Existing studies are mostly based on RNNs, which struggle to capture interactive information among fine-grained volatility patterns. Besides, they either treat stocks as isolated, or presuppose a fixed graph structure heavily relying on prior domain knowledge. In this paper, we propose a novel Adaptive Long-Short Pattern Transformer (ALSP-TF) for stock ranking in terms of expected returns. Specifically, we overcome the limitations of canonical self-attention, including its context and position agnosticism, with two additional capacities: (i) a fine-grained pattern distiller to contextualize queries and keys based on localized feature scales, and (ii) a time-adaptive modulator to make the dependency modeling among pattern pairs sensitive to different time intervals. Attention heads in stacked layers gradually harvest short- and long-term transition traits, spontaneously boosting the diversity of representations. Moreover, we devise a graph self-supervised regularization, which helps automatically assimilate the collective synergy of stocks and improve the generalization ability of the overall model. Experiments on three exchange market datasets show ALSP-TF’s superiority over state-of-the-art stock forecast methods.


#24 Bridging the Gap between Reality and Ideality of Entity Matching: A Revisiting and Benchmark Re-Construction

Authors: Tianshu Wang, Hongyu Lin, Cheng Fu, Xianpei Han, Le Sun, Feiyu Xiong, Hui Chen, Minlong Lu, Xiuwen Zhu

Entity matching (EM) is the most critical step for entity resolution (ER). While current deep learning-based methods achieve very impressive performance on standard EM benchmarks, their real-world application performance is much more disappointing. In this paper, we highlight that this gap between reality and ideality stems from the unreasonable benchmark construction process, which is inconsistent with the nature of entity matching and therefore leads to biased evaluations of current EM approaches. To this end, we build a new EM corpus and re-construct EM benchmarks to challenge critical assumptions implicit in the previous benchmark construction process by changing, step by step, the restricted entities, balanced labels, and single-modal records in previous benchmarks into open entities, imbalanced labels, and multi-modal records in an open environment. Experimental results demonstrate that the assumptions made in the previous benchmark construction process do not hold in the open environment; these assumptions conceal the main challenges of the task and therefore lead to a significant overestimate of the current progress of entity matching. The constructed benchmarks and code are publicly released at https://github.com/tshu-w/ember.


#25 Learnability of Competitive Threshold Models

Authors: Yifan Wang, Guangmo Tong

Modeling the spread of social contagions is central to various applications in social computing. In this paper, we study the learnability of the competitive threshold model from a theoretical perspective. We demonstrate how competitive threshold models can be seamlessly simulated by artificial neural networks with finite VC dimensions, which enables analytical sample complexity and generalization bounds. Based on the proposed hypothesis space, we design efficient algorithms under the empirical risk minimization scheme. The theoretical insights are finally translated into practical and explainable modeling methods, the effectiveness of which is verified through a sanity check over a few synthetic and real datasets. The experimental results promisingly show that our method enjoys a decent performance without using excessive data points, outperforming off-the-shelf methods.