Social and Information Networks

Date: Thu, 9 May 2024 | Total: 11

#1 Web Intelligence Journal in perspective: an analysis of its two decades trajectory

Authors: Diogenes Ademir Domingos ; Victor Emanuel Santos Moura ; Antonio Fernando Lavareda Jacob Junior ; Fabio Manoel Franca Lobato

The evolution of a thematic area undergoes various changes of perspective and adopts new theoretical approaches that arise from the interactions of the community and a wide range of social needs. The advent of digital technologies, such as social networks, underlines this factor by spreading knowledge and forging links between different communities. Web intelligence is now on the verge of raising questions that broaden the understanding of how artificial intelligence impacts the Web of People, Data, and Things, among other factors. To the best of our knowledge, no study has conducted a longitudinal analysis of the evolution of this community. Thus, in this paper we investigate how Web intelligence has evolved over the last twenty years by carrying out a literature review and bibliometric analysis. Concerning the impact of this research study, increasing attention is devoted to determining the most influential papers in the community by referring to citation networks, and to discovering the most popular and pressing topics through co-citation analysis and keyword co-occurrence. The results obtained can guide the direction of new research projects in the area and update the scope and places of interest found in current trends and the relevant journals.

#2 Correlation and Autocorrelation of Data on Complex Networks

Author: Rudy Arthur

Networks where each node has one or more associated numerical values are common in applications. This work studies how summary statistics used for the analysis of spatial data can be applied to non-spatial networks for the purposes of exploratory data analysis. We focus primarily on Moran-type statistics and discuss measures of global autocorrelation, local autocorrelation and global correlation. We introduce null models based on fixing edges and permuting the data or fixing the data and permuting the edges. We demonstrate the use of these statistics on real and synthetic node-valued networks.
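
As an illustration of the kind of statistic this abstract describes, the following is a minimal sketch of global Moran's I on an unweighted, undirected network, paired with a null model that fixes the edges and permutes the node data. It uses the textbook form of Moran's I with binary edge weights; the function names, the toy graph, and the one-sided p-value convention are assumptions made for this example and are not taken from the paper.

```python
import random


def morans_i(edges, values):
    """Global Moran's I for an undirected, unweighted graph.

    edges  : list of (i, j) node-index pairs, each undirected edge listed once
    values : list of numbers, values[i] is the datum attached to node i
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    # Each undirected edge counts twice in the weight matrix (i->j and j->i).
    num = 2.0 * sum(dev[i] * dev[j] for i, j in edges)
    w_total = 2.0 * len(edges)
    return (n / w_total) * num / denom


def permutation_test(edges, values, n_perm=999, seed=0):
    """Null model: keep the edges fixed and shuffle the node values.

    Returns the observed statistic and a one-sided p-value for
    positive autocorrelation.
    """
    rng = random.Random(seed)
    observed = morans_i(edges, values)
    shuffled = list(values)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(edges, shuffled) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)


# Toy example: a path 0-1-2-3 whose two halves carry different values.
path_edges = [(0, 1), (1, 2), (2, 3)]
path_values = [0.0, 0.0, 1.0, 1.0]
stat, p_value = permutation_test(path_edges, path_values)
```

Swapping the shuffle of `values` for a rewiring of `edges` would give the second family of null models the abstract mentions (fixed data, permuted edges); that variant is not shown here.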

#3 Verified authors shape X/Twitter discursive communities

Authors: Stefano Guarino ; Ayoub Mounim ; Guido Caldarelli ; Fabio Saracco

Community detection algorithms try to extract a mesoscale structure from the available network data, generally avoiding any explicit assumption regarding the quantity and quality of information conveyed by specific sets of edges. In this paper, we show that the core of ideological/discursive communities on X/Twitter can be effectively identified by uncovering the most informative interactions in an authors-audience bipartite network through a maximum-entropy null model. The analysis is performed on three X/Twitter datasets related to the main political events of 2022 in Italy, using four state-of-the-art algorithms (three descriptive, one inferential) as benchmarks, and manually annotating nearly 300 verified users based on their political affiliation. In terms of information content, the communities obtained with the entropy-based algorithm are comparable to those obtained with some of the benchmarks. However, applying this methodology to the authors-audience bipartite network: uses just a small sample of the available data to identify the central users of each community; returns a neater partition of the user set into just a few easy-to-interpret communities; and clusters well-known political figures in a way that better matches the political alliances when compared with the benchmarks. Our results provide an important insight into online debates, highlighting that online interaction networks are mostly shaped by the activity of a small set of users who enjoy public visibility even outside social media.

#4 Community detection in multi-layer bipartite networks

Author: Huan Qing

The problem of community detection in multi-layer undirected networks has received considerable attention in recent years. However, practical scenarios often involve multi-layer bipartite networks, where each layer consists of two distinct types of nodes. Existing community detection algorithms tailored for multi-layer undirected networks are not directly applicable to multi-layer bipartite networks. To address this challenge, this paper introduces a novel multi-layer degree-corrected stochastic co-block model specifically designed to capture the underlying community structure within multi-layer bipartite networks. Within this framework, we propose an efficient debiased spectral co-clustering algorithm for detecting nodes' communities. We establish the consistent estimation property of our proposed algorithm and demonstrate that an increased number of layers in bipartite networks improves the accuracy of community detection. Through extensive numerical experiments, we showcase the superior performance of our algorithm compared to existing methods. Additionally, we validate our algorithm by applying it to real-world multi-layer network datasets, yielding meaningful and insightful results.

#5 Understanding High-Order Network Structure using Permissible Walks on Attributed Hypergraphs

Authors: Enzo Battistella ; Sean English ; Robert Green ; Cliff Joslyn ; Evgeniya Lagoda ; Van Magnan ; Audun Myers ; Evan D. Nash ; Michael Robinson

Hypergraphs have been a recent focus of study in mathematical data science as a tool to understand complex networks with high-order connections. One question of particular relevance is how to leverage information carried in hypergraph attributions when doing walk-based techniques. In this work, we focus on a new generalization of a walk in a network that recovers previous approaches and allows for a description of permissible walks in hypergraphs. Permissible walk graphs are constructed by intersecting the attributed $s$-line graph of a hypergraph with a relation respecting graph. The attribution of the hypergraph's line graph commonly carries over information from categorical and temporal attributions of the original hypergraph. To demonstrate this approach on a temporally attributed example, we apply our framework to a Reddit data set composed of hyperedges as threads and authors as nodes where post times are tracked.
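
The construction described in this abstract can be sketched roughly as follows. The code builds the $s$-line graph of a small hypergraph (hyperedges become vertices, joined when they share at least $s$ nodes) and then keeps only the directed steps that respect a relation on hyperedge attributes. The choice of "earlier or equal timestamp" as the relation, the function names, and the toy thread data are all assumptions for illustration; the paper's actual definitions may differ in detail.

```python
from itertools import combinations


def s_line_graph(hyperedges, s=1):
    """Return the s-line graph as a set of unordered pairs (a, b) with a < b.

    hyperedges : dict mapping a hyperedge name to the set of nodes it contains
    Two hyperedges are adjacent when they share at least s nodes.
    """
    pairs = set()
    for a, b in combinations(sorted(hyperedges), 2):
        if len(hyperedges[a] & hyperedges[b]) >= s:
            pairs.add((a, b))
    return pairs


def permissible_walk_graph(hyperedges, attribute, relation, s=1):
    """Keep only directed steps a -> b for which relation(attr[a], attr[b]) holds.

    This intersects the s-line graph with a graph encoding the relation,
    here checked pair by pair (an illustrative reading of the abstract).
    """
    arcs = set()
    for a, b in s_line_graph(hyperedges, s):
        if relation(attribute[a], attribute[b]):
            arcs.add((a, b))
        if relation(attribute[b], attribute[a]):
            arcs.add((b, a))
    return arcs


# Hypothetical toy data: threads as hyperedges, authors as nodes.
threads = {
    "t1": {"ann", "bob"},
    "t2": {"bob", "cat"},
    "t3": {"dan"},
}
post_time = {"t1": 1, "t2": 2, "t3": 3}

# A walk may only move forward in time (assumed relation).
arcs = permissible_walk_graph(threads, post_time, lambda x, y: x <= y, s=1)
```

Here `t1` and `t2` share the author `bob`, so the only permissible step is from `t1` to `t2`; `t3` shares no author with the others and stays isolated.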

#6 "Community Guidelines Make this the Best Party on the Internet": An In-Depth Study of Online Platforms' Content Moderation Policies

Authors: Brennan Schaffner ; Arjun Nitin Bhagoji ; Siyuan Cheng ; Jacqueline Mei ; Jay L. Shen ; Grace Wang ; Marshini Chetty ; Nick Feamster ; Genevieve Lakier ; Chenhao Tan

Moderating user-generated content on online platforms is crucial for balancing user safety and freedom of speech. Particularly in the United States, platforms are not subject to legal constraints prescribing permissible content. Each platform has thus developed bespoke content moderation policies, but there is little work towards a comparative understanding of these policies across platforms and topics. This paper presents the first systematic study of these policies from the 43 largest online platforms hosting user-generated content, focusing on policies around copyright infringement, harmful speech, and misleading content. We build a custom web-scraper to obtain policy text and develop a unified annotation scheme to analyze the text for the presence of critical components. We find significant structural and compositional variation in policies across topics and platforms, with some variation attributable to disparate legal groundings. We lay the groundwork for future studies of ever-evolving content moderation policies and their impact on users.

#7 Network mutual information measures for graph similarity

Authors: Helcio Felippe ; Federico Battiston ; Alec Kirkley

A wide range of tasks in exploratory network analysis and machine learning, such as clustering network populations or identifying anomalies in temporal graph streams, require a measure of the similarity between two graphs. To provide a meaningful data summary for downstream scientific analyses, the graph similarity measures used in these unsupervised settings must be principled, interpretable, and capable of distinguishing meaningful overlapping network structure from statistical noise at different scales of interest. Here we derive a family of graph mutual information measures that satisfy these criteria and are constructed using only fundamental information theoretic principles. Our measures capture the information shared among networks according to different encodings of their structural information, with our mesoscale mutual information measure allowing for network comparison under any specified network coarse-graining. We test our measures in a range of applications on real and synthetic network data, finding that they effectively highlight intuitive aspects of network similarity across scales in a variety of systems.

#8 Combining Rollout Designs and Clustering for Causal Inference under Low-order Interference

Authors: Mayleen Cortez-Rodriguez ; Matthew Eichhorn ; Christina Lee Yu

Estimating causal effects under interference is pertinent to many real-world settings. However, the true interference network may be unknown to the practitioner, precluding many existing techniques that leverage this information. A recent line of work with low-order potential outcomes models uses staggered rollout designs to obtain unbiased estimators that require no network information. However, their use of polynomial extrapolation can lead to prohibitively high variance. To address this, we propose a two-stage experimental design that restricts treatment rollout to a sub-population. We analyze the bias and variance of an interpolation-style estimator under this experimental design. Through numerical simulations, we explore the trade-off between the error attributable to the subsampling of our experimental design and the extrapolation of the estimator. Under low-order interactions models with degree greater than 1, the proposed design greatly reduces the error of the polynomial interpolation estimator, such that it outperforms baseline estimators, especially when the treatment probability is small.
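
To make the variance concern concrete, here is a minimal sketch of a polynomial-extrapolation estimate from a staggered rollout. It assumes that the population-average outcome is a polynomial of degree $\beta$ in the treatment probability $p$, that it is observed at $\beta + 1$ rollout stages, and that the target is the difference between full treatment ($p = 1$) and no treatment ($p = 0$). The function names and the numbers are hypothetical, and the sketch omits the clustering and subsampling stage that the paper proposes.

```python
def lagrange_weights(points, x):
    """Lagrange basis polynomials for `points`, each evaluated at x."""
    weights = []
    for i, p_i in enumerate(points):
        w = 1.0
        for j, p_j in enumerate(points):
            if j != i:
                w *= (x - p_j) / (p_i - p_j)
        weights.append(w)
    return weights


def total_treatment_effect(rollout_probs, mean_outcomes):
    """Fit the polynomial through the observed stages and return f(1) - f(0).

    rollout_probs : treatment probabilities used at each rollout stage
    mean_outcomes : average observed outcome at each stage
    """
    at_one = lagrange_weights(rollout_probs, 1.0)
    at_zero = lagrange_weights(rollout_probs, 0.0)
    return sum((a - b) * y for a, b, y in zip(at_one, at_zero, mean_outcomes))


# Hypothetical noiseless example with a degree-2 outcome curve.
def mean_outcome(p):
    return 1.0 + 2.0 * p + 3.0 * p * p


probs = [0.0, 0.1, 0.2]                      # small rollout budget
outcomes = [mean_outcome(p) for p in probs]
estimate = total_treatment_effect(probs, outcomes)   # close to 5.0 = f(1) - f(0)

# Extrapolating from p <= 0.2 out to p = 1 uses large weights of mixed sign
# (roughly 36, -80, 45), so noise in the stage averages is strongly amplified.
weights_at_one = lagrange_weights(probs, 1.0)
```

With noiseless inputs the estimate recovers the true effect, but the size of the weights shows why small treatment probabilities lead to high variance once the stage averages are noisy.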

#9 Adversarial Threats to Automatic Modulation Open Set Recognition in Wireless Networks

Authors: Yandie Yang ; Sicheng Zhang ; Kuixian Li ; Qiao Tian ; Yun Lin

Automatic Modulation Open Set Recognition (AMOSR) is a crucial technological approach for cognitive radio communications, wireless spectrum management, and interference monitoring within wireless networks. Numerous studies have shown that Automatic Modulation Recognition (AMR) is highly susceptible to minimal perturbations carefully designed by malicious attackers, leading to misclassification of signals. However, the adversarial security of AMOSR has not yet been explored. This paper adopts the perspective of attackers and proposes an Open Set Adversarial Attack (OSAttack), aiming to investigate the adversarial vulnerabilities of various AMOSR methods. First, an adversarial threat model for AMOSR scenarios is established. Next, by analyzing the decision criteria of both discriminative and generative open set recognition, OSFGSM and OSPGD are proposed to reduce the performance of AMOSR. Finally, the influence of OSAttack on AMOSR is evaluated using a range of qualitative and quantitative indicators. The results indicate that despite the increased resistance of AMOSR models to conventional interference signals, they remain vulnerable to attacks by adversarial examples.

#10 Urban Boundary Delineation from Commuting Data with Bayesian Stochastic Blockmodeling: Scale, Contiguity, and Hierarchy

Authors: Sebastian Morel-Balbi ; Alec Kirkley

A common method for delineating urban and suburban boundaries is to identify clusters of spatial units that are highly interconnected in a network of commuting flows, each cluster signaling a cohesive economic submarket. It is critical that the clustering methods employed for this task are principled and free of unnecessary tunable parameters to avoid unwanted inductive biases while remaining scalable for high resolution mobility networks. Here we systematically assess the benefits and limitations of a wide array of Stochastic Block Models (SBMs), a family of principled, nonparametric models for identifying clusters in networks, for delineating urban spatial boundaries with commuting data. We find that the data compression capability and relative performance of different SBM variants depend heavily on the spatial extent of the commuting network, its aggregation scale, and the method used for weighting network edges. We also construct a new measure to assess the degree to which community detection algorithms find spatially contiguous partitions, finding that traditional SBMs may produce substantial spatial discontiguities that make them challenging to use in general for urban boundary delineation. We propose a fast nonparametric regionalization algorithm that can alleviate this issue, achieving data compression close to that of unconstrained SBMs while ensuring spatial contiguity, benefiting from a deterministic optimization procedure, and being generalizable to a wide range of community detection objective functions.

#11 Hypergraph-enhanced Dual Semi-supervised Graph Classification

Authors: Wei Ju ; Zhengyang Mao ; Siyu Yi ; Yifang Qin ; Yiyang Gu ; Zhiping Xiao ; Yifan Wang ; Xiao Luo ; Ming Zhang

In this paper, we study semi-supervised graph classification, which aims at accurately predicting the categories of graphs in scenarios with limited labeled graphs and abundant unlabeled graphs. Despite the promising capability of graph neural networks (GNNs), they typically require a large number of costly labeled graphs, while a wealth of unlabeled graphs fail to be effectively utilized. Moreover, GNNs are inherently limited to encoding local neighborhood information using message-passing mechanisms, thus lacking the ability to model higher-order dependencies among nodes. To tackle these challenges, we propose a Hypergraph-Enhanced DuAL framework named HEAL for semi-supervised graph classification, which captures graph semantics from the perspective of the hypergraph and the line graph, respectively. Specifically, to better explore the higher-order relationships among nodes, we design a hypergraph structure learning to adaptively learn complex node dependencies beyond pairwise relations. Meanwhile, based on the learned hypergraph, we introduce a line graph to capture the interaction between hyperedges, thereby better mining the underlying semantic structures. Finally, we develop a relational consistency learning to facilitate knowledge transfer between the two branches and provide better mutual guidance. Extensive experiments on real-world graph datasets verify the effectiveness of the proposed method against existing state-of-the-art methods.