Methodology

2025-06-16 | Total: 21

#1 Modeling Complex Life Systems: Bayesian Inference for Weibull Failure Times Using Adaptive MCMC

Authors: Tobias Oketch, Mohammad Sepehrifar

This research develops a Bayesian framework for analyzing failure times using the Weibull distribution, addressing challenges in prior selection due to the lack of conjugate priors and multi-dimensional sufficient statistics. We propose an adaptive semi-parametric MCMC algorithm for lifetime data analysis, employing a hierarchical Bayesian model and the No-U-Turn Sampler (NUTS) in Stan. Twenty-four combinations of prior distributions are evaluated, with a noninformative LogNormal hyper-prior ensuring flexibility. A simulation study of seventy-two datasets with varying structures compares MCMC and classical methods, identifying optimal priors for Bayesian regularization. The approach effectively handles both the Increasing Hazard Rate (IHR) and Decreasing Hazard Rate (DHR) scenarios. Finally, we demonstrate the algorithm's utility by predicting the remaining lifetime of prostate cancer patients, showcasing its practical application. This work advances Bayesian methodologies for modeling complex life systems and testing processes.

Subjects: Methodology , Computation

Publish: 2025-06-13 16:58:31 UTC
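
Illustration for #1 (not taken from the paper): the IHR and DHR scenarios in the abstract correspond to the Weibull shape parameter being above or below 1. The sketch below assumes the standard two-parameter Weibull with shape `shape` and scale `scale`, and uncensored failure times; the function names are hypothetical, and it does not reproduce the paper's hierarchical model or its Stan code.

```python
import math


def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Hazard h(t) = (shape / scale) * (t / scale) ** (shape - 1) for t > 0."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must all be positive")
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def weibull_log_likelihood(times, shape: float, scale: float) -> float:
    """Log-likelihood of fully observed (uncensored) failure times."""
    total = 0.0
    for t in times:
        total += (
            math.log(shape / scale)
            + (shape - 1.0) * math.log(t / scale)
            - (t / scale) ** shape
        )
    return total


def hazard_trend(shape: float) -> str:
    """Classify the hazard as increasing (IHR), decreasing (DHR) or constant."""
    if shape > 1.0:
        return "IHR"
    if shape < 1.0:
        return "DHR"
    return "constant"
```

A sampler such as NUTS would combine a log-likelihood of this kind with log-prior terms for `shape` and `scale`; the priors themselves are the subject of the paper and are not guessed at here.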


#2 Simultaneous hypothesis testing for comparing many functional means

Authors: Colin Decker, Dehan Kong, Stanislav Volgushev

Data with multiple functional recordings at each observational unit are increasingly common in various fields, including medical imaging and environmental sciences. To conduct inference for such observations, we develop a paired two-sample test that simultaneously compares the means of many functional observations while maintaining family-wise error rate control. We explicitly allow the number of functional recordings to increase, potentially much faster than the sample size. Our test is fully functional and does not rely on dimension reduction, functional PCA type approaches, or the choice of tuning parameters. To provide a theoretical justification for the proposed procedure, we develop a number of new anti-concentration and Gaussian approximation results for maxima of $L^2$ statistics, which might be of independent interest. The methodology is illustrated on task-related cortical surface functional magnetic resonance imaging data from the Human Connectome Project.

Subjects: Methodology , Statistics Theory , Applications

Publish: 2025-06-13 15:37:10 UTC


#3 Bias and Identifiability in the Bounded Confidence Model

Authors: Claudio Borile, Jacopo Lenti, Valentina Ghidini, Corrado Monti, Gianmarco De Francisci Morales

Opinion dynamics models such as the bounded confidence models (BCMs) describe how a population can reach consensus, fragmentation, or polarization, depending on a few parameters. Connecting such models to real-world data could help in understanding such phenomena and in testing model assumptions. To this end, estimation of model parameters is a key aspect, and maximum likelihood estimation provides a principled way to tackle it. Here, our goal is to outline the properties of statistical estimators of the two key BCM parameters: the confidence bound and the convergence rate. We find that their maximum likelihood estimators present different characteristics: the one for the confidence bound presents a small-sample bias but is consistent, while the estimator of the convergence rate shows a persistent bias. Moreover, the joint parameter estimation is affected by identifiability issues for specific regions of the parameter space, as several local maxima are present in the likelihood function. Our results show how the analysis of the likelihood function is a fruitful approach for better understanding the pitfalls and possibilities of estimating the parameters of opinion dynamics models and, more generally, agent-based models, and for offering formal guarantees for their calibration.

Subjects: Methodology , Computers and Society , Machine Learning , Physics and Society

Publish: 2025-06-13 13:04:29 UTC
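
Illustration for #3 (not taken from the paper): a minimal sketch of one pairwise update in a bounded confidence model, assuming the standard Deffuant-Weisbuch form, in which two agents move toward each other only if their opinions differ by less than the confidence bound `epsilon`, at convergence rate `mu`. The paper may use a different variant; the function names and the uniform random pairing are assumptions made for illustration.

```python
import random


def bcm_step(opinions, i, j, epsilon, mu):
    """Apply one pairwise interaction in place; return True if opinions moved."""
    xi, xj = opinions[i], opinions[j]
    if abs(xi - xj) < epsilon:
        # Both agents move toward each other by a fraction mu of their gap.
        opinions[i] = xi + mu * (xj - xi)
        opinions[j] = xj + mu * (xi - xj)
        return True
    return False


def simulate(n_agents, n_steps, epsilon, mu, seed=0):
    """Run n_steps random pairwise interactions from uniform initial opinions."""
    rng = random.Random(seed)
    opinions = [rng.random() for _ in range(n_agents)]
    for _ in range(n_steps):
        i, j = rng.sample(range(n_agents), 2)  # two distinct agents
        bcm_step(opinions, i, j, epsilon, mu)
    return opinions
```

In this form `epsilon` only decides whether an interaction happens and `mu` only decides how far opinions move, which gives some intuition for why the two parameters can behave differently under estimation.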


#4 Methodological Advances and Challenges in Indirect Treatment Comparisons: A Review of International Guidelines and HAS TC Case Studies

Authors: Matthias Monnereau, Ana Jarne, Axel Benoist, Clémence Fradet, Maurice Perol, Thomas Filleron, Louise Baschet

We evaluated the methodological challenges and regulatory considerations of indirect treatment comparisons (ITCs) by analyzing international health technology assessment guidelines and French Transparency Committee (TC) decisions. We conducted a pragmatic review of ITC guidelines from major health technology assessment (HTA) bodies and multistakeholder organizations. Then, we analyzed TC opinions published between 2021 and 2023. We extracted data on ITC methodology, therapeutic areas, acceptability, and limitations expressed by the TC. The targeted review of the main guidelines showed mainly agreements between HTA bodies and multistakeholder organizations, with some specificities. A total of 138 TC opinions containing 195 ITCs were analyzed. Only 13.3% of these ITCs influenced TC decision-making. ITCs were more frequently accepted in genetic diseases (34.4%) than in oncology (10.0%) and autoimmune diseases (11.1%). Methods using individual patient data showed higher acceptance rates (23.1%) than network meta-analyses (4.2%). Main limitations included heterogeneity/bias risk (59%), lack of data (48%), statistical methodology issues (29%), study design concerns (27%), small sample size (25%), and outcome definition variability (20%). When ITCs were the primary source of evidence, the proportion of important clinical benefit was lower (60.9% vs. 73.4%) than when randomized controlled trials were available. While ITCs are increasingly submitted, particularly where direct evidence is impractical, their influence on reimbursement decisions remains limited. There is a need for clear and accessible guides so manufacturers can produce clearer and more robust ITCs that follow regulatory guidelines, from the planning phase to execution.

Subject: Methodology

Publish: 2025-06-13 08:50:00 UTC


#5 A Two-step Estimating Approach for Heavy-tailed AR Models with General Non-zero Median Noises

Authors: Rui She, Linlin Dai, Shiqing Ling

This paper develops a novel two-step estimating procedure for heavy-tailed AR models with non-zero median noises, allowing for time-varying volatility. We first establish the self-weighted quantile regression estimator (SQE) across all quantile levels $\tau\in (0,1)$ for the AR parameters $\theta_{0}$. We show that the SQE, minus a bias, converges weakly to a Gaussian process uniformly at a rate of $n^{-1/2}$. The bias is zero if and only if $\tau$ equals $\tau_{0}$, the probability that the noise is less than zero. Based on the SQE, we propose an approach to estimate $\tau_{0}$ in the second step and feed the estimated $\hat{\tau}_n$ back into the SQE to estimate $\theta_0$. Both the estimated $\tau_{0}$ and $\theta_{0}$ are shown to be consistent and asymptotically normal. A random weighting bootstrap method is developed to approximate the complicated distribution. The problem we study is non-standard because $\tau_{0}$ may not be identifiable in conventional quantile regression, and the usual methods cannot verify the existence of the SQE bias. Unlike existing procedures for heavy-tailed time series, our method does not require any classical identification conditions, such as zero-mean or zero-median.

Subject: Methodology

Publish: 2025-06-13 07:09:32 UTC


#6 Data Integration With Biased Summary Data via Generalized Entropy Balancing

Authors: Kosuke Morikawa, Sho Komukai, Satoshi Hattori

Statistical methods for integrating individual-level data with external summary data have attracted attention because of their potential to reduce data collection costs. Summary data are often accessible through public sources and relatively easy to obtain, making them a practical resource for enhancing the precision of statistical estimation. Typically, these methods assume that internal and external data originate from the same underlying distribution. However, when this assumption is violated, incorporating external data introduces the risk of bias, primarily due to differences in background distributions between the current study and the external source. In practical applications, the primary interest often lies not in statistical quantities related specifically to the external data distribution itself, but in the individual-level internal data. In this paper, we propose a methodology based on generalized entropy balancing, designed to integrate external summary data even if derived from biased samples. Our method demonstrates double robustness, providing enhanced protection against model misspecification. Importantly, the applicability of our method can be assessed directly from the available data. We illustrate the versatility and effectiveness of the proposed estimator through an analysis of Nationwide Public-Access Defibrillation data in Japan.

Subject: Methodology

Publish: 2025-06-13 06:11:21 UTC


#7 A Review and Comparison of Different Sensitivity Analysis Techniques in Practice

Authors: Devin Francom, Abigael Nachtsheim

There exist many methods for sensitivity analysis readily available to the practitioner. While each seeks to help the modeler answer the same general question -- How do sources of uncertainty or changes in the model inputs relate to uncertainty in the output? -- different methods are associated with different assumptions, constraints, and required resources, leading to conclusions that may vary in interpretability and level of detail. Thus, it is crucial that the practitioner selects the desired sensitivity analysis method judiciously, making sure to match the selected approach to the specifics of their problem and to their desired objectives. In this chapter, we provide a practical overview of a collection of widely used, widely available sensitivity analysis methods. We focus on global sensitivity approaches, which seek to characterize how uncertainty in the model output may be allocated to sources of uncertainty in model inputs across the entire input space. Generally, this will require the practitioner to specify a probability distribution over the input space. On the other hand, methods for local sensitivity analysis do not require this specification but they have more limited utility, providing insight into sources of uncertainty associated only with a particular, specified location in the input space. Our hope is that this chapter may serve as a decision-making tool for practitioners, helping to guide the selection of a sensitivity analysis approach that will best fit their needs. To support this goal, we have selected a suite of approaches to cover, which, while not exhaustive, we believe provides a flexible and robust sensitivity analysis toolkit. All methods included are widely used and available in standard software packages.

Subject: Methodology

Publish: 2025-06-13 05:18:47 UTC


#8 Local empirical Bayes correction for Bayesian modeling

Author: Yoshiko Hayashi

The James-Stein estimator has attracted much interest as a shrinkage estimator that yields better estimates than the maximum likelihood estimator. The James-Stein estimator is also very useful as an argument in favor of empirical Bayesian methods. However, for problems involving large-scale data, such as differential gene expression data, the distribution is considered a mixture distribution with different means that cannot be considered sufficiently close. Therefore, it is not appropriate to apply the James-Stein estimator. Efron (2011) proposed a local empirical Bayes correction that attempted to correct a selection bias for large-scale data.

Subjects: Methodology , Statistics Theory

Publish: 2025-06-13 02:43:13 UTC
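
Illustration for #8 (not taken from the paper): a minimal sketch of the classical James-Stein estimator that the abstract takes as its starting point, assuming observations x_i ~ N(theta_i, sigma2) with known common variance and shrinkage toward zero. The function name and the optional positive-part truncation are illustrative choices; the local empirical Bayes correction discussed in the abstract is not reproduced.

```python
def james_stein(x, sigma2=1.0, positive_part=True):
    """Shrink the observation vector x toward zero.

    Uses the factor 1 - (p - 2) * sigma2 / ||x||^2, which requires p >= 3.
    With positive_part=True the factor is truncated at zero.
    """
    p = len(x)
    if p < 3:
        raise ValueError("James-Stein shrinkage needs at least 3 coordinates")
    norm2 = sum(v * v for v in x)
    if norm2 == 0.0:
        return [0.0] * p
    factor = 1.0 - (p - 2) * sigma2 / norm2
    if positive_part:
        factor = max(0.0, factor)
    return [factor * v for v in x]
```

Every coordinate is multiplied by the same factor, which is reasonable when the true means are close together; the abstract's point is that this is a poor fit when the means form a mixture of well-separated groups.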


#9 Node Splitting SVMs for Survival Trees Based on an L2-Regularized Dipole Splitting Criteria

Authors: Aye Aye Maung, Drew Lazar, Qi Zheng

This paper proposes a novel, node-splitting support vector machine (SVM) for creating survival trees. This approach is capable of non-linearly partitioning survival data that include continuous, right-censored outcomes. Our method improves on an existing non-parametric method, which uses at most oblique splits to induce survival regression trees. In the prior work, these oblique splits were created via a non-SVM approach, by minimizing a piece-wise linear objective, called a dipole splitting criterion, constructed from pairs of covariates and their associated survival information. We extend this method by enabling splits from a general class of non-linear surfaces. We achieve this by ridge regularizing the dipole-splitting criterion to enable application of kernel methods in a manner analogous to classical SVMs. The ridge regularization provides robustness and can be tuned. Using various kernels, we induce both linear and non-linear survival trees to compare their sizes and predictive powers on real and simulated data sets. We compare traditional univariate log-rank splits, oblique splits using the original dipole-splitting criterion, and a variety of non-linear splits enabled by our method. In these tests, trees created by non-linear splits using polynomial and Gaussian kernels show similar predictive power while often being smaller than trees created by univariate and oblique splits. This approach provides a novel and flexible array of survival trees that can be applied to diverse survival data sets.

Subjects: Methodology , Machine Learning

Publish: 2025-06-13 02:31:29 UTC


#10 Filtrated Grouping in Multiple Functional Regression

Authors: Shuhao Jiao, Hernando Ombao, Ian W. McKeague

In this article, we develop a novel covariate grouping framework in the context of multiple functional regression, in which a scalar response is associated with multiple functional covariates. We apply this approach to examine the relationship between chronological age and gait angular kinematics in a cohort of healthy individuals. This application is motivated by the need to understand and communicate the risk of chronic joint disease associated with aging by studying how age influences gait patterns. A key challenge stems from the significant interdependence among various joints, which provides important insights into how movement coordination evolves with aging. This challenge drives the primary objective of this work: to develop an efficient methodology to unravel both the association between chronological age and joint kinematics, and the coordination across different joints. To achieve this goal, we develop a forest-structured covariate grouping framework in which different functional covariates are aggregated hierarchically based on the level of coefficient homogeneity. This approach allows for the analysis of both common and idiosyncratic effects of covariates in a nuanced, multi-resolution manner. The identification of the forest structure is entirely data-driven and requires no prior knowledge, providing valuable insights into the interdependence among covariates. Compared to existing methods, the proposed regression framework demonstrates superior predictive power and offers more insightful interpretability. In addition, the proposed framework is broadly applicable and can be readily extended to analyze other types of multivariate functional data.

Subjects: Methodology , Computation

Publish: 2025-06-13 00:08:41 UTC


#11 Coefficient Shape Transfer Learning for Functional Linear Regression

Authors: Shuhao Jiao, Ian W. McKeague, N.-H. Chan

In this paper, we develop a novel transfer learning methodology to tackle the challenge of data scarcity in functional linear models. The methodology incorporates samples from the target model (target domain) alongside those from auxiliary models (source domains), transferring knowledge of coefficient shape from the source domains to the target domain. This shape-based knowledge transfer offers two key advantages. First, it is robust to covariate scaling, ensuring effectiveness despite variations in data distributions across different source domains. Second, the notion of coefficient shape homogeneity represents a meaningful advance beyond traditional coefficient homogeneity, allowing the method to exploit a wider range of source domains and achieve significantly improved model estimation. We rigorously analyze the convergence rates of the proposed estimator and examine the minimax optimality. Our findings show that the degree of improvement depends not only on the similarity of coefficient shapes between the target and source domains, but also on coefficient magnitudes and the spectral decay rates of the functional covariates' covariance operators. To address situations where only a subset of auxiliary models is informative for the target model, we further develop a data-driven procedure for identifying such informative sources. The effectiveness of the proposed methodology is demonstrated through comprehensive simulation studies and an application to occupation time analysis using physical activity data from the U.S. National Health and Nutrition Examination Survey.

Subjects: Methodology , Machine Learning

Publish: 2025-06-13 00:00:43 UTC


#12 Rating competitors in games with strength-dependent tie probabilities

Author: Mark E. Glickman

Competitor rating systems for head-to-head games are typically used to measure playing strength from game outcomes. Ratings computed from these systems are often used to select top competitors for elite events, for pairing players of similar strength in online gaming, and for players to track their own strength over time. Most implemented rating systems assume only win/loss outcomes, and treat occurrences of ties as the equivalent to half a win and half a loss. However, in games such as chess, the probability of a tie (draw) is demonstrably higher for stronger players than for weaker players, so that rating systems ignoring this aspect of game results may produce strength estimates that are unreliable. We develop a new rating system for head-to-head games based on a model by Glickman (2025) that explicitly acknowledges that a tie may depend on the strengths of the competitors. The approach uses a Bayesian dynamic modeling framework. Within each time period, posterior updates are computed in closed form using a single Newton-Raphson iteration evaluated at the prior mean. The approach is demonstrated on a large dataset of chess games played in International Correspondence Chess Federation tournaments.

Subject: Methodology

Publish: 2025-06-12 23:06:42 UTC
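
Illustration for #12 (not taken from the paper): the abstract notes that most implemented rating systems score a tie as half a win and half a loss. The sketch below shows that convention using the classical Elo update, which is swapped in here purely as a familiar example; it is not the Bayesian dynamic model the paper develops. The K-factor of 32 and the function names are assumptions.

```python
SCORES = {"win": 1.0, "tie": 0.5, "loss": 0.0}


def expected_score(rating_a: float, rating_b: float) -> float:
    """Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, outcome: str, k: float = 32.0):
    """Return updated (rating_a, rating_b); outcome is from A's point of view."""
    score_a = SCORES[outcome]  # a tie is scored as half a win
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta
```

Under this convention a tie carries the same information regardless of how strong the two players are, which is the assumption the paper relaxes by letting the tie probability depend on strength.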


#13 Bayesian Sensitivity Analysis for Causal Estimation with Time-varying Unmeasured Confounding

Authors: Yushu Zou, Liangyuan Hu, Amanda Ricciuto, Mark Deneau, Kuan Liu

Causal inference relies on the untestable assumption of no unmeasured confounding. Sensitivity analysis can be used to quantify the impact of unmeasured confounding on causal estimates. Among sensitivity analysis methods proposed in the literature for unmeasured confounding, the latent confounder approach is favoured for its intuitive interpretation via the use of bias parameters to specify the relationship between the observed and unobserved variables, whereas the sensitivity function approach directly characterizes the net causal effect of the unmeasured confounding without explicitly introducing latent variables to the causal models. In this paper, we developed and extended two sensitivity analysis approaches, namely the Bayesian sensitivity analysis with latent confounding variables and the Bayesian sensitivity function approach, for the estimation of time-varying treatment effects with longitudinal observational data subject to time-varying unmeasured confounding. We investigated the performance of these methods in a series of simulation studies and applied them to data from a multi-center pediatric disease registry to provide practical guidance on their implementation.

Subject: Methodology

Publish: 2025-06-12 21:41:27 UTC


#14 Variance estimation after matching or re-weighting

Authors: Xiang Meng, Aaron Smith, Luke Miratrix

This paper develops a variance estimation framework for matching estimators that enables valid population inference for treatment effects. We provide theoretical analysis of a variance estimator that addresses key limitations in the existing literature. While Abadie and Imbens (2006) proposed a foundational variance estimator requiring matching for both treatment and control groups, this approach is computationally prohibitive and rarely used in practice. Our method provides a computationally feasible alternative that only requires matching treated units to controls while maintaining theoretical validity for population inference. We make three main contributions. First, we establish consistency and asymptotic normality for our variance estimator, proving its validity for average treatment effect on the treated (ATT) estimation in settings with small treated samples. Second, we develop a generalized theoretical framework with novel regularity conditions that significantly expand the class of matching procedures for which valid inference is available, including radius matching, M-nearest neighbor matching, and propensity score matching. Third, we demonstrate that our approach extends naturally to other causal inference estimators such as stable balancing weighting methods. Through simulation studies across different data generating processes, we show that our estimator maintains proper coverage rates while the state-of-the-art bootstrap method can exhibit substantial undercoverage (dropping from 95% to as low as 61%), particularly in settings with extensive control unit reuse. Our framework provides researchers with both theoretical guarantees and practical tools for conducting valid population inference across a wide range of causal inference applications. An R package implementing our method is available at https://github.com/jche/scmatch2.

Subjects: Methodology , Statistics Theory

Publish: 2025-06-12 21:34:48 UTC


#15 Measuring multi-calibration

Authors: Ido Guy, Daniel Haimovich, Fridolin Linder, Nastaran Okati, Lorenzo Perini, Niek Tax, Mark Tygert

A suitable scalar metric can help measure multi-calibration, defined as follows. When the expected values of observed responses are equal to corresponding predicted probabilities, the probabilistic predictions are known as "perfectly calibrated." When the predicted probabilities are perfectly calibrated simultaneously across several subpopulations, the probabilistic predictions are known as "perfectly multi-calibrated." In practice, predicted probabilities are seldom perfectly multi-calibrated, so a statistic measuring the distance from perfect multi-calibration is informative. A recently proposed metric for calibration, based on the classical Kuiper statistic, is a natural basis for a new metric of multi-calibration and avoids well-known problems of metrics based on binning or kernel density estimation. The newly proposed metric weights the contributions of different subpopulations in proportion to their signal-to-noise ratios; ablations in the data analyses demonstrate that the metric becomes noisy when the signal-to-noise ratios are omitted from the metric. Numerical examples on benchmark data sets illustrate the new metric.

Subjects: Methodology , Artificial Intelligence , Machine Learning

Publish: 2025-06-12 19:48:10 UTC
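
Illustration for #15 (not taken from the paper): a minimal sketch of a Kuiper-type calibration statistic built from cumulative differences, which needs no binning. It assumes the form in which observations are sorted by predicted probability, the differences between outcome and prediction are accumulated (divided by n), and the statistic is the range of that cumulative sum including its starting value of zero. The paper's signal-to-noise weighting across subpopulations is not reproduced; the per-group helper simply reports one unweighted value per group, and both function names are hypothetical.

```python
def kuiper_calibration(probs, outcomes):
    """Range (max minus min) of the cumulative sum of (outcome - prob) / n."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    n = len(probs)
    order = sorted(range(n), key=lambda i: probs[i])  # stable sort by prediction
    cumulative = 0.0
    highest = lowest = 0.0  # the cumulative sum starts at zero
    for i in order:
        cumulative += (outcomes[i] - probs[i]) / n
        highest = max(highest, cumulative)
        lowest = min(lowest, cumulative)
    return highest - lowest


def per_group_kuiper(probs, outcomes, groups):
    """Unweighted statistic for each subpopulation.

    groups maps a group name to the list of row indices in that group.
    """
    return {
        name: kuiper_calibration(
            [probs[i] for i in idx], [outcomes[i] for i in idx]
        )
        for name, idx in groups.items()
    }
```

A value of zero means the cumulative differences never leave zero; larger values indicate a range of predicted probabilities over which outcomes are systematically above or below the predictions.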


#16 Regularized Estimation of the Loading Matrix in Factor Models for High-Dimensional Time Series

Authors: Xialu Liu, Xin Wang

High-dimensional data analysis using traditional models suffers from overparameterization. Two types of techniques are commonly used to reduce the number of parameters: regularization and dimension reduction. In this project, we combine them by imposing a sparse factor structure and propose a regularized estimator to further reduce the number of parameters in factor models. A challenge limiting the widespread application of factor models is that factors are hard to interpret, as both factors and the loading matrix are unobserved. To address this, we introduce a penalty term when estimating the loading matrix to obtain a sparse estimate. As a result, each factor only drives a smaller subset of time series that exhibit the strongest correlation, improving the factor interpretability. The theoretical properties of the proposed estimator are investigated. Simulation results are presented to confirm that our algorithm performs well. We apply our method to Hawaii tourism data. The results indicate that two groups drive the number of domestic tourists in Hawaii: visitors from high-latitude states (factor 1) and visitors from inland or low-latitude states (factor 2). This reveals two main reasons people visit Hawaii: (1) to escape the cold and (2) to enjoy the beach and water activities.

Subject: Methodology

Publish: 2025-06-12 19:13:27 UTC


#17 Advancing clustering methods in physics education research: A case for mixture models

Authors: Minghui Wang, Meagan Sundstrom, Karen Nylund-Gibson, Marsha Ing

Clustering methods are often used in physics education research (PER) to identify subgroups of individuals within a population who share similar response patterns or characteristics. K-means (or k-modes, for categorical data) is one of the most commonly used clustering methods in PER. This algorithm, however, is not model-based: it relies on algorithmic partitioning and assigns individuals to subgroups with definite membership. Researchers must also conduct post-hoc analyses to relate subgroup membership to other variables. Mixture models offer a model-based alternative that accounts for classification errors and allows researchers to directly integrate subgroup membership into a broader latent variable framework. In this paper, we outline the theoretical similarities and differences between k-modes clustering and latent class analysis (one type of mixture model for categorical data). We also present parallel analyses using each method to address the same research questions in order to demonstrate these similarities and differences. We provide the data and R code to replicate the worked example presented in the paper for researchers interested in using mixture models.

Subjects: Methodology , Physics Education

Publish: 2025-06-12 19:04:53 UTC


#18 Longitudinal Omics Data Analysis: A Review on Models, Algorithms, and Tools

Authors: Ali R. Taheriyoun, Allen Ross, Abolfazl Safikhani, Damoon Soudbakhsh, Ali Rahnavard

Longitudinal omics data (LOD) analysis is essential for understanding the dynamics of biological processes and disease progression over time. This review explores various statistical and computational approaches for analyzing such data, emphasizing their applications and limitations. The main characteristics of longitudinal data, such as imbalance, high dimensionality, and non-Gaussianity, are discussed for modeling and hypothesis testing. We discuss the properties of linear mixed models (LMM) and generalized linear mixed models (GLMM) as foundation stones in LOD analyses and highlight their extensions to handle the obstacles in the frequentist and Bayesian frameworks. Within dynamic data analysis, we differentiate between time-course and longitudinal analyses, covering functional data analysis (FDA) and replication constraints. We explore classification techniques, single-cell studies as exemplary longitudinal omics studies, survival modeling, and multivariate methods for clinical/biomarker-based applications. Emerging topics, including data integration, clustering, and network-based modeling, are also discussed. We categorize the state-of-the-art approaches applicable to omics data, highlighting how they address the data features. This review serves as a guideline for researchers seeking robust strategies to effectively analyze longitudinal omics data, which are usually complex.

Subjects: Methodology , Quantitative Methods

Publish: 2025-06-11 18:30:43 UTC


#19 Bayesian Optimization with Inexact Acquisition: Is Random Grid Search Sufficient?

Authors: Hwanwoo Kim, Chong Liu, Yuxin Chen

Bayesian optimization (BO) is a widely used iterative algorithm for optimizing black-box functions. Each iteration requires maximizing an acquisition function, such as the upper confidence bound (UCB) or a sample path from the Gaussian process (GP) posterior, as in Thompson sampling (TS). However, finding an exact solution to these maximization problems is often intractable and computationally expensive. Reflecting such realistic situations, in this paper, we delve into the effect of inexact maximizers of the acquisition functions. Defining a measure of inaccuracy in acquisition solutions, we establish cumulative regret bounds for both GP-UCB and GP-TS without requiring exact solutions of acquisition function maximization. Our results show that under appropriate conditions on accumulated inaccuracy, inexact BO algorithms can still achieve sublinear cumulative regret. Motivated by such findings, we provide both theoretical justification and numerical validation for random grid search as an effective and computationally efficient acquisition function solver.

Subjects: Machine Learning , Machine Learning , Methodology

Publish: 2025-06-13 14:35:39 UTC
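
Illustration for #19 (not taken from the paper): a minimal sketch of random grid search used as an inexact maximizer of an acquisition function. It assumes the common GP-UCB form mean(x) + sqrt(beta) * std(x). The posterior mean and standard deviation here are hypothetical toy functions standing in for a fitted Gaussian process, and all names and parameter values are assumptions made for illustration.

```python
import math
import random


def ucb(mean: float, std: float, beta: float) -> float:
    """Upper confidence bound: mean + sqrt(beta) * std."""
    return mean + math.sqrt(beta) * std


def random_grid_argmax(acquisition, bounds, n_candidates, rng):
    """Score n_candidates uniform random points in the box; return the best.

    bounds is a list of (low, high) pairs, one per input dimension.
    The result is generally an inexact maximizer of the acquisition.
    """
    best_x, best_val = None, -math.inf
    for _ in range(n_candidates):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        val = acquisition(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val


# Hypothetical stand-ins for a GP posterior on [0, 1] (not a fitted model).
def toy_mean(x):
    return -(x[0] - 0.3) ** 2


def toy_std(x):
    return 0.1


if __name__ == "__main__":
    rng = random.Random(0)
    x_best, value = random_grid_argmax(
        lambda x: ucb(toy_mean(x), toy_std(x), beta=4.0),
        bounds=[(0.0, 1.0)],
        n_candidates=2000,
        rng=rng,
    )
    print(x_best, value)
```

The gap between the returned value and the true maximum of the acquisition is the kind of per-iteration inaccuracy the abstract refers to; drawing more candidates shrinks it at a higher evaluation cost.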


#20 The Space Between Us: A Methodological Framework for Researching Bonding and Proxemics in Situated Group-Agent Interactions

Authors: Ana Müller, Anja Richert

This paper introduces a multimethod framework for studying spatial and social dynamics in real-world group-agent interactions with socially interactive agents. Drawing on proxemics and bonding theories, the method combines subjective self-reports and objective spatial tracking. Applied in two field studies in a museum (N = 187) with a robot and a virtual agent, the paper addresses the challenges in aligning human perception and behavior. We focus on presenting an open source, scalable, and field-tested toolkit for future studies.

Subjects: Robotics , Human-Computer Interaction , Methodology

Publish: 2025-06-13 14:32:23 UTC


#21 Inference of Hierarchical Core-Periphery Structure in Temporal Network

Authors: Theodore Y. Faust, Mason A. Porter

Networks can have various types of mesoscale structures. One type of mesoscale structure in networks is core-periphery structure, which consists of densely-connected core nodes and sparsely-connected peripheral nodes. The core nodes are connected densely to each other and can be connected to the peripheral nodes, which are connected sparsely to other nodes. There has been much research on core-periphery structure in time-independent networks, but few core-periphery detection methods have been developed for time-dependent (i.e., "temporal") networks. Using a multilayer-network representation of temporal networks and an inference approach that employs stochastic block models, we generalize a recent method for detecting hierarchical core-periphery structure [Polanco23] from time-independent networks to temporal networks. In contrast to "onion-like" nested core-periphery structures (where each node is assigned to a group according to how deeply it is nested in a network's core), hierarchical core-periphery structures encompass networks with nested structures, tree-like structures (where any two groups must either be disjoint or have one as a strict subset of the other), and general non-nested mesoscale structures (where the group assignments of nodes do not have to be nested in any way). To perform statistical inference and thereby identify core-periphery structure, we use a Markov-chain Monte Carlo (MCMC) approach. We illustrate our method for detecting hierarchical core-periphery structure in two real-world temporal networks, and we briefly discuss the structures that we identify in these networks.

Subjects: Social and Information Networks , Physics and Society

Publish: 2025-06-11 19:31:05 UTC