
Quantitative Biology

2025-06-24 | Total: 23

#1 An Analytical Neighborhood Enrichment Score for Spatial Omics

Authors: Axel Andersson, Hanna Nyström

The neighborhood enrichment test is used to quantify spatial enrichment and depletion between spatial points with categorical labels, which is a common data type in spatial omics. Traditionally, this test relies on a permutation-based Monte Carlo approach, which tends to be computationally expensive for large datasets. In this study, we present a modified version of the test that can be computed analytically. This analytical version showed a minimum Pearson correlation of 0.95 with the conventional Monte Carlo-based method across eight spatial omics datasets, but with substantial speed-ups. Additional experiments on a large Xenium dataset demonstrated the method's ability to efficiently analyze large-scale data, making it a valuable tool for analyzing spatial omics data.

Subject: Quantitative Methods

Publish: 2025-06-23 14:32:19 UTC
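
As a rough illustration of the conventional permutation-based baseline that entry #1 compares against (this is not the authors' analytical score), the sketch below counts label pairs along the edges of a fixed spatial graph and standardizes the observed counts against counts obtained after shuffling the labels. The function name, the edge-list input, and the toy graph are assumptions made for this example only.

```python
import numpy as np

def neighborhood_enrichment_zscore(edges, labels, n_labels, n_perm=200, seed=0):
    """Permutation z-scores for label-pair counts on a fixed undirected graph.

    edges  : (E, 2) integer array, each undirected edge listed once (assumed input format)
    labels : (N,) integer array of categorical labels in [0, n_labels)
    """
    rng = np.random.default_rng(seed)
    edges = np.asarray(edges)
    labels = np.asarray(labels)

    def count_pairs(lab):
        counts = np.zeros((n_labels, n_labels))
        # Accumulate one count per edge for the (label_i, label_j) pair.
        np.add.at(counts, (lab[edges[:, 0]], lab[edges[:, 1]]), 1)
        # Symmetrize; same-label edges therefore count twice on the diagonal.
        return counts + counts.T

    observed = count_pairs(labels)
    # Null distribution: shuffle labels over the same graph.
    null = np.stack([count_pairs(rng.permutation(labels)) for _ in range(n_perm)])
    mean, std = null.mean(axis=0), null.std(axis=0)
    # Leave NaN where the null has zero variance.
    return np.divide(observed - mean, std, out=np.full_like(mean, np.nan), where=std > 0)

# Toy graph: two label-pure cliques of five points joined by a single edge.
clique_a = [(i, j) for i in range(5) for j in range(i + 1, 5)]
clique_b = [(i, j) for i in range(5, 10) for j in range(i + 1, 10)]
edges = np.array(clique_a + clique_b + [(4, 5)])
labels = np.array([0] * 5 + [1] * 5)
z = neighborhood_enrichment_zscore(edges, labels, n_labels=2)
```

In this toy case the same-label entries of `z` come out positive (enrichment) and the cross-label entry negative (depletion). The cost grows with the number of permutations, which is the expense the abstract refers to.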


#2 Identifying the sources of noise synergy and redundancy in the feed-forward loop motif

Authors: Mintu Nandi, Sudip Chattopadhyay, Suman K Banik

The propagation of noise through parallel regulatory pathways is a prominent feature of feed-forward loops in genetic networks. Although the contributions of the direct and indirect regulatory pathways of feed-forward loops to output variability have been well characterized, the impact of their joint action arising from their shared input and output remains poorly understood. Here, we identify an additional component of noise that emerges specifically from this convergent nature of the pathways. Using inter-gene correlations, we reveal the regulatory basis of the additional noise and interpret it as synergy or redundancy in noise propagation, depending on whether the combined pathways amplify or suppress fluctuations. This framework not only accounts for previously observed differences in noise behavior across coherent and incoherent feed-forward loops but also provides a generalizable strategy to connect network structure with stochastic gene regulation.

Subjects: Molecular Networks , Biological Physics

Publish: 2025-06-23 13:27:34 UTC


#3 BrainSymphony: A Transformer-Driven Fusion of fMRI Time Series and Structural Connectivity

Authors: Moein Khajehnejad, Forough Habibollahi, Adeel Razi

Existing foundation models for neuroimaging are often prohibitively large and data-intensive. We introduce BrainSymphony, a lightweight, parameter-efficient foundation model that achieves state-of-the-art performance while being pre-trained on significantly smaller public datasets. BrainSymphony's strong multimodal architecture processes functional MRI data through parallel spatial and temporal transformer streams, which are then efficiently distilled into a unified representation by a Perceiver module. Concurrently, it models structural connectivity from diffusion MRI using a novel signed graph transformer to encode the brain's anatomical structure. These powerful, modality-specific representations are then integrated via an adaptive fusion gate. Despite its compact design, our model consistently outperforms larger models on a diverse range of downstream benchmarks, including classification, prediction, and unsupervised network identification tasks. Furthermore, our model revealed novel insights into brain dynamics using attention maps on a unique external psilocybin neuroimaging dataset (pre- and post-administration). BrainSymphony establishes that architecturally-aware, multimodal models can surpass their larger counterparts, paving the way for more accessible and powerful research in computational neuroscience.

Subjects: Quantitative Methods , Machine Learning , Neurons and Cognition

Publish: 2025-06-23 06:00:21 UTC


#4 Single-Cell Proteomic Technologies: Tools in the quest for principles

Author: Nikolai Slavov

Over the last decade, proteomic analysis of single cells by mass spectrometry transitioned from an uncertain possibility to a set of robust and rapidly advancing technologies supporting the accurate quantification of thousands of proteins. We review the major drivers of this progress, from establishing feasibility to powerful and increasingly scalable methods. We focus on the tradeoffs and synergies of different technological solutions within a coherent conceptual framework, which projects considerable room both for throughput scaling and for extending the analysis scope to functional protein measurements. We highlight the potential of these technologies to support the development of mechanistic biophysical models and help uncover new principles.

Subjects: Quantitative Methods , Biomolecules , Molecular Networks

Publish: 2025-06-22 23:12:49 UTC


#5 Perceptual multistability: a window for a multi-facet understanding of psychiatric disorders

Authors: Shervin Safavi, Danaé Rolland, Philipp Sterzer, Renaud Jardri, Pantelis Leptourgos

Perceptual multistability, observed across species and sensory modalities, offers valuable insights into numerous cognitive functions and dysfunctions. For instance, differences in temporal dynamics and information integration during percept formation often distinguish clinical from non-clinical populations. Computational psychiatry can elucidate these variations, through two primary approaches: (i) Bayesian modeling, which treats perception as an unconscious inference, and (ii) an active, information-seeking perspective (e.g., reinforcement learning) framing perceptual switches as internal actions. Our synthesis aims to leverage multistability to bridge these computational psychiatry subfields, linking human and animal studies as well as connecting behavior to underlying neural mechanisms. Perceptual multistability emerges as a promising non-invasive tool for clinical applications, facilitating translational research and enhancing our mechanistic understanding of cognitive processes and their impairments.

Subject: Neurons and Cognition

Publish: 2025-06-22 21:19:52 UTC


#6 Six Decades Post-Discovery of Taylor's Power Law: From Ecological and Statistical Universality, Through Prime Number Distributions and Tipping-Point Signals, to Heterogeneity and Stability of Complex Networks

Authors: Zhanshan Ma, R. A. J. Taylor

First discovered by L. R. Taylor (1961, Nature), Taylor's Power Law (TPL) correlates the mean (M) population abundances and the corresponding variances (V) across a set of insect populations using a power function (V = aM^b). TPL has demonstrated its 'universality' across numerous fields in the sciences, social sciences, and humanities. This universality has inspired two main prongs of exploration: one from mathematicians and statisticians, who might instinctively respond with a convergence theorem similar to the central limit theorem of the Gaussian distribution, and another from biologists, ecologists, physicists, etc., who are more interested in potential underlying ecological or organizational mechanisms. Over the past six decades, TPL studies have produced a punctuated landscape with three relatively distinct periods (1960s-1980s, 1990s-2000s, and 2010s-2020s) across the two prongs of abstract and physical worlds. Eight themes have been identified and reviewed on this landscape: population spatial aggregation and ecological mechanisms; TPL and skewed statistical distributions; mathematical/statistical mechanisms of TPL; sample vs. population TPL; population stability, synchrony, and early warning signals for tipping points; TPL on complex networks; TPL in macrobiomes; and TPL in microbiomes. Three future research directions are proposed: fostering reciprocal interactions between the two prongs, measuring heterogeneity, and exploring TPL in the context of evolution. The significance of TPL research is both practical and theoretical: practically, the population fluctuations captured by TPL are relevant for agriculture, forestry, fishery, wildlife conservation, epidemiology, tumor heterogeneity, earthquakes, social inequality, stock illiquidity, financial stability, tipping point events, etc.; theoretically, TPL is one form of power law, a class of relationships related to phase transitions, universality, scale-invariance, etc.

Subjects: Other Quantitative Biology , Computational Engineering, Finance, and Science , Information Theory , Social and Information Networks , Populations and Evolution

Publish: 2025-06-22 19:47:16 UTC
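
To make the power function in entry #6 concrete: taking logarithms of V = aM^b gives log V = log a + b log M, so a and b can be estimated with a straight-line fit on log-log axes. The sketch below uses ordinary least squares, which is one common and simple estimator rather than a method prescribed by the abstract; the function name and the synthetic data are assumptions for illustration.

```python
import numpy as np

def fit_taylor_power_law(means, variances):
    """Estimate (a, b) in V = a * M**b by least squares on log-log axes."""
    log_m = np.log(np.asarray(means, dtype=float))
    log_v = np.log(np.asarray(variances, dtype=float))
    # polyfit returns coefficients from highest degree down: [slope, intercept].
    b, log_a = np.polyfit(log_m, log_v, deg=1)
    return np.exp(log_a), b

# Synthetic, noise-free example with a = 2 and b = 1.5 (made-up values).
means = np.array([1.0, 2.0, 5.0, 10.0, 50.0])
variances = 2.0 * means ** 1.5
a_hat, b_hat = fit_taylor_power_law(means, variances)
```

With real abundance data the means and variances would be computed per population first, and both must be strictly positive for the logarithms to be defined.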


#7 The Relationship between Cognition and Computation: "Global-first" Cognition versus Local-first Computation

Author: Lin Chen

What fundamental research questions are essential for advancing toward brain-like AI or AGI (Artificial General Intelligence) capable of performing any intellectual task a human can? Should it be something like the Turing machine (1936), which answers the question "What is computation?" and lays the foundation for the entire field of computer science? Or should it be something like Shannon's mathematical theory of communication (1948), which answers the question "What is information?" and forms the basis for modern communication technology? We believe the key question today is the relationship between cognition and computation (RCC). For example, the widely discussed question "Will artificial intelligence replace the human mind?" is, in essence and in scientific terms, an issue concerning RCC. We have chosen to classify RCC into four categories: 1. The relationship between the primitives of cognition and the primitives of computation. 2. The relationship between the anatomical structure of neural representation of cognition and the computational architecture of artificial intelligence. 3. The relationship between emergents in cognition and emergents in computation. 4. The relationship between the mathematical foundations of cognition and computation.

Subject: Neurons and Cognition

Publish: 2025-06-22 09:56:58 UTC


#8 OmniESI: A unified framework for enzyme-substrate interaction prediction with progressive conditional deep learning

Authors: Zhiwei Nie, Hongyu Zhang, Hao Jiang, Yutian Liu, Xiansong Huang, Fan Xu, Jie Fu, Zhixiang Ren, Yonghong Tian, Wen-Bin Zhang, Jie Chen

Understanding and modeling enzyme-substrate interactions is crucial for catalytic mechanism research, enzyme engineering, and metabolic engineering. Although a large number of predictive methods have emerged, they do not incorporate prior knowledge of enzyme catalysis to rationally modulate general protein-molecule features that are misaligned with catalytic patterns. To address this issue, we introduce a two-stage progressive framework, OmniESI, for enzyme-substrate interaction prediction through conditional deep learning. By decomposing the modeling of enzyme-substrate interactions into a two-stage progressive process, OmniESI incorporates two conditional networks that respectively emphasize enzymatic reaction specificity and crucial catalysis-related interactions, facilitating a gradual feature modulation in the latent space from the general protein-molecule domain to the catalysis-aware domain. On top of this unified architecture, OmniESI can adapt to a variety of downstream tasks, including enzyme kinetic parameter prediction, enzyme-substrate pairing prediction, enzyme mutational effect prediction, and enzymatic active site annotation. Under a multi-perspective performance evaluation in in-distribution and out-of-distribution settings, OmniESI consistently outperformed state-of-the-art specialized methods across seven benchmarks. More importantly, the proposed conditional networks were shown to internalize the fundamental patterns of catalytic efficiency while significantly improving prediction performance, with only negligible parameter increases (0.16%), as demonstrated by ablation studies on key components. Overall, OmniESI represents a unified predictive approach for enzyme-substrate interactions, providing an effective tool for catalytic mechanism cracking and enzyme engineering with strong generalization and broad applicability.

Subjects: Biomolecules , Artificial Intelligence

Publish: 2025-06-22 09:40:40 UTC


#9 AbRank: A Benchmark Dataset and Metric-Learning Framework for Antibody-Antigen Affinity Ranking

Authors: Chunan Liu, Aurelien Pelissier, Yanjun Shao, Lilian Denzler, Andrew C. R. Martin, Brooks Paige, Mariia Rodriguez Martinez

Accurate prediction of antibody-antigen (Ab-Ag) binding affinity is essential for therapeutic design and vaccine development, yet the performance of current models is limited by noisy experimental labels, heterogeneous assay conditions, and poor generalization across the vast antibody and antigen sequence space. We introduce AbRank, a large-scale benchmark and evaluation framework that reframes affinity prediction as a pairwise ranking problem. AbRank aggregates over 380,000 binding assays from nine heterogeneous sources, spanning diverse antibodies, antigens, and experimental conditions, and introduces standardized data splits that systematically increase distribution shift, from local perturbations such as point mutations to broad generalization across novel antigens and antibodies. To ensure robust supervision, AbRank defines an m-confident ranking framework by filtering out comparisons with marginal affinity differences, focusing training on pairs with at least an m-fold difference in measured binding strength. As a baseline for the benchmark, we introduce WALLE-Affinity, a graph-based approach that integrates protein language model embeddings with structural information to predict pairwise binding preferences. Our benchmarks reveal significant limitations in current methods under realistic generalization settings and demonstrate that ranking-based training improves robustness and transferability. In summary, AbRank offers a robust foundation for machine learning models to generalize across the antibody-antigen space, with direct relevance for scalable, structure-aware antibody therapeutic design.

Subjects: Biomolecules , Machine Learning

Publish: 2025-06-21 23:34:46 UTC
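
The m-confident filtering described in entry #9 can be pictured with a small sketch: from a set of measured binding strengths, keep only those pairs whose values differ by at least a factor of m, and orient each kept pair as (stronger, weaker). The data layout, the convention that larger values mean stronger binding, and the function name are all assumptions for illustration; this is not the AbRank code.

```python
from itertools import combinations

def m_confident_pairs(strength, m=10.0):
    """Return (stronger, weaker) pairs whose strengths differ by at least m-fold.

    strength : dict mapping an identifier to a positive binding strength,
               where larger means stronger binding (assumed convention).
    """
    pairs = []
    for a, b in combinations(strength, 2):
        hi, lo = (a, b) if strength[a] >= strength[b] else (b, a)
        # Drop comparisons with only a marginal difference.
        if strength[hi] >= m * strength[lo]:
            pairs.append((hi, lo))
    return pairs

# Hypothetical measurements for three antibodies against one antigen.
strength = {"ab1": 1000.0, "ab2": 150.0, "ab3": 90.0}
pairs = m_confident_pairs(strength, m=10.0)
```

Here only the ab1/ab3 comparison survives, since the other two pairs differ by less than tenfold. Raising m trades training-set size for label reliability.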


#10 Rethinking Ecological Measures Of Functional Diversity

Authors: Ines Meraoumia, Adji Bousso Dieng

Understanding functional diversity, the range and variability of species' roles and actions within their communities, is key to predicting and preserving the functions that sustain both nature and human well-being. In this paper, we provide a comprehensive review of the literature on functional diversity measurement. We begin by consolidating essential criteria that effective measures of functional diversity should meet. We then evaluate fifteen widely used functional diversity metrics against these criteria and assess their performance across six synthetic ecosystem scenarios where optimal behavior is known. Surprisingly, our analysis reveals that none of the widely used metrics fully satisfy all the established requirements, and all fail in at least one ecosystem scenario. In particular, we find that almost all metrics flagrantly violate set monotonicity and distance monotonicity, requirements that adding a novel species should increase diversity, and that the magnitude of that increase should grow with trait dissimilarity. We also find that metrics fail to decline when rare, functionally extreme species are lost, and even increase when a perfectly redundant species is added. These critical flaws leave them blind to the very biodiversity loss that functional diversity measures are intended to detect. Our findings underscore the urgent need to develop a new generation of functional diversity metrics that more accurately reflect ecological realities.

Subject: Populations and Evolution

Publish: 2025-06-21 22:29:55 UTC


#11 Improving Genomic Models via Task-Specific Self-Pretraining

Authors: Sohan Mupparapu, Parameswari Krishnamurthy, Ratish Puduppully

Pretraining DNA language models (DNALMs) on the full human genome is resource-intensive, yet often considered necessary for strong downstream performance. Inspired by recent findings in NLP and long-context modeling, we explore an alternative: self-pretraining on task-specific, unlabeled data. Using the BEND benchmark, we show that DNALMs trained with self-pretraining match or exceed the performance of models trained from scratch under identical compute. While genome-scale pretraining may still offer higher absolute performance, task-specific self-pretraining provides a practical and compute-efficient strategy for building stronger supervised baselines.

Subject: Genomics

Publish: 2025-06-21 17:19:21 UTC


#12 Perceptual Rationality: An Evolutionary Game Theory of Perceptually Rational Decision-Making

Author: Mohammad Salahshour

Understanding how biological organisms make decisions is of fundamental importance in understanding behavior. Such an understanding within evolutionary game theory so far has been sought by appealing to bounded rationality. Here, we present a perceptual rationality framework in the context of group cooperative interactions, where individuals make rational decisions based on their evolvable perception of the environment. We show that a simple public goods game accounts for power law distributed perceptual diversity. Incorporating the evolution of social information use into the framework reveals that rational decision-making is a natural root of the evolution of consistent personality differences and power-law distributed behavioral diversity. The behavioral diversity, core to the perceptual rationality approach, can lead to ever-shifting polymorphism or cyclic dynamics, through which different rational personality types coexist and engage in mutualistic, complementary, or competitive and exploitative relationships. This polymorphism can lead to non-monotonic evolution as external environmental conditions change. The framework provides predictions consistent with some large-scale eco-evolutionary patterns and illustrates how the evolution of social structure can modify large-scale eco-evolutionary patterns. Furthermore, consistent with most empirical evidence and in contrast to most theoretical predictions, our work suggests diversity is often detrimental to public good provision, especially in strong social dilemmas.

Subjects: Populations and Evolution , Computer Science and Game Theory , Physics and Society

Publish: 2025-06-21 14:47:46 UTC


#13 Modeling and Inferring Metacommunity Dynamics with Maximum Caliber

Authors: Zachary Jackson, Mathew A. Leibold, Robert D. Holt, BingKan Xue

A major challenge for community ecology is to use distribution patterns to infer basic parameters of dynamical models without conducting laborious experimental manipulations. We present a novel framework drawn from statistical physics -- Maximum Caliber -- for characterizing the temporal dynamics of complex ecological systems in spatially extended landscapes and inferring parameters from temporal data. As an extension of Maximum Entropy modeling, Maximum Caliber models the probability of possible trajectories of a stochastic system, rather than focusing on system states. We demonstrate the ability of the Maximum Caliber framework to capture ecological processes ranging from near-equilibrium to far-from-equilibrium, using an array of species interaction motifs including random interactions, apparent competition, intraguild competition, and non-transitive competition, along with dispersal among multiple patches. For spatio-temporal data of species occurrence in a metacommunity, the parameters of a Maximum Caliber model can be estimated through a simple logistic regression to reveal migration rates between patches, magnitudes of interactions between species, and effects of intrinsic local environmental suitabilities. We test the accuracy of the method over a range of system sizes and time periods, and find that these parameters can be estimated without bias. We introduce entropy production as a system-level measure of disequilibrium, and use "pseudo-$R^2$" to characterize the predictability of the system. We show that our model can predict the dynamics of metacommunities much better than steady state models, when the system is far from equilibrium. The capacity to estimate basic parameters of dynamical metacommunity models from spatio-temporal data represents an important breakthrough for the study of metacommunities with application to practical problems in conservation and restoration ecology.

Subject: Populations and Evolution

Publish: 2025-06-20 22:06:26 UTC


#14 Inferring Exocytosis Profiles from Cell Shapes Using a Dual-Configuration Model of Walled Cell Tip Growth

Authors: Kamryn Spinelli, Chaozhen Wei, Luis Vidali, Min Wu

Tip growth in filamentous cells, such as root hairs, moss protonemata, and fungal hyphae, depends on coordinated cell wall extension driven by turgor pressure, wall mechanics, and exocytosis. We introduce a dual-configuration model that incorporates both turgid and unturgid states to describe cell wall growth as the combined effect of elastic deformation and irreversible extension. This framework infers exocytosis profiles directly from cell morphology and elastic stretches, formulated as an initial value problem based on the self-similarity condition. Applying the model to Medicago truncatula root hairs, moss Physcomitrium patens protonemata, and hyphoid-like shapes, we find that exocytosis peaks at the tip in tapered cells but shifts to an annular region away from the apex in flatter-tip cells beyond a threshold. The model generalizes previous fluid models and provides a mechanistic link between exocytosis distribution and cell shape, explaining observed variations in tip-growing cells across species.

Subject: Cell Behavior

Publish: 2025-06-20 20:27:47 UTC


#15 Sequence-to-Sequence Models with Attention Mechanistically Map to the Architecture of Human Memory Search

Authors: Nikolaus Salvatore, Qiong Zhang

Past work has long recognized the important role of context in guiding how humans search their memory. While context-based memory models can explain many memory phenomena, it remains unclear why humans develop such architectures over possible alternatives in the first place. In this work, we demonstrate that foundational architectures in neural machine translation -- specifically, recurrent neural network (RNN)-based sequence-to-sequence models with attention -- exhibit mechanisms that directly correspond to those specified in the Context Maintenance and Retrieval (CMR) model of human memory. Since neural machine translation models have evolved to optimize task performance, their convergence with human memory models provides a deeper understanding of the functional role of context in human memory, as well as presenting new ways to model human memory. Leveraging this convergence, we implement a neural machine translation model as a cognitive model of human memory search that is both interpretable and capable of capturing complex dynamics of learning. We show that our model accounts for both averaged and optimal human behavioral patterns as effectively as context-based memory models. Further, we demonstrate additional strengths of the proposed model by evaluating how memory search performance emerges from the interaction of different model components.

Subjects: Neurons and Cognition , Machine Learning

Publish: 2025-06-20 18:43:15 UTC


#16 Challenges in Grounding Language in the Real World

Authors: Peter Lindes, Kaoutar Skiker

A long-term goal of Artificial Intelligence is to build a language understanding system that allows a human to collaborate with a physical robot using language that is natural to the human. In this paper we highlight some of the challenges in doing this, and propose a solution that integrates the abilities of a cognitive agent capable of interactive task learning in a physical robot with the linguistic abilities of a large language model. We also point the way to an initial implementation of this approach.

Subjects: Neurons and Cognition , Artificial Intelligence

Publish: 2025-06-20 17:17:53 UTC


#17 Stumbling around uncharted regulatory structures: NAcrins, or the perspective of specialized sources of modulatory non-coding RNAs

Author: Marouane Benzaki

The revelation of the supreme authority of nucleic acids in the cellular landscape has precipitated the recognition of the versatility of RNAs in cells. The subsequent discovery of non-coding RNAs was a major breakthrough that revealed their extensive involvement in virtually all physiological processes within the cell. Beyond the barriers of the cell, the current perception seems to support the idea of their participation in intercellular regulation and cross-kingdom communication. However, the presence of non-coding RNAs in the extracellular environment remains essentially a mystery, and the understanding of the significance and the processes governing this presence faces several constraints. This has led us to forge an original and predictive idea that seems to allow an emancipation from the various constraints posed in the current perception of the cited phenomena. In this paper, we will attempt to explore the extent of the probable existence of cellular organizations specializing in the production and management of non-coding RNAs. We will try, through the development of this hypothesis, to draw a picture explaining the significance and logistics of extracellular non-coding RNAs, with an emphasis on microRNAs. This exercise will be realized while relying on and confronting purely theoretical points of view, as well as relevant experimental results. In this manuscript, we will address the presumed morphology, intracellular organization, selective export, transport, transfer, distribution, reception and intracellular function of non-coding RNAs, in the perspective of a regulation cycle orchestrated by NAcrins under normal or disturbed physiological contexts.

Subject: Tissues and Organs

Publish: 2025-06-18 18:12:49 UTC


#18 PaceLLM: Brain-Inspired Large Language Models for Long-Context Understanding

Authors: Kangcong Li, Peng Ye, Chongjun Tu, Lin Zhang, Chunfeng Song, Jiamin Wu, Tao Yang, Qihao Zheng, Tao Chen

While Large Language Models (LLMs) demonstrate strong performance across domains, their long-context capabilities are limited by transient neural activations causing information decay and unstructured feed-forward network (FFN) weights leading to semantic fragmentation. Inspired by the brain's working memory and cortical modularity, we propose PaceLLM, featuring two innovations: (1) a Persistent Activity (PA) Mechanism that mimics prefrontal cortex (PFC) neurons' persistent firing by introducing an activation-level memory bank to dynamically retrieve, reuse, and update critical FFN states, addressing contextual decay; and (2) Cortical Expert (CE) Clustering that emulates task-adaptive neural specialization to reorganize FFN weights into semantic modules, establishing cross-token dependencies and mitigating fragmentation. Extensive evaluations show that PaceLLM achieves a 6% improvement on LongBench's Multi-document QA and 12.5-17.5% performance gains on Infinite-Bench tasks, while extending measurable context length to 200K tokens in Needle-In-A-Haystack (NIAH) tests. This work pioneers brain-inspired LLM optimization and is complementary to other work. It can also be generalized to any model, enhancing long-context performance and interpretability without structural overhauls.

Subjects: Neurons and Cognition , Computation and Language , Neural and Evolutionary Computing

Publish: 2025-06-18 09:17:06 UTC


#19 The Sex-Dependent Effects of Psychedelics on Myelination in APOE4 Mice

Author: Sanjana Shankar

Several studies have linked myelin abnormalities with neuropsychiatric disorders; others have implicated psychedelics as a potential therapeutic for such conditions. One risk factor for these demyelinating disorders is a mutation in the Apolipoprotein E gene known as APOE4. This variant impedes the cholesterol regulation of oligodendrocytes responsible for the myelination, or insulation, of neurons when compared to the wild-type phenotype. In this work, I advance knowledge of cellular pathways involved in the progression of APOE4-related diseases and elucidate the effects of psychedelics on the brain. Myelin sheaths are vital for maintaining neural pathways, and healthy oligodendrocytes serve as a prerequisite for axonal integrity. Further, the Kaufer Lab has observed significant behavioral differences between male and female APOE4 mice following psychedelic treatment with 2,5-Dimethoxy-4-iodoamphetamine, or DOI, a serotonin receptor ligand. The sex-dependent mechanisms influencing symptom differences and treatment outcomes in Alzheimer's disease (AD) are unclear, and could be key to developing successful therapeutics for myelin-related issues. I hypothesize that administration of DOI will increase the myelination activity of oligodendrocytes in female APOE4 mice compared with their male counterparts or controls. Preliminary results show a significant increase in MBP in the CA1, or short-term, and CA2, or long-term, areas in only female APOE4 mice post-introduction of DOI to the system. This aligns with behavioral data indicating fewer anxiety-related behaviors in female APOE4 mice after DOI administration. These findings reveal distinct biological mechanisms in male and female brain degeneration and suggest potential for sex-specific therapeutics.

Subjects: Neurons and Cognition , Tissues and Organs

Publish: 2025-06-16 23:29:02 UTC


#20 Perfect phylogenies via the Minimum Uncovering Branching problem: efficiently solvable cases

Authors: Narmina Baghirova, Esther Galby, Martin Milanič

In this paper, we present new efficiently solvable cases of the Minimum Uncovering Branching problem, an optimization problem with applications in cancer genomics introduced by Hujdurović, Husić, Milanič, Rizzi, and Tomescu in 2018. The problem involves a family of finite sets, and the goal is to map each non-maximal set to exactly one set that contains it, minimizing the sum of uncovered elements across all sets in the family. Hujdurović et al. formulated the problem in terms of branchings of the digraph formed by the proper set inclusion relation on the input sets and studied the problem complexity based on properties of the corresponding partially ordered set, in particular, with respect to its height and width, defined respectively as the maximum cardinality of a chain and an antichain. They showed that the problem is APX-complete for instances of bounded height and that a constant-factor approximation algorithm exists for instances of bounded width, but left the exact complexity for bounded-width instances open. In this paper, we answer this question by proving that the problem is solvable in polynomial time. We derive this result by examining the structural properties of optimal solutions and reducing the problem to computing maximum matchings in bipartite graphs and maximum weight antichains in partially ordered sets. We also introduce a new polynomially computable lower bound and identify another condition for polynomial-time solvability.

Subjects: Discrete Mathematics , Data Structures and Algorithms , Combinatorics , Populations and Evolution

Publish: 2025-06-23 12:29:44 UTC


#21 Projected Normal Distribution: Moment Approximations and Generalizations

Authors: Daniel Herrera-Esposito, Johannes Burge

The projected normal distribution, also known as the angular Gaussian distribution, is obtained by dividing a multivariate normal random variable $\mathbf{x}$ by its norm $\sqrt{\mathbf{x}^T \mathbf{x}}$. The resulting random variable follows a distribution on the unit sphere. No closed-form formulas for the moments of the projected normal distribution are known, which can limit its use in some applications. In this work, we derive analytic approximations to the first and second moments of the projected normal distribution using Taylor expansions and results from the theory of quadratic forms of Gaussian random variables. Then, motivated by applications in systems neuroscience, we present generalizations of the projected normal distribution that divide the variable $\mathbf{x}$ by a denominator of the form $\sqrt{\mathbf{x}^T \mathbf{B} \mathbf{x} + c}$, where $\mathbf{B}$ is a symmetric positive definite matrix and $c$ is a non-negative number. We derive moment approximations as well as the density function for these other projected distributions. We show that the moment approximations are accurate for a wide range of dimensionalities and distribution parameters. Furthermore, we show that the moment approximations can be used to fit these distributions to data through moment matching. These moment matching methods should be useful for analyzing data across a range of applications where the projected normal distribution is used, and for applying the projected normal distribution and its generalizations to model data in neuroscience.
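As a point of reference for the construction described above, the following Python sketch draws samples and estimates the first and second moments by Monte Carlo. It does not implement the paper's analytic approximations. For simplicity it assumes a diagonal covariance and a diagonal $\mathbf{B}$; the function names and parameters (`sample_projected`, `empirical_moments`, `b_diag`) are hypothetical.

```python
import math
import random


def sample_projected(mu, sigma, b_diag=None, c=0.0, rng=random):
    """Draw x ~ N(mu, diag(sigma^2)) and return x / sqrt(x^T B x + c).

    Assumption: B is diagonal with entries b_diag (identity if omitted).
    With B = I and c = 0 this is the plain projected normal distribution.
    """
    x = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
    b = b_diag if b_diag is not None else [1.0] * len(x)
    denom = math.sqrt(sum(bi * xi * xi for bi, xi in zip(b, x)) + c)
    return [xi / denom for xi in x]


def empirical_moments(samples):
    """Return the sample mean vector and the (non-centered) second-moment matrix."""
    n, d = len(samples), len(samples[0])
    mean = [sum(y[i] for y in samples) / n for i in range(d)]
    second = [[sum(y[i] * y[j] for y in samples) / n for j in range(d)]
              for i in range(d)]
    return mean, second


rng = random.Random(0)
samples = [sample_projected([2.0, 0.0, 0.0], [1.0, 1.0, 1.0], rng=rng)
           for _ in range(5000)]
mean, second = empirical_moments(samples)
print(mean)
```

Because every sample has unit norm when $\mathbf{B}$ is the identity and $c = 0$, the trace of the second-moment matrix equals 1 up to rounding, which is a quick sanity check on any moment estimate.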

Subjects: Methodology , Neurons and Cognition

Publish: 2025-06-20 20:04:58 UTC


#22 A practical identifiability criterion leveraging weak-form parameter estimation

Authors: Nora Heitzman-Breen, Vanja Dukic, David M. Bortz

In this work, we define a practical identifiability criterion, (e, q)-identifiability, based on a parameter e, reflecting the noise in observed variables, and a parameter q, reflecting the mean-square error of the parameter estimator. This criterion is better able to encompass changes in the quality of the parameter estimate due to increased noise in the data (compared to existing criteria based solely on average relative errors). Furthermore, we leverage a weak-form equation error-based method of parameter estimation for systems with unobserved variables to assess practical identifiability far more quickly than with output error-based parameter estimation. We do so by generating weak-form input-output equations using differential algebra techniques, as previously proposed by Boulier et al. [1], and then applying Weak-form Estimation of Nonlinear Dynamics (WENDy) to obtain parameter estimates. This method is computationally efficient and robust to noise, as demonstrated through two classical biological modelling examples.
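The weak-form idea can be illustrated on a toy problem. The sketch below is not WENDy and does not use differential algebra; it assumes a fully observed scalar model $u' = -a\,u$ and a single hand-picked test function $\phi$ that vanishes at both ends of the time window. Multiplying the equation by $\phi$ and integrating by parts gives $\int \phi'\,u\,dt = a \int \phi\,u\,dt$, so $a$ can be estimated without differentiating the noisy data. All names (`weak_form_decay_rate`, `trapezoid`) are hypothetical.

```python
import math
import random


def trapezoid(values, h):
    """Composite trapezoid rule on a uniform grid with spacing h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def weak_form_decay_rate(t, u):
    """Estimate a in u' = -a u from samples u(t) on a uniform grid.

    Uses the test function phi(s) = ((s - t0) * (t1 - s))**2, which is zero at
    both endpoints, so integration by parts leaves no boundary terms:
        a = integral(phi' * u) / integral(phi * u)
    """
    t0, t1 = t[0], t[-1]
    h = (t1 - t0) / (len(t) - 1)
    phi = [((s - t0) * (t1 - s)) ** 2 for s in t]
    dphi = [2.0 * (s - t0) * (t1 - s) * (t0 + t1 - 2.0 * s) for s in t]
    num = trapezoid([dp * ui for dp, ui in zip(dphi, u)], h)
    den = trapezoid([p * ui for p, ui in zip(phi, u)], h)
    return num / den


true_a = 1.5
t = [i * 0.001 for i in range(2001)]
rng = random.Random(0)
u_noisy = [math.exp(-true_a * s) + rng.gauss(0.0, 0.01) for s in t]
print(weak_form_decay_rate(t, u_noisy))
```

The estimate only involves integrals of the data against smooth functions, which is why weak-form, equation error-based estimators tend to tolerate measurement noise better than approaches that differentiate the data directly.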

Subjects: Methodology , Quantitative Methods

Publish: 2025-06-20 16:11:47 UTC


#23 AutomataGPT: Forecasting and Ruleset Inference for Two-Dimensional Cellular Automata

Authors: Jaime A. Berkovich, Noah S. David, Markus J. Buehler

Cellular automata (CA) provide a minimal formalism for investigating how simple local interactions generate rich spatiotemporal behavior in domains as diverse as traffic flow, ecology, tissue morphogenesis and crystal growth. However, automatically discovering the local update rules for a given phenomenon and using them for quantitative prediction remains challenging. Here we present AutomataGPT, a decoder-only transformer pretrained on around 1 million simulated trajectories that span 100 distinct two-dimensional binary deterministic CA rules on toroidal grids. When evaluated on previously unseen rules drawn from the same CA family, AutomataGPT attains 98.5% perfect one-step forecasts and reconstructs the governing update rule with up to 96% functional (application) accuracy and 82% exact rule-matrix match. These results demonstrate that large-scale pretraining over wider regions of rule space yields substantial generalization in both the forward (state forecasting) and inverse (rule inference) problems, without hand-crafted priors. By showing that transformer models can faithfully infer and execute CA dynamics from data alone, our work lays the groundwork for abstracting real-world dynamical phenomena into data-efficient CA surrogates, opening avenues in biology, tissue engineering, physics and AI-driven scientific discovery.
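For readers unfamiliar with the setting, the following Python sketch shows what one deterministic update of a two-dimensional binary CA on a toroidal grid looks like. It is an illustration of the forward problem only, not the model or the rule encoding used in the paper. It assumes a "life-like" rule defined by birth and survival neighbor counts over the eight surrounding cells, with Conway's Game of Life as the default; the function name `step` and its parameters are hypothetical.

```python
def step(grid, birth=frozenset({3}), survive=frozenset({2, 3})):
    """Apply one synchronous update of a life-like binary CA on a torus.

    grid is a list of equal-length rows containing 0 or 1. Indices wrap
    around at the edges, so every cell has exactly eight neighbors.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = []
    for r in range(rows):
        new_row = []
        for c in range(cols):
            neighbors = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            if grid[r][c] == 1:
                new_row.append(1 if neighbors in survive else 0)
            else:
                new_row.append(1 if neighbors in birth else 0)
        new_grid.append(new_row)
    return new_grid


blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
for row in step(blinker):
    print(row)
```

One-step forecasting, as evaluated in the abstract, corresponds to predicting the output of such a function from the input grid; rule inference corresponds to recovering the rule parameters (here `birth` and `survive`) from observed grid pairs.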

Subjects: Machine Learning , Disordered Systems and Neural Networks , Materials Science , Quantitative Methods

Publish: 2025-06-19 05:54:08 UTC