Quantitative Biology

2026-05-21 | Total: 14

#1 A simple model of co-emergence of grid and place fields

Authors: Zhaoze Wang, Genela Morris, Dori Derdikman, Pratik Chaudhari, Vijay Balasubramanian

Grid cells in the medial entorhinal cortex and place cells in the hippocampus together support spatial navigation. The two regions are reciprocally connected, and there is a chicken-and-egg problem for how both arise and reinforce each other during development. Current computational accounts either derive one type from the other or use network dynamics to model the emergence of one type in isolation. We introduce a unified recurrent network model that instantiates Dale's Law (every neuron is either excitatory or inhibitory), and is trained to predict the next sensory observation from masked previous sensory observations and egocentric motion. To our knowledge, this is the first single-objective model in which grid and place cells co-emerge without supervision of either type, or reliance on pre-existing spatial-cell representations. The two kinds of spatial codes coexist across 1,000 different training configurations, with their balance set by the amount of sensory noise and masking. Without retraining, the network qualitatively reproduces experimentally observed grid fragmentation in hairpin mazes, grid merging after wall removal, lattice alignment across connected rooms, locally ordered 3D fields observed in freely flying bats, as well as the developmental order in which place cells precede grid cells. We interpret these results in terms of two complementary encoding pressures within a single sensory-prediction objective: (1) correcting errors or reconstructing missing components of sensory observations, and (2) prediction of the next sensory state during navigation. Our results suggest a circuit-level account of the co-emergence of grid and place cells, and experimentally testable predictions for the two kinds of spatial codes.

Subject: Neurons and Cognition

Publish: 2026-05-20 16:19:56 UTC


#2 Stimulus symmetries can confound representational similarity analyses

Authors: Farhad Pashakhanloo, Jacob A. Zavatone-Veth

What can representational similarity matrices (RSMs) tell us about a neural code? As the popularity of these summary statistics grows, so too does the need for a more complete characterization of their properties. Here, we show that symmetries in network inputs can confound RSM-based analyses. Stimulus symmetries render many representations functionally equivalent, but these different configurations can lead to different RSMs. These different RSMs reflect qualitatively different representational geometries. We show that stochastic gradient descent or energetic regularization can generate sparse, drifting codes, leading in turn to drifting RSMs. Moreover, we demonstrate that these phenomena are present in networks trained to encode image data, where the symmetry is latent. Our results illustrate the challenges inherent in comparing nonlinear neural codes when functionally equivalent representations are not related by a simple rotation.
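A minimal sketch of the rotation point in the abstract above, not the authors' analysis: an inner-product RSM is unchanged when the representation is rotated, but it changes under an invertible nonlinear re-encoding from which the original codes can still be recovered. The toy 2D codes and the elementwise-cube map are hypothetical choices made purely for illustration.

```python
def rsm(reps):
    """Inner-product RSM: entry [i][j] is the dot product of codes i and j."""
    return [[sum(a * b for a, b in zip(u, v)) for v in reps] for u in reps]


# Hypothetical 2D codes for three stimuli (illustrative values only).
codes = [(1, 0), (1, 1), (0, 2)]

# A 90-degree rotation of the code space: (x, y) -> (-y, x).
rotated = [(-y, x) for x, y in codes]

# An invertible but non-rotational re-encoding: elementwise cube.
cubed = [(x ** 3, y ** 3) for x, y in codes]

rsm_original = rsm(codes)
rsm_rotated = rsm(rotated)
rsm_cubed = rsm(cubed)

print(rsm_original == rsm_rotated)  # rotation preserves all dot products
print(rsm_original == rsm_cubed)    # the nonlinear re-encoding does not
```

Here the cubed code is a stand-in for "same information, different geometry"; the symmetry-induced equivalences studied in the paper are a different and richer construction.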

Subjects: Neurons and Cognition, Machine Learning

Publish: 2026-05-20 15:51:21 UTC


#3 Multi-Modal Machine Learning for Population- and Subject-Specific lncRNA-Type 2 Diabetes Association Analysis

Authors: Ashwani Siwach, Sanjeev Narayan Sharma, Sunil Datt Sharma

Long non-coding RNAs (lncRNAs) are emerging regulatory molecules implicated in chronic disease pathogenesis, including Type 2 Diabetes Mellitus (T2D). We investigated ten literature-reported lncRNAs associated with T2D (MALAT1, MEG3, MIAT, ANRIL, GAS5, KCNQ1OT1, H19, BCYRN1, XIST, and HOTAIR) across two independent population-based RNA-seq cohorts. Single-omics approaches provide an incomplete view of disease biology; therefore, an integrative multi-feature framework was developed, extracting expression, secondary-structure, and sequence features for each lncRNA. Eight machine learning (ML) classifiers were evaluated under stratified k-fold, leave-one-out cross-validation (LOOCV), and repeated hold-out schemes to ensure robust performance estimation. SHAP analysis was applied for subject-level association interpretation. In one cohort, GAS5 and XIST expression features, along with GAS5, MEG3, and ANRIL sequence features, were found to be associated with T2D, while MALAT1 expression and KCNQ1OT1, ANRIL, and MEG3 sequence features were found to be associated in the second cohort. MEG3 was identified by SHAP as the dominant lncRNA in both cohorts. ML results were consistent with established statistical methods while additionally providing population- and subject-level disease association profiles linked to specific molecular feature types. The proposed framework advances mechanistic understanding of T2D and supports lncRNA-based precision medicine.

Subject: Genomics

Publish: 2026-05-20 05:49:42 UTC


#4 Platonic Representations in the Human Brain: Unsupervised Recovery of Universal Geometry

Authors: Pablo Marcos-Manchón, Rishi Jha, Lluís Fuentemilla

The Strong Platonic Representation Hypothesis suggests that representational convergence in artificial neural networks can be harnessed constructively: embeddings can be translated across models through a universal latent space without paired data. We ask whether an analogous geometry can be recovered across human brains. Using fMRI data from the Natural Scenes Dataset, we propose a self-supervised encoder that learns subject-specific embeddings from brain data alone by exploiting repeated stimulus presentations. We show that these independently learned spaces can be translated across subjects using unsupervised orthogonal rotations, without paired cross-subject samples or intermediate model representations. Synchronizing pairwise rotations into a single shared latent space further improves cross-subject retrieval, indicating that subject-specific spaces are mutually compatible with a common coordinate system. These results provide evidence for a shared neural geometry in the human visual cortex: subject-specific fMRI representations are approximately isometric across individuals and can be translated through purely geometric transformations.

Subjects: Neurons and Cognition, Computer Vision and Pattern Recognition

Publish: 2026-05-19 21:04:15 UTC


#5 ProtoPathway: Biologically Structured Prototype-Pathway Fusion for Multimodal Cancer Survival Prediction

Authors: Amaya Gallagher-Syed, Costantino Pitzalis, Myles J. Lewis, Michael R. Barnes, Gregory Slabaugh

We introduce ProtoPathway, an interpretable-by-design multimodal framework for cancer survival prediction that unifies whole slide imaging and transcriptomics through encoders producing biologically grounded representations on both sides of the fusion. On the histopathology side, $K$ learnable morphological prototypes, trained end-to-end with the survival objective, serve as the slide representation itself: patches flow into prototype tokens via soft assignment, compressing variable-length patch sets into fixed task-adaptive tokens. On the genomic side, a bipartite graph neural network encodes gene expression within the Reactome pathway hierarchy, producing pathway embeddings that reflect both constituent genes and their broader biological context through bidirectional message passing over a shared gene-pathway graph. Cross-modal attention then operates over a compact prototype $\times$ pathway matrix in which prototypes query pathways, modeling the biological direction in which molecular programs give rise to tissue morphology. Because both axes carry stable task-learned identity, the attention matrix is itself an interpretability output, yielding native inference-time attribution across the full biological hierarchy, from genes through pathways and prototypes to spatial tissue maps. We evaluate on five TCGA cancer cohorts, demonstrating competitive or superior survival prediction with substantially improved biological interpretability and reduced computational cost, with interpretability claims validated through fold-stratified rank-based population-level analysis. Our source code, model weights, and Reactome pathways, together with a unified codebase reimplementing all multimodal survival baselines under identical preprocessing and evaluation, are available at: https://github.com/AmayaGS/ProtoPathway.

Subjects: Computer Vision and Pattern Recognition, Quantitative Methods, Tissues and Organs

Publish: 2026-05-20 17:43:43 UTC


#6 HiRes: Inspectable Precedent Memory for Reaction Condition Recommendation

Authors: Shreyas Vinaya Sathyanarayana, Raja Sekhar Pappala, Deepak Warrier

Reaction condition recommendation sits immediately after retrosynthetic disconnection selection, and in practice, chemists require both accurate predictions and the precedents that justify them. We present HiRes (Hierarchical Reaction Representations), a retrieval-augmented condition recommendation system whose learned reaction space serves as both a classifier feature and an inspectable precedent memory. The model combines a graph encoder, transformation-aware cross-attention, multi-stream reaction fusion, and a k-NN retrieval layer. HiRes achieves state-of-the-art performance among primary-slot USPTO-Condition models, reaching Catalyst, Solvent, and Reagent top-1 accuracies (Acc@1) of 0.929, 0.534, and 0.530 respectively. It ties the best reported baseline on Catalyst while outperforming models such as REACON on Solvent and Reagent. Furthermore, paired bootstrap analysis demonstrates that integrating retrieval with learned condition heads provides statistically significant gains for solvent and reagent selection over purely parametric approaches. Ultimately, HiRes bridges the gap between predictive accuracy and chemical interpretability, offering a single representation that supplies both competitive recommendations and the concrete chemical precedents necessary for practical synthesis planning.

Subjects: Machine Learning, Artificial Intelligence, Molecular Networks

Publish: 2026-05-20 17:14:46 UTC


#7 How hate spreads online and why it returns: Re-entrant phases driven by collective behavior

Authors: Chen Xu, Pak Ming Hui, Chenkai Xia, Neil F. Johnson

The 2025 Bondi Beach mass-shooting was perpetrated by individuals inspired by ISIS (Islamic State) propaganda that increasingly featured anti-Semitic hate content following the October 2023 start of the Israel-Palestine war. Similar stories hold for other types of hate attacks, e.g. against Muslims on May 18, 2026. There is an urgent need to get ahead of future threats by understanding how and when a newly created piece of hate content will spread system-wide online. We present a two-species coalescence-fragmentation model with Susceptible-Infected-Recovered dynamics that incorporates the following published empirical features: (1) New pieces of hate content tend to be generated and promoted by a subset of in-built communities on less regulated platforms. (2) These 'hate' communities create links (hyperlinks) with each other and with non-hate communities across all platforms to form dynamically evolving clusters (i.e. coalescence) across which new hate content can then spread. (3) These clusters can get broken up by moderator shutdowns (i.e. fragmentation). We present numerical solutions and derive two levels of approximate mean-field theory: Effective Medium Theory (EMT) and Beyond Effective Medium Theory (BEMT). Both numerical and analytic solutions reveal that system-wide spreading is governed by re-entrant threshold phases: as the fraction of hate communities varies, the system can transition from spreading to no-spreading and back to spreading. The derived analytic formulae give explicit insight into how these phase boundaries might be manipulated to prevent system-wide spreading. More broadly, the re-entrant phase behavior warns that policies which steadily reduce the number of hate communities can initially succeed but then backfire if pushed further, suggesting that blanket requirements for platforms to simply do 'more' are over-simplistic.

Subjects: Physics and Society, General Economics, Adaptation and Self-Organizing Systems, Populations and Evolution

Publish: 2026-05-20 13:01:56 UTC


#8 Modeling Temporal scRNA-seq Data with Latent Gaussian Process and Optimal Transport

Authors: Mehmet Yigit Balik, Harri Lähdesmäki

Single-cell RNA sequencing provides insights into gene expression at single-cell resolution, yet inferring temporal processes from these static snapshot measurements remains a fundamental challenge. Current approaches utilizing neural differential equations and flows are sensitive to overfitting and lack careful consideration of biological variability. In this work, we propose a generative framework that models population trends using a latent heteroscedastic Gaussian process (GP) approximated by Hilbert space methods. To address the absence of genuine cell trajectories, we leverage an optimal transport (OT) objective that aligns generated and observed population distributions. Our method explicitly captures biological heterogeneity by incorporating cell-specific latent time and cell type conditioning to disentangle temporal asynchrony and trajectories to different cell types. We demonstrate state-of-the-art performance on complex interpolation and extrapolation benchmarks and introduce a novel gradient-based strategy for inferring perturbation trajectories.

Subjects: Machine Learning, Genomics

Publish: 2026-05-20 10:24:51 UTC


#9 Training distribution determines the ceiling of drug-blind cancer sensitivity prediction

Author: Taekyung Heo

Precision oncology requires predicting which drugs will suppress a specific tumor from its molecular profile, but drug-blind sensitivity prediction has plateaued despite increasingly complex drug representations. Here we show that this stagnation reflects a metric artifact rather than a representational bottleneck. The standard benchmark, global Pearson r, is dominated by between-drug potency differences that a trivial drug-mean predictor captures without any cell-specific learning. Per-drug Pearson r, which isolates within-drug cell ranking, reveals that no drug encoding improves over cell-only features across four independent datasets. A controlled experiment channeling mechanism-of-action identity as either a drug feature or a training-distribution constraint identifies the cause. Supplying MoA as a feature yields negligible benefit, whereas using it to stratify training raises per-drug r substantially for targeted kinase inhibitors, because pan-cancer co-training suppresses pathway-specific sensitivity signals. Mechanism-stratified training and response matching from pilot observations provide two deployable strategies that together recover the principal sources of predictive gain in drug-blind sensitivity prediction.
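The metric argument in the abstract above can be sketched on toy data. This is an illustration under stated assumptions, not the paper's benchmark: the drug names and response values are hypothetical, and the drug-mean predictor is computed in-sample for simplicity. It shows a predictor with no cell-specific signal scoring a high global Pearson r while its per-drug r is undefined, because a constant prediction within a drug ranks no cell line above another.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns None when either input has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Hypothetical responses: drug -> sensitivity measured on four cell lines.
observed = {
    "drug_A": [1.0, 3.0, 2.0, 2.0],
    "drug_B": [10.0, 12.0, 11.0, 11.0],
    "drug_C": [20.0, 22.0, 21.0, 21.0],
}

# Trivial predictor: every cell line receives the drug's mean response.
predicted = {d: [sum(v) / len(v)] * len(v) for d, v in observed.items()}

# Global r pools every (drug, cell line) pair into one correlation.
all_obs = [y for d in observed for y in observed[d]]
all_pred = [p for d in observed for p in predicted[d]]
global_r = pearson(all_pred, all_obs)

# Per-drug r scores the ranking of cell lines within each drug separately.
per_drug_r = {d: pearson(predicted[d], observed[d]) for d in observed}

print(global_r)    # high, driven entirely by between-drug potency differences
print(per_drug_r)  # None for every drug: no within-drug ranking at all
```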

Subjects: Machine Learning, Quantitative Methods

Publish: 2026-05-20 08:24:56 UTC


#10 Inferring infectiousness: a joint model of the within-host viral kinetics of SARS-CoV-2

Authors: Christopher B. Boyer, Stephen M. Kissler, Seran Hakki, Jakob Jonnerby, Ajit Lalvani, Marc Lipsitch

During an infectious disease outbreak, providing accurate answers to policy questions about transmission requires a detailed model of the natural history of infectiousness. Unfortunately, direct measures of infectiousness are generally unavailable. Instead, we often rely on indirect proxies, such as viral load measured by PCR or antigen tests, viral culture to detect replication-competent virus, or symptom onset, each of which reflects different aspects of viral dynamics or host response. However, these proxies vary in terms of the ease of collection, scalability, and their relationship to viral shedding and therefore underlying infectiousness. Here, we use data from five prospective, densely sampled cohorts with longitudinal data on multiple proxies of viral shedding for approximately 2,000 infections to develop a Bayesian joint model for the within-host viral kinetics of SARS-CoV-2 infection. Modeling the joint distribution allows us to infer the trajectory of infectious virus shedding -- the most direct correlate of infectiousness -- for individuals who contribute only PCR data, and to compute derived quantities that are inaccessible from any single proxy alone. These include the population-level probability and expected duration of ongoing infectiousness as a function of time since diagnosis, stratified by variant, vaccination status, and infection history; the residual risk of releasing an individual from isolation; and personalized, real-time estimates of infectiousness that are sequentially updated as new test results become available.

Subjects: Methodology, Populations and Evolution, Quantitative Methods, Applications

Publish: 2026-05-20 04:39:53 UTC


#11 Machine-Learning-Enhanced Non-Invasive Testing for MASLD Fibrosis: Shallow-Deep Neural Networks Versus FIB-4, Tabular Foundation Models, and Large Language Models

Authors: Athanasios Angelakis, Gabriele De Vito, Eleni-Myrto Trifylli, Filomena Ferrucci

Advanced fibrosis is a major determinant of liver-related morbidity in metabolic dysfunction-associated steatotic liver disease (MASLD). FIB-4 is widely used as a first-line non-invasive test, but its fixed formula may underuse diagnostic information contained in age, aspartate aminotransferase, alanine aminotransferase, and platelet count. We evaluated whether machine-learning-enhanced non-invasive testing (MLE-NIT) can improve advanced fibrosis detection while preserving this FIB-4 variable space. We used three biopsy-confirmed MASLD cohorts from China, Malaysia, and India (n=784). The Chinese cohort was split into 486 training and 54 internal validation/tuning patients; final performance was reported only on the Malaysian and Indian external cohorts. Models used five variables: age, FIB-4, aspartate aminotransferase, platelet count, and alanine aminotransferase. We compared FIB-4 with a shallow-deep neural network (s-DNN), TabPFN, and gpt-4o-2024-08-06. FIB-4 achieved external ROC-AUCs of 0.75 and 0.60 in Malaysia and India, respectively. TabPFN achieved 0.69 and 0.66, fine-tuned GPT-4o achieved 0.75 and 0.63, and the s-DNN achieved 0.77 and 0.67, respectively. The s-DNN contained only 354 trainable parameters, compared with 7,244,554 for TabPFN, yet provided a more balanced external operating profile. Calibration showed s-DNN Brier scores of 0.18 and 0.22, and permutation importance identified AST and FIB-4 as dominant variables. Compact non-linear MLE-NITs may enhance FIB-4-based fibrosis assessment without increasing clinical data requirements.
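For readers unfamiliar with the "fixed formula" the abstract above refers to, the sketch below computes the standard FIB-4 index, (age × AST) / (platelets × √ALT), and assembles the five-variable input the abstract lists. The formula is the widely published definition rather than something stated in the abstract; the function name, argument names, and patient values are hypothetical and chosen only for easy arithmetic.

```python
from math import sqrt


def fib4(age_years, ast_u_per_l, alt_u_per_l, platelets_1e9_per_l):
    """Standard FIB-4 index: (age * AST) / (platelets * sqrt(ALT))."""
    return (age_years * ast_u_per_l) / (platelets_1e9_per_l * sqrt(alt_u_per_l))


# Hypothetical patient (illustrative values, not from any cohort).
age, ast, alt, platelets = 50, 40, 25, 200

score = fib4(age, ast, alt, platelets)

# The five model inputs named in the abstract: age, FIB-4, AST, platelets, ALT.
features = [age, score, ast, platelets, alt]

print(score)
```

Because FIB-4 is a deterministic function of the other four inputs, a model given all five can only add value by combining them nonlinearly in a way the fixed ratio does not.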

Subjects: Machine Learning, Artificial Intelligence, Quantitative Methods

Publish: 2026-05-19 21:51:02 UTC


#12 Sparse Contextual Coupling Reshapes Diffusion Geometry in Multilayer Hypergraphs

Authors: Hao Ding, Sanjukta Krishnagopal

Many complex systems combine dense background structure with sparse contextual information. We introduce a diffusion-based framework for analyzing how sparse condition-specific layers reshape diffusion geometry in multilayer hypergraphs. Each layer is represented as a weighted hypergraph, layers are coupled through shared entities, and random walks on the coupled system induce multiscale diffusion distances between nodes. We apply the framework to disease-conditioned gene networks by coupling a dense MSigDB functional gene-set layer to sparse disease-specific DGIdb drug-gene hypergraphs, with disease-associated drugs selected from DDDB and HumanNet-GSP used to define external gene weights. Across Bipolar Disorder, Schizophrenia, Leukemia, and Breast Cancer, the disease-specific layer contains less than 2 percent of genes in the coupled system, yet substantially changes diffusion distances and community structure. Centrality analysis suggests that this disproportionate effect arises because DGIdb-associated genes occupy influential positions in the MSigDB-derived functional network. The resulting diffusion-derived communities are stable under subsampling and show coherent post hoc functional enrichment, including signaling and neurotransmission categories in neuropsychiatric diseases and immune, translational, and metabolic categories in cancer-associated diseases. Community-level comparisons further reveal disease similarities not reducible to direct DGIdb gene overlap, including a Breast Cancer-Schizophrenia relationship consistent with recent biomedical evidence. These results show that sparse contextual layers can induce interpretable nonlocal changes in higher-order network geometry.

Subjects: Physics and Society, Social and Information Networks, Quantitative Methods

Publish: 2026-05-19 20:06:54 UTC


#13 Artificial Pancreas Implantables -- How Healthcare Professionals May Deal With DIY Bio Cases

Authors: Austin James, Xavier-Lewis Palmer, Lucas Potter, Celisha Oscar

Automated insulin delivery (AID) and artificial pancreas systems increasingly serve as safety-critical cyber-physical technologies in clinical care, integrating sensors, algorithms, software, and insulin-delivery hardware to automate a life-sustaining therapy. While regulated commercial systems are supported by formal approval pathways, manufacturer governance, and post-market surveillance, clinicians are also encountering patients who rely on do-it-yourself (DIY) artificial pancreas systems that operate outside conventional regulatory and institutional control structures. This paper examines how routine clinical handling practices intersect with cyberbiosecurity risk across both regulated and DIY AID systems. When insulin delivery systems are fundamentally reconfigured into a bespoke AID system, with the patient-user becoming the primary threat vector by assuming manufacturer-level roles without mandated governance, the entire ecosystem of stakeholders is placed in legal and clinical uncertainty.

Subjects: Cryptography and Security, Computers and Society, Tissues and Organs

Publish: 2026-04-11 18:57:58 UTC


#14 Perceptual misalignment of texture representations in convolutional neural networks

Authors: Ludovica de Paolis, Fabio Anselmi, Alessio Ansuini, Eugenio Piasini

Mathematical modeling of visual textures traces back to Julesz's intuition that texture perception in humans is based on local correlations between image features. An influential approach for texture analysis and generation generalizes this notion to linear correlations between the nonlinear features computed by convolutional neural networks (CNNs), compiled into Gram matrices. Given that CNNs are often used as models for the visual system, it is natural to ask whether such "texture representations" spontaneously align with the textures' perceptual content, and in particular whether those CNNs that are regarded as better models for the visual system also possess more human-like texture representations. Here we compare the perceptual content captured by feature correlations computed for a diverse pool of CNNs, and we compare it to the models' perceptual alignment with the mammalian visual system as measured by Brain-Score. Surprisingly, we find that there is no connection between conventional measures of CNN quality as a model of the visual system and its alignment with human texture perception. We conclude that texture perception involves mechanisms that are distinct from those that are commonly modeled using approaches based on CNNs trained on object recognition, possibly depending on the integration of contextual information.
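The Gram-matrix "texture representation" mentioned in the abstract above can be sketched in a few lines. This is a toy under stated assumptions, not the authors' pipeline: the feature maps are hypothetical hand-written values rather than CNN activations, and dividing by the number of spatial positions is one common normalization among several. The example also shows why such a summary suits texture: it depends on which channels co-activate, not on where in the image they do so.

```python
def gram_matrix(feature_maps):
    """Gram matrix of channel activations.

    feature_maps: list of channels, each a flat list of activations over
    spatial positions. Entry [i][j] is the dot product of channels i and j,
    averaged over positions.
    """
    n_pos = len(feature_maps[0])
    return [
        [sum(a * b for a, b in zip(ci, cj)) / n_pos for cj in feature_maps]
        for ci in feature_maps
    ]


# Hypothetical 2-channel feature map over 4 spatial positions.
features = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 2.0, 0.0, 2.0],
]

# Same activations with the spatial positions permuted identically
# in both channels (positions 0<->1 and 2<->3 swapped).
shuffled = [
    [0.0, 1.0, 0.0, 1.0],
    [2.0, 0.0, 2.0, 0.0],
]

g = gram_matrix(features)
g_shuffled = gram_matrix(shuffled)

print(g)
print(g == g_shuffled)  # spatial arrangement is discarded
```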

Subject: Computer Vision and Pattern Recognition

Publish: 2026-04-01 19:51:45 UTC