2026-05-18 | | Total: 9
Extreme heat events are increasing in frequency and intensity under climate change, but the socio-behavioral mechanisms that shape community resilience remain insufficiently understood. This study uses a Large Language Model-enhanced agent-based model to simulate responses to a prolonged heatwave in a virtual society. One hundred heterogeneous agents were assigned a Heat Vulnerability Index based on demographic risk factors and observed over 13 simulated days covering baseline, heatwave, and recovery periods. The simulation shows that heat-related impacts are primarily psychosocial and unequally distributed. Agents with higher vulnerability experienced larger declines in perceived safety and social connection than agents with lower vulnerability. Vulnerability also shaped adaptive capacity. More resilient agents maintained routine self-care and protective behaviors, whereas highly vulnerable agents showed behavioral constriction, marked by reduced engagement in protective actions. At the collective level, risk-information diffusion followed a pattern of complex contagion, with adoption driven more by repeated social reinforcement within cohesive networks than by broad exposure alone. These findings suggest that LLM-enhanced simulation can help identify behavioral and social mechanisms of climate resilience and inform heat-risk interventions that combine targeted support for vulnerable groups with community-based information pathways.
Biofilms, bacterial cells surrounded by a self-produced polymeric matrix, are common on medical devices and lead to many hospital infections. The biofilm lifecycle includes disassembly and dispersion, where bacterial clusters detach from the biofilm, circulate in the bloodstream, and potentially colonize secondary infection sites. Existing models often simplify detachment to a function of biofilm thickness or extracellular polymeric substance (EPS) density, without tracking properties of detached clusters that impact their biological fate, including cluster size and morphology. Addressing this gap, our detachment model accounts for drag and adhesion in tagged sections of the biofilm determined by the cluster geometry and local arrangement of bacteria and EPS. A stickiness parameter controls local EPS adhesion strength, which is modulated to disrupt (or compromise) EPS biomass. We specifically model the detachment of clusters from a Staphylococcus epidermidis biofilm grown for 24 hours. Experimental data for biofilm microstructural features are used to benchmark the simulated biofilm, which is then subjected to different EPS disruption levels. We examine parameters that influence detached biofilm cell cluster frequency, size, and shape, providing mechanistic insights into how compromised EPS influences detachment dynamics. This integrated modeling framework is a significant advance in the predictive capabilities for biofilm detachment processes.
Biologically-inspired AI agent frameworks claim reliability benefits through structural guarantees adapted from gene regulatory networks, immune systems, and metabolic control. These claims are rarely tested empirically against simpler alternatives. We present three deep benchmarks: metabolic priority gating, autoinducer-based quorum sensing, and Bayesian stagnation detection, each comparing a biologically-grounded implementation against a naive non-biological alternative and an ablated control, across 1,000 trials per seed and 10 seeds (10M+ data points total).
Clustered dental data commonly arise when multiple teeth or tooth sites are observed within the same individual. In such settings, the number of observed units within a cluster may be informative, since tooth loss and missing measurements often reflect underlying oral health status. Standard marginal association measures may therefore be biased when larger or smaller clusters contribute disproportionate information. This paper develops weighted estimators for marginal association between paired tooth-level outcomes in the presence of informative cluster size and informative within-cluster subgroup structure. The proposed approach extends the logic of within-cluster resampling and cluster-weighted estimating equations to paired bivariate outcomes by constructing weights that balance contributions across clusters, observed marginal categories, and observed paired categories. Weighted estimating equations are used to estimate moment, rank, and cell-probability functionals, yielding clustered-data analogues of Pearson, Spearman, and phi association measures. Sandwich variance estimators and delta-method standard errors are derived for inference. Simulation studies assess finite-sample bias, standard error estimation, and coverage under varying sources of cluster-level and unit-level dependence, as well as outcome-dependent observation mechanisms. The methods are illustrated using tooth-level periodontal and caries outcomes from NHANES, where informative subgroup-size diagnostics indicate that the observed distribution of disease severity is not independent of within-mouth structure. The proposed estimators provide a principled basis for estimating marginal oral-health associations for a typical tooth from a typical individual, while reducing bias induced by informative tooth retention and subgroup composition.
Predicting drug-induced cellular state changes at single-cell resolution remains a central challenge in virtual cell modeling, particularly under out-of-distribution (OOD) conditions. Current approaches predominantly rely on RNA-based assays, which often fail to adequately capture the diverse cellular states underlying drug responses. Moreover, conditional distribution shifts and low signal-to-noise ratios frequently cause models to learn spurious correlations rather than genuine state transitions. To address these limitations, we introduce StateXDiff, a cell State-contextualized multimodal (X) Diffusion framework for predicting single-cell responses to drug perturbations. The framework operates sequentially: first, it learns a disentangled, multimodal representation of cellular state by integrating transcriptomic profiles with inferred protein features; second, it employs a conditional diffusion model to generate perturbation-specific changes. Our approach introduces a Virtual Multimodal Cell State, which augments RNA-based representations with protein-level context, and a Mechanism-aware Drug-Gene Template, which consolidates multi-source biological knowledge for accurate drug representation. Generation is driven by a latent-space diffusion Transformer, regularized through quality-aware triplet constraints, including positive drug-protein pairs or protein-drug mismatched pairs, and explicit protein-reliability weighting. Extensive evaluation demonstrates that StateXDiff consistently enhances generalization performance across three challenging settings: unseen cell lines, unseen drugs, and combinatorial perturbations.
Turing patterns are a cornerstone of biological self-organization, yet their emergence typically requires finely tuned parameters occupying narrow regions of high-dimensional space. This poses a fundamental challenge: how can evolving biological systems reliably find and exploit such rare conditions? In this work, we propose that common biochemical limit cycles, such as those arising from genetic feedback loops, can act as natural explorers of Turing space. By coupling a reaction-diffusion system to an orbit that modulates some of its parameters, we show that the system can dynamically sweep through Turing-permissive regimes and generate transient spatial patterns. We use an entropy-based measure in Fourier space to quantify pattern formation and demonstrate how cycles enhance the detectability and robustness of Turing islands. We further explore how coupling to positional gradients increases reproducibility, suggesting a route from oscillatory dynamics to stable developmental programs. Our results highlight a powerful mechanism by which nature might bootstrap complex spatial structure from simple temporal motifs.
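As a rough illustration of what an entropy-based measure in Fourier space can look like, the sketch below computes a normalized Shannon entropy of a 2D field's power spectrum. The function name `spectral_entropy`, the mean subtraction, the normalization by the log of the number of modes, and the flat-field convention are all assumptions made for this example; the abstract does not specify the authors' exact definition.

```python
import numpy as np

def spectral_entropy(field: np.ndarray) -> float:
    """Normalized Shannon entropy of the 2D power spectrum (hypothetical helper).

    Values near 0 mean the power sits in a few Fourier modes (a regular
    pattern); values near 1 mean the power is spread over many modes.
    """
    # Subtract the mean so the uniform (zero-frequency) mode does not dominate.
    power = np.abs(np.fft.fft2(field - field.mean())) ** 2
    total = power.sum()
    if total == 0:
        # Assumed convention: a perfectly flat field has no dominant mode.
        return 1.0
    p = power.ravel() / total
    p = p[p > 0]  # skip empty modes, since 0 * log(0) is taken as 0
    return float(-(p * np.log(p)).sum() / np.log(power.size))

# Usage: stripes concentrate power in two modes, white noise spreads it out.
x = np.arange(64)
stripes = np.sin(2 * np.pi * 4 * x / 64)[None, :] * np.ones((64, 1))
noise = np.random.default_rng(0).standard_normal((64, 64))
print(spectral_entropy(stripes), spectral_entropy(noise))
```

Under this convention, a drop in the entropy over time would flag a transient pattern as the modulated parameters pass through a Turing-permissive regime.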
Online patient inquiries are often informal, incomplete, and written before professional assessment, yet they must still be routed to an appropriate level of clinical follow-up. We study this as a four-class actionable triage task (self-care, schedule-visit, urgent-clinician-review, or emergency-referral) and ask whether prompted large language models (LLMs) can support such routing under low-resource labeling conditions. Using the public HealthCareMagic-100K corpus, we construct a 300-example human-calibrated gold evaluation set, a 700-example auto-labeled silver training set, and a 40-example few-shot pool. We compare Term Frequency-Inverse Document Frequency (TF-IDF) and Bidirectional Encoder Representations from Transformers for Biomedical Text Mining (BioBERT) baselines trained on silver labels against six prompted LLMs under 0-shot, 4-shot, and 12-shot conditions. We evaluate with macro-$F_1$ alongside safety-aware metrics, including emergency-recall, under-triage rate, and severe under-triage rate. The strongest LLM (Claude Haiku 4.5, 12-shot) reaches macro-$F_1$ 0.475, exceeding the best supervised baseline (BioBERT, 0.378) on point estimate, with overlapping confidence intervals. Few-shot prompting and two-model agreement help in label-dependent ways: self-care agreement is reliable, urgent-clinician-review agreement is not. We conclude that LLMs can support triage prioritization and selective human review, but not autonomous deployment.
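The evaluation metrics named above can be sketched as follows. Macro-$F_1$ is the standard unweighted mean of per-class $F_1$. The definitions used here for under-triage (prediction at least one level below the gold label) and severe under-triage (at least two levels below) are assumptions for illustration, since the abstract does not define them, and the function names are hypothetical.

```python
from typing import Sequence

# Triage levels ordered from least to most urgent, as listed in the abstract.
LEVELS = ["self-care", "schedule-visit", "urgent-clinician-review", "emergency-referral"]
RANK = {name: i for i, name in enumerate(LEVELS)}

def macro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over the four triage levels."""
    scores = []
    for label in LEVELS:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def emergency_recall(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Share of gold emergency-referral cases predicted as emergency-referral."""
    total = sum(g == "emergency-referral" for g in gold)
    hits = sum(g == p == "emergency-referral" for g, p in zip(gold, pred))
    return hits / total if total else 0.0

def under_triage_rate(gold: Sequence[str], pred: Sequence[str], min_gap: int = 1) -> float:
    """Share of cases predicted at least `min_gap` levels below the gold label.

    Assumed definition: min_gap=1 gives under-triage, min_gap=2 gives severe
    under-triage.
    """
    misses = sum(RANK[g] - RANK[p] >= min_gap for g, p in zip(gold, pred))
    return misses / len(gold)

# Usage on a toy example (not data from the study).
gold = ["self-care", "schedule-visit", "urgent-clinician-review", "emergency-referral"]
pred = ["self-care", "self-care", "urgent-clinician-review", "schedule-visit"]
print(macro_f1(gold, pred), under_triage_rate(gold, pred), under_triage_rate(gold, pred, 2))
```

Reporting the asymmetric under-triage rates next to macro-$F_1$ separates errors that delay care from errors that merely over-escalate.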
Inferring the structure of directed acyclic graphs (DAGs) from data is a central challenge in causal discovery, particularly in modern high-dimensional settings where large-scale interventional data are increasingly available. While interventional data can improve identifiability, existing methods remain limited by soft acyclicity constraints, leading to optimization over invalid cyclic graphs, numerical instability, and reduced scalability. We introduce PACER (Perturbation-driven Acyclic Causal Edge Recovery), a scalable framework for causal discovery that guarantees acyclicity by construction. PACER parameterizes a distribution over DAGs through a joint model of variable permutations and edge probabilities, enabling direct optimization over valid causal structures without surrogate penalties. The framework supports a unified likelihood-based treatment of observational and interventional data, flexible conditional density models, and the incorporation of structural prior knowledge. For linear-Gaussian mechanisms, we derive closed-form expressions for the expected interventional log-likelihood and its gradients, yielding substantial computational gains. Empirically, PACER matches or exceeds state-of-the-art methods on protein signaling and large-scale genetic perturbation benchmarks, while scaling efficiently to networks with thousands of variables and achieving up to two orders of magnitude speedups over penalty-based differentiable approaches. These results demonstrate that exact and scalable causal discovery from high-dimensional perturbation data is achievable through principled search space design.
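To make the idea of acyclicity by construction concrete, the sketch below samples a variable ordering and then keeps only edges that point forward in that ordering, so every sample is a valid DAG without any penalty term. This is an illustrative toy, not PACER itself: the Gumbel-perturbed sort used to draw the permutation, the independent Bernoulli edges, and the names `sample_dag` and `is_acyclic` are all assumptions.

```python
import numpy as np

def sample_dag(order_scores: np.ndarray, edge_probs: np.ndarray,
               rng: np.random.Generator) -> tuple[np.ndarray, np.ndarray]:
    """Sample an adjacency matrix that is acyclic by construction (toy sketch)."""
    d = len(order_scores)
    # Assumed permutation model: sort scores after adding Gumbel noise.
    order = np.argsort(-(order_scores + rng.gumbel(size=d)))
    position = np.empty(d, dtype=int)
    position[order] = np.arange(d)  # position[i] = rank of variable i
    # An edge i -> j is allowed only if i comes before j in the ordering.
    allowed = position[:, None] < position[None, :]
    edges = rng.random((d, d)) < edge_probs  # independent Bernoulli draws
    return (edges & allowed).astype(int), order

def is_acyclic(adj: np.ndarray) -> bool:
    """A d-node graph is a DAG exactly when its adjacency matrix satisfies A^d = 0."""
    return not np.linalg.matrix_power(adj, adj.shape[0]).any()

# Usage: every sample passes the check, even with dense edge probabilities.
rng = np.random.default_rng(0)
samples = [sample_dag(np.zeros(5), np.full((5, 5), 0.9), rng)[0] for _ in range(20)]
print(all(is_acyclic(a) for a in samples))
```

Because the mask removes every backward edge and every self-loop, the search never visits a cyclic graph, which is the property that soft acyclicity penalties only approximate.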
When reliable target structures are unavailable at scale or phenotypes arise from dysregulated pathways, transcriptomic perturbations provide a system-level functional readout for drug action. In this work, we formalize \emph{Transcriptome-based Drug Design (TBDD)} as a generative inverse problem: designing drug molecules conditioned on desired transcriptomic state transitions. We analyze the inherently ill-posed nature of this task, which is further complicated by the profound domain gap between biology and chemistry and by the sparsity of transcriptomic signals. To address these challenges, we propose \textbf{\themodel{}} (A \textbf{C}ell\textbf{U}lar \textbf{R}esponse \textbf{E}ngine), a multi-resolution transcriptome-guided diffusion framework. \themodel{} features a specialized \textbf{Transcriptome Perturbation Functional Feature Extractor (TFE)} that (1) distills function-oriented perturbation embeddings from pre/post states, (2) aligns these signatures to dual chemical views to bridge the cross-modal gap, and (3) performs heterogeneity-aware aggregation to extract robust state-specific signals from noisy transcriptomic data. Extensive evaluations on both standard benchmarks and rigorous out-of-distribution protocols demonstrate that \themodel{} consistently outperforms strong baselines in structural quality and functional consistency. Furthermore, we validate its practical utility via a zero-shot gene-inhibitor design task, highlighting the potential of phenotype-driven generative discovery.