2025-07-31 | | Total: 15
We introduce a probabilistic model of early visual processing, beginning with the interaction between a light wavefront and the retina. We argue that perception originates not with deterministic transduction, but with probabilistic threshold crossings shaped by quantum photon arrival statistics and biological variability. We formalize this with an uncertainty relation, Δα⋅Δt≥η, that applies as light is transformed into symbolic neural code by the layered retinal architecture. Our model is supported by previous experimental results, which show intrinsic variability in retinal responses even under fixed stimuli. We contrast this with a classical null hypothesis of deterministic encoding and propose experiments to further test our uncertainty relation. By re-framing the retina as a probabilistic measurement device, we lay the foundation for future models of cortical dynamics rooted in quantum-like computation. We are not claiming that the brain could be working as a quantum system, but rather putting forth the argument that the brain as a classical system could still implement quantum-inspired computations. We define quantum-inspired computation as a scheme that includes both probabilistic and time-sensitive computation, clearly separating it from classically implementable probabilistic systems.
Depression and suicidality profoundly impact cognition and emotion, yet objective neurophysiological biomarkers remain elusive. We investigated the spatiotemporal neural dynamics underlying affective semantic processing in individuals with varying levels of clinical severity of depression and suicidality using multivariate decoding of electroencephalography (EEG) data. Participants (N=137) completed a sentence evaluation task involving emotionally charged self-referential statements while EEG was recorded. We identified robust neural signatures of semantic processing, with peak decoding accuracy between 300 and 600 ms -- a window associated with automatic semantic evaluation and conflict monitoring. Compared to healthy controls, individuals with depression and suicidality showed earlier onset, longer duration, and greater amplitude decoding responses, along with broader cross-temporal generalization and increased activation of frontocentral and parietotemporal components. These findings suggest altered sensitivity and impaired disengagement from emotionally salient content in the clinical groups, advancing our understanding of the neurocognitive basis of mental health and providing a principled basis for developing reliable EEG-based biomarkers of depression and suicidality.
Explaining the emergence of self-organized biodiversity and species abundance distribution patterns remains a fundamental challenge in ecology. While classical frameworks, such as neutral theory and models based on pairwise species interactions, have provided valuable insights, they often neglect higher-order interactions (HOIs), whose role in stabilizing ecological communities is increasingly recognized. Here, we extend the Generalized Lotka-Volterra framework to incorporate HOIs and demonstrate that these interactions can enhance ecosystem stability and prevent collapse. Our model exhibits a diverse range of emergent dynamics, including self-sustained oscillations, quasi-periodic (torus) trajectories, and intermittent chaos. Remarkably, it also reproduces empirical species abundance distributions observed across diverse natural communities. These results underscore the critical role of HOIs in structuring biodiversity and offer a broadly applicable theoretical framework for capturing complexity in ecological systems.
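As an illustration of what extending Generalized Lotka-Volterra dynamics with HOIs can look like, the sketch below assumes the common form dx_i/dt = x_i (r_i + Σ_j A_ij x_j + Σ_jk B_ijk x_j x_k). The abstract does not state the authors' equations, so this functional form, the names `glv_hoi_rhs`, `simulate`, and `random_competitive_community`, the parameter ranges, and the forward-Euler integrator are all assumptions. The example community is purely competitive so that abundances stay bounded; it is not tuned to reproduce the oscillatory or chaotic regimes described above.

```python
import random


def glv_hoi_rhs(x, r, A, B):
    """Right-hand side of an assumed GLV model with third-order interactions.

    x: abundances, r: intrinsic growth rates,
    A[i][j]: pairwise effect of species j on species i,
    B[i][j][k]: joint (higher-order) effect of species j and k on species i.
    """
    n = len(x)
    dx = []
    for i in range(n):
        pairwise = sum(A[i][j] * x[j] for j in range(n))
        higher = sum(B[i][j][k] * x[j] * x[k]
                     for j in range(n) for k in range(n))
        dx.append(x[i] * (r[i] + pairwise + higher))
    return dx


def simulate(x0, r, A, B, dt=0.01, steps=5000):
    """Integrate the model with forward Euler and return the final abundances."""
    x = list(x0)
    for _ in range(steps):
        dx = glv_hoi_rhs(x, r, A, B)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


def random_competitive_community(n, seed=0):
    """Hypothetical parameters: unit growth, unit self-limitation,
    weak competitive pairwise terms and weak competitive HOIs."""
    rng = random.Random(seed)
    r = [1.0] * n
    A = [[-1.0 if i == j else -0.2 * rng.random() for j in range(n)]
         for i in range(n)]
    B = [[[-0.05 * rng.random() for _ in range(n)] for _ in range(n)]
         for i in range(n)]
    return r, A, B


if __name__ == "__main__":
    r, A, B = random_competitive_community(4)
    print(simulate([0.1] * 4, r, A, B))
```

Setting every entry of `B` to zero recovers the pairwise-only model, which gives a simple baseline for comparing trajectories with and without HOIs.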
Biological systems commonly exhibit complex spatiotemporal patterns whose underlying generative mechanisms pose a significant analytical challenge. Traditional approaches to spatiodynamic inference rely on dimensionality reduction through summary statistics, which sacrifice complexity and interdependent structure intrinsic to these data in favor of parameter identifiability. This imposes a fundamental constraint on reliably extracting mechanistic insights from spatiotemporal data, highlighting the need for analytical frameworks that preserve the full richness of these dynamical systems. To address this, we developed a simulation-based inference framework that employs vision transformer-driven variational encoding to generate compact representations of the data, exploiting the inherent contextual dependencies. These representations are subsequently integrated into a likelihood-free Bayesian approach for parameter inference. The central idea is to construct a fine-grained, structured mesh of latent representations from simulated dynamics through systematic exploration of the parameter space. This encoded mesh of latent embeddings then serves as a reference map for retrieving parameter values that correspond to observed data. By integrating generative modeling with Bayesian principles, our approach provides a unified inference framework to identify both spatial and temporal patterns that manifest in multivariate dynamical systems.
A common approach in neuroscience is to study neural representations as a means to understand a system -- increasingly, by relating the neural representations to the internal representations learned by computational models. However, recent work in machine learning (Lampinen, 2024) shows that learned feature representations may be biased to over-represent certain features, and represent others more weakly and less consistently. For example, simple (linear) features may be more strongly and more consistently represented than complex (highly nonlinear) features. These biases could pose challenges for achieving full understanding of a system through representational analysis. In this perspective, we illustrate these challenges -- showing how feature representation biases can lead to strongly biased inferences from common analyses like PCA, regression, and RSA. We also present homomorphic encryption as a simple case study of the potential for strong dissociation between patterns of representation and computation. We discuss the implications of these results for representational comparisons between systems, and for neuroscience more generally.
Spatial transcriptomics (ST) technologies not only offer an unprecedented opportunity to interrogate intact biological samples in a spatially informed manner, but also set the stage for integration with other imaging-based modalities. However, how to best exploit spatial context and integrate ST with imaging-based modalities remains an open question. To address this, particularly under real-world experimental constraints such as limited dataset size, class imbalance, and bounding-box-based segmentation, we used a publicly available murine ileum MERFISH dataset to evaluate whether a minimally tuned variational autoencoder (VAE) could extract informative low-dimensional representations from cell crops of spot counts, nuclear stain, membrane stain, or a combination thereof. We assessed the resulting embeddings through PERMANOVA, cross-validated classification, and unsupervised Leiden clustering, and compared them to classical image-based feature vectors extracted via CellProfiler. While transcript counts (TC) generally outperformed other feature spaces, the VAE-derived latent spaces (LSs) captured meaningful biological variation and enabled improved label recovery for specific cell types. LS2, in particular, trained solely on morphological input, also exhibited moderate predictive power for a handful of genes in a ridge regression model. Notably, combining TC with LSs through multiplex clustering led to consistent gains in cluster homogeneity, a trend that also held when augmenting only subsets of TC with the stain-derived LS2. In contrast, CellProfiler-derived features underperformed relative to LSs, highlighting the advantage of learned representations over hand-crafted features. Collectively, these findings demonstrate that even under constrained conditions, VAEs can extract biologically meaningful signals from imaging data and constitute a promising strategy for multi-modal integration.
Data in biology is redundant, noisy, and sparse. How does the type and scale of available data impact model performance? In this work, we specifically investigate how protein language models (pLMs) scale with increasing pretraining data. We investigate this relationship by measuring the performance of protein function prediction on a suite of pLMs pretrained on yearly snapshots of UniRef100 from 2011 to 2024. We find no evidence of model saturation on this task: performance improves--but not monotonically--with added data, and this trend differs between unsupervised and supervised experiments. Using a well-characterized Beta-Lactamase protein from E. coli, we find that unsupervised model predictions get better year-over-year, though they do not yet consistently perform better than the supervised baseline. Our results underscore the need for targeted data acquisition and deeper study of data scaling in protein modeling. All training, inference, analysis, and visualization code is available at: https://github.com/Align-to-Innovate/data-saturation-and-scaling.
Computational pathology (CPath) has shown great potential in mining actionable insights from Whole Slide Images (WSIs). Deep Learning (DL) has been at the center of modern CPath, and while it delivers unprecedented performance, it is also known that DL may be affected by irrelevant details, such as those introduced during scanning by different commercially available scanners. This may lead to scanner bias, where the model outputs for the same tissue acquired by different scanners may vary. In turn, it hinders the trust of clinicians in CPath-based tools and their deployment in real-world clinical practices. Recent pathology Foundation Models (FMs) promise to provide better domain generalization capabilities. In this paper, we benchmark FMs using a multi-scanner dataset and show that FMs still suffer from scanner bias. Following this observation, we propose ScanGen, a contrastive loss function applied during task-specific fine-tuning that mitigates scanner bias, thereby enhancing the models' robustness to scanner variations. Our approach is applied to the Multiple Instance Learning task of Epidermal Growth Factor Receptor (EGFR) mutation prediction from H&E-stained WSIs in lung cancer. We observe that ScanGen notably enhances the ability to generalize across scanners, while retaining or improving the performance of EGFR mutation prediction.
Background: Symptom rating scales in psychiatry are limited by reliance on self-report and lack of predictive power. Actigraphy, a passive wearable-based method for measuring sleep and physical activity, offers objective, high-resolution behavioral data that may better reflect symptom fluctuations, but most studies have focused on narrow diagnostic groups or fixed time windows, limiting clinical translation. Objective: To examine whether actigraphy-derived sleep and activity features correlate with psychiatric symptom severity in a transdiagnostic psychiatric sample, and to identify which features are most clinically relevant across multiple temporal resolutions. Methods: We present a feasibility case series study with preliminary data from eight outpatients enrolled in the DeeP-DD study, a transdiagnostic study of digital phenotyping. Participants wore GENEActiv actigraphy devices and symptom severity was measured using a variety of validated scales. We performed intra-individual Spearman correlations and inter-individual repeated measures correlations across daily, weekly, monthly, and full-duration averages. Results: Intra-individual analyses revealed that later rise times were significantly associated with higher weekly PHQ-9 scores in participant #7 (ρ = 0.74, P=.0003) and participant #4 (ρ = 0.78, P=.022), as well as higher weekly GAD-7 scores in participant #7 (ρ = 0.59, P=.026). Inter-individual analyses showed that weeks with later average rise time correlated with higher PHQ-9 (r = 0.48, P=.0003) and GAD-7 scores (r = 0.38, P=.032). Increased light physical activity was linked to lower PHQ-9 scores weekly (r = -0.44, P=.001) and monthly (r = -0.53, P=.014). Conclusion: Consistent associations between actigraphy features and symptoms across temporal scales and diagnostic groups underscore their potential utility for scalable, real-world clinical monitoring.
Effective public health decisions require early, reliable inference of infectious disease properties. In this paper we assess the ability to infer infectious disease attributes from population-level stochastic epidemic trajectories. In particular, we construct stochastic Kermack-McKendrick model trajectories, sample them with and without measurement error, and evaluate inversions for the population mean infectiousness as a function of time since infection, the infection duration distribution, and its complementary cumulative distribution, the infection survival distribution. Based on an integro-differential equation formulation we employ a natural regression approach to fit the corresponding integral kernels and show that these disease attributes are recoverable from both un-regularized multi-trajectory inversions and regularized single trajectory inversions. Moreover, we demonstrate that the infection duration distributions (or alternatively the infection survival distributions) and population mean infectiousness kernels recovered can be used to solve for the individual infectiousness profile, the infectiousness of an individual over the duration of their infection. The work suggests that aggressive monitoring of the stochastic evolution of a novel infectious disease outbreak in a single local well-mixed population can allow determination of the underlying disease attributes that characterize its spread.
This work presents BioPykrete, a new sustainable bio-composite material created from ice, nano-crystalline cellulose (CNC), and a tailor-made chimera protein designed to bind the two together. We developed and produced the chimera protein by linking AFPIII, an ice-binding protein, with CBM3a, a CNC-binding protein. As the suspension freezes, the CNC chains self-organize into a reinforcing network between the ice crystals. This structural enhancement limits crack propagation to typical pore sizes, allowing BioPykrete to avoid the brittle and sudden failure commonly associated with ice. Instead, it exhibits an elastic-like response to stress, making it suitable for construction and engineering applications. With compressive strength comparable with concrete, BioPykrete offers a sustainable and biodegradable alternative to construction materials suitable for the harsh arctic regions of the world where traditional methods are ineffective, and resources are scarce. Engineering chimera proteins with specific affinity to more than a single material type may help improve or tailor the properties of other composite materials.
Human memory exhibits significant vulnerability in cognitive tasks and daily life. Comparisons between visual working memory and new perceptual input (e.g., during cognitive tasks) can lead to unintended memory distortions. Previous studies have reported systematic memory distortions after perceptual comparison, but understanding how perceptual comparison affects memory distortions in real-world objects remains a challenge. Furthermore, identifying what visual features contribute to memory vulnerability presents a novel research question. Here, we propose a novel AI-driven framework that generates naturalistic visual stimuli grounded in behaviorally relevant object dimensions to elicit similarity-induced memory biases. We use two types of stimuli -- image wheels created through dimension editing and dimension wheels generated by dimension activation values -- in three visual working memory (VWM) experiments. These experiments assess memory distortions under three conditions: no perceptual comparison, perceptual comparison with image wheels, and perceptual comparison with dimension wheels. The results show that similar dimensions, like similar images, can also induce memory distortions. Specifically, visual dimensions are more prone to distortion than semantic dimensions, indicating that the object dimensions of naturalistic visual stimuli play a significant role in the vulnerability of memory.
Chimeric antigen receptor (CAR) T-cells are T-cells engineered to recognize and kill specific tumor cells. Through their extracellular domains, CAR T-cells bind tumor cell antigens, which triggers CAR T activation and proliferation. These processes are regulated by co-stimulatory domains present in the intracellular region of the CAR T-cell. Through integrating novel signaling components into the co-stimulatory domains, it is possible to modify CAR T-cell phenotype. Identifying and experimentally testing new CAR constructs based on libraries of co-stimulatory domains is nontrivial given the vast combinatorial space defined by such libraries. This leads to a highly data-constrained, poorly explored combinatorial problem, where the experiments undersample all possible combinations. We propose a quantum approach using a Projected Quantum Kernel (PQK) to address this challenge. PQK operates by embedding classical data into a high-dimensional Hilbert space and employs a kernel method to measure sample similarity. Using 61 qubits on a gate-based quantum computer, we demonstrate the largest PQK application to date and an enhancement in the classification performance over purely classical machine learning methods for CAR T cytotoxicity prediction. Importantly, we show improved learning for specific signaling domains and domain positions, particularly where there was less information, highlighting the potential for quantum computing in data-constrained problems.
The positioning of new cellular walls during cell division plays a key role in shaping plant tissue organization. The influence of cell geometry on the positioning of division planes has previously been captured in various geometrical rules. Accordingly, linking cell shape to division orientation has relied on the comparison between observed division patterns and predictions under specific rules. The need to define a priori the tested rules is a fundamental limitation of this hypothesis-driven approach. As an alternative, we introduce a data-based approach to investigate the relation between cell geometry and division plane positioning, exploiting the ability of deep neural networks to learn complex relationships across multidimensional spaces. Adopting an image-based cell representation, we show how division patterns can be learned and predicted from mother cell geometry using a UNet architecture modified to operate on cell masks. Using synthetic data and A. thaliana embryo cells, we evaluate the model performance on a wide range of diverse cell shapes and division patterns. We find that the trained model accounted for embryo division patterns that were previously irreconcilable under existing geometrical rules. Our work shows the potential of deep networks to understand cell division patterns and to generate new hypotheses on the control of cell division positioning.
We propose a biologically inspired model of spiking neurons based on the dynamics of a damped, driven pendulum. Unlike traditional models such as the Leaky Integrate-and-Fire (LIF) neurons, the pendulum neuron incorporates second-order, nonlinear dynamics that naturally give rise to oscillatory behavior and phase-based spike encoding. This model captures richer temporal features and supports timing-sensitive computations critical for sequence processing and symbolic learning. We present an analysis of single-neuron dynamics and extend the model to multi-neuron layers governed by Spike-Timing Dependent Plasticity (STDP) learning rules. We demonstrate practical implementation with Python code and with the Brian2 spiking neural simulator, and outline a methodology for deploying the model on neuromorphic hardware platforms, using an approximation of the second-order equations. This framework offers a foundation for developing energy-efficient neural systems for neuromorphic computing and sequential cognition tasks.
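To make the idea of a pendulum-based spiking unit concrete, here is a minimal sketch of a single damped, driven pendulum integrated in plain Python. It assumes the textbook equation θ'' + γθ' + ω0² sin θ = I with a constant drive I, and it assumes a spike is emitted whenever the phase θ passes π, after which θ is wrapped back by 2π. The abstract specifies neither the equation nor the spike rule, so both, along with the function name `simulate_pendulum_neuron` and all parameter values, are illustrative assumptions rather than the authors' Python or Brian2 implementation.

```python
import math


def simulate_pendulum_neuron(drive, gamma=0.5, omega0=1.0,
                             dt=0.001, t_end=50.0, threshold=math.pi):
    """Integrate theta'' + gamma*theta' + omega0**2*sin(theta) = drive.

    Uses semi-implicit Euler: velocity is updated first, then the phase.
    Assumed spike rule: record a spike time when theta reaches `threshold`,
    then wrap theta by 2*pi so the phase stays on one turn of the circle.
    Returns the list of spike times.
    """
    theta, omega = 0.0, 0.0
    spikes = []
    steps = int(t_end / dt)
    for k in range(steps):
        acc = drive - gamma * omega - omega0 ** 2 * math.sin(theta)
        omega += acc * dt
        theta += omega * dt
        if theta >= threshold:
            spikes.append((k + 1) * dt)
            theta -= 2.0 * math.pi
    return spikes


if __name__ == "__main__":
    # A weak drive settles into a fixed phase; a strong drive keeps rotating.
    print(len(simulate_pendulum_neuron(drive=0.5)))
    print(len(simulate_pendulum_neuron(drive=2.0)))
```

Under these assumptions, a drive below ω0² lets the pendulum settle at a fixed phase without spiking, while a drive above ω0² removes the equilibrium and produces repeated phase wraps, so spike timing depends on both the phase and the velocity state.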