2025-03-06 | | Total: 5
There is growing recognition in both the experimental and modelling literature of the importance of spatial structure to the dynamics of viral infections in tissues. Aided by the evolution of computing power and motivated by recent biological insights, there has been an explosion of new, spatially-explicit models for within-host viral dynamics in recent years. This development has only been accelerated in the wake of the COVID-19 pandemic. Spatially-structured models offer improved biological realism and can account for dynamics which cannot be well-described by conventional, mean-field approaches. However, despite their growing popularity, spatially-structured models of viral dynamics are underused in biological applications. One major obstacle to the wider application of such models is the huge variety in approaches taken, with little consensus as to which features should be included and how they should be implemented for a given biological context. Previous reviews of the field have focused on specific modelling frameworks or on models for particular viral species. Here, we instead apply a scoping review approach to the literature of spatially-structured viral dynamics models as a whole to provide an exhaustive update of the state of the field. Our analysis is structured along two axes, methodology and viral species, in order to examine the breadth of techniques used and the requirements of different biological applications. We then discuss the contributions of mathematical and computational modelling to our understanding of key spatially-structured aspects of viral dynamics, and suggest key themes for future model development to improve robustness and biological utility.
Understanding the biological mechanism of disease is critical for medicine, and in particular drug discovery. AI-powered analysis of genome-scale biological data holds great potential in this regard. The increasing availability of single-cell RNA sequencing data has enabled the development of large foundation models for disease biology. However, existing foundation models either do not improve or only modestly improve over task-specific models in downstream applications. Here, we explored two avenues for improving the state-of-the-art. First, we scaled the pre-training dataset to 116 million cells, which is larger than those used by previous models. Second, we leveraged the availability of large-scale biological annotations as a form of supervision during pre-training. We trained the TEDDY family of models comprising six transformer-based state-of-the-art single-cell foundation models with 70 million, 160 million, and 400 million parameters. We vetted our models on two downstream evaluation tasks: identifying the underlying disease state of held-out donors not seen during training, and distinguishing healthy cells from diseased ones for disease conditions and donors not seen during training. Scaling experiments showed that performance improved predictably with both data volume and parameter count. Our models showed substantial improvement over existing work on the first task and more muted improvements on the second.
Pathological diagnosis plays a critical role in clinical practice, where whole slide images (WSIs) are widely used. Recent deep learning approaches analyze WSIs in a two-stage paradigm: tile-level feature extraction followed by slide-level feature modeling. Current Transformer models have improved on the efficiency and accuracy of previous multiple-instance-learning-based approaches. However, three core limitations persist, as they do not: (1) robustly handle the variable scales of different slides, (2) effectively balance model complexity and data availability, or (3) balance training efficiency and inference performance. To address these limitations explicitly, we propose PathRWKV, a novel model for slide-level modeling. Its recurrent structure lets the model perceive a dynamic number of tiles during slide-level modeling, which enables prediction over all tiles at the inference stage. Moreover, we employ linear attention instead of conventional matrix-multiplication attention to reduce model complexity and overfitting. Lastly, we use multi-task learning to model multiple tasks simultaneously, improving training efficiency, and an asynchronous structure design to draw a conclusion from all tiles during inference, enhancing inference performance. Experimental results suggest that PathRWKV outperforms current state-of-the-art methods on various downstream tasks across multiple datasets. The code and datasets are publicly available.
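As a rough illustration of why linear attention lowers cost compared with matrix-multiplication attention, the sketch below uses a generic kernel-feature-map formulation in NumPy. It is not the PathRWKV or RWKV formulation and is not taken from the authors' code; the function names and the `elu(x) + 1` feature map are assumptions chosen for the example. The point it shows is that reordering the products avoids building the tiles-by-tiles weight matrix while giving the same result as the quadratic form with the same kernel.

```python
import numpy as np


def feature_map(x):
    # Assumed positive feature map: elu(x) + 1.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def linear_attention(q, k, v):
    # q, k: (n_tiles, d); v: (n_tiles, d_v). Cost grows linearly in n_tiles.
    q_f, k_f = feature_map(q), feature_map(k)
    kv = k_f.T @ v            # (d, d_v) summary of all keys and values
    z = k_f.sum(axis=0)       # (d,) normaliser summary
    return (q_f @ kv) / (q_f @ z)[:, None]


def quadratic_reference(q, k, v):
    # Same kernel, but materialises the (n_tiles, n_tiles) weight matrix.
    q_f, k_f = feature_map(q), feature_map(k)
    weights = q_f @ k_f.T
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v


rng = np.random.default_rng(0)
q = rng.standard_normal((128, 16))
k = rng.standard_normal((128, 16))
v = rng.standard_normal((128, 8))
out = linear_attention(q, k, v)
```

Both functions compute the same output; only the order of the matrix products differs, which is what removes the quadratic dependence on the number of tiles.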
Pathology image analysis plays a pivotal role in medical diagnosis, with deep learning techniques significantly advancing diagnostic accuracy and research. While numerous studies have been conducted to address specific pathological tasks, the lack of standardization in pre-processing methods and model/database architectures complicates fair comparisons across different approaches. This highlights the need for a unified pipeline and comprehensive benchmarks to enable consistent evaluation and accelerate research progress. In this paper, we present UnPuzzle, a novel and unified framework for pathological AI research that covers a broad range of pathology tasks with benchmark results. From high-level to low-level, upstream to downstream tasks, UnPuzzle offers a modular pipeline that encompasses data pre-processing, model composition, task configuration, and experiment conduction. Specifically, it facilitates efficient benchmarking for both Whole Slide Images (WSIs) and Region of Interest (ROI) tasks. Moreover, the framework supports various learning paradigms, including self-supervised learning, multi-task learning, and multi-modal learning, enabling comprehensive development of pathology AI models. Through extensive benchmarking across multiple datasets, we demonstrate the effectiveness of UnPuzzle in streamlining pathology AI research and promoting reproducibility. We envision UnPuzzle as a cornerstone for future advancements in pathology AI, providing a more accessible, transparent, and standardized approach to model evaluation. The UnPuzzle repository is publicly available at https://github.com/Puzzle-AI/UnPuzzle.
Groups of cells, including clusters of cancerous cells, multicellular organisms, and developing organs, may both grow and break apart. What physical factors control these fractures? In these processes, what sets the eventual size of clusters? We develop a framework for understanding cell clusters that can fragment due to cell motility using an active particle model. We compute analytically how the break rate of cell-cell junctions depends on cell speed, cell persistence, and cell-cell junction properties. Next, we find the cluster size distributions, which differ depending on whether all cells can divide or only the cells on the edge of the cluster divide. Cluster size distributions depend solely on the ratio of the break rate to the growth rate, allowing us to predict how cluster size and variability depend on cell motility and cell-cell mechanics. Our results suggest that organisms can achieve better size control when cell division is restricted to the cluster boundaries or when fracture can be localized to the cluster center. Our results link the general physics problem of a collective active escape over a barrier to size control, providing a quantitative measure of how motility can regulate organ or organism size.
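The claim that cluster sizes depend only on the ratio of break rate to growth rate can be illustrated with a toy stochastic simulation. The sketch below is not the authors' active particle model. It assumes a hypothetical linear chain in which every cell divides at rate `growth_rate`, every junction breaks at rate `break_rate`, and after a break one fragment is followed. The function name and all parameters are illustrative assumptions.

```python
import random


def mean_cluster_size(growth_rate, break_rate, n_events=20000, seed=0):
    """Time-averaged size of one followed cluster (toy Gillespie simulation)."""
    rng = random.Random(seed)
    size = 1
    total_time = 0.0
    weighted_size = 0.0
    for _ in range(n_events):
        divide = growth_rate * size          # every cell may divide
        fracture = break_rate * (size - 1)   # every junction may break
        total = divide + fracture
        dt = rng.expovariate(total)          # waiting time to the next event
        total_time += dt
        weighted_size += size * dt
        if rng.random() * total < divide:
            size += 1
        else:
            # A uniformly chosen junction breaks; keep the left fragment.
            size = rng.randint(1, size - 1)
    return weighted_size / total_time


same_ratio_a = mean_cluster_size(growth_rate=1.0, break_rate=0.1)
same_ratio_b = mean_cluster_size(growth_rate=2.0, break_rate=0.2)
higher_ratio = mean_cluster_size(growth_rate=1.0, break_rate=1.0)
```

In this toy model, doubling both rates only rescales time, so the two runs with the same ratio give the same mean size, while raising the break-to-growth ratio yields smaller clusters.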