Somatic structural variations (somatic SVs) are critical genomic alterations with significant implications for cancer genomics. Although long-read sequencing (LRS) in principle offers the best resolution for detecting these variants because its reads can span large genomic segments, current LRS-based methods are derived from short-read somatic SV detection algorithms and rely mainly on split-read information. The high error rate of long-read sequencing, together with the errors introduced by the seed-and-chain strategy of mainstream alignment algorithms, reduces the accuracy of these split reads, so precise detection of somatic SVs remains a challenge. To address this issue, we propose TDScope, an algorithm that uses the complete sequence information of local genomic regions provided by long reads to construct a local graph genome and combines it with a random forest model to detect somatic SVs precisely. TDScope outperforms state-of-the-art somatic SV detection methods on paired long-read whole-genome sequencing (WGS) benchmark cell lines, with an average F1-score improvement of 20%. It also shows superior performance in detecting somatic SVs and resolving heterogeneous genomes on simulated datasets of tandem-repeat-like somatic SVs. We also provide ScopeVIZ, a tool that gives users visual evidence of the local graph genomes and somatic SV sequences. All code is publicly available on GitHub (https://github.com/Goatofmountain/TDScope).