2606.11276

Total: 1

#1 A mathematical framework for centromere-aware evaluation of human genome assemblies [PDF] [Copy] [Kimi] [REL]

Authors: Luca Franco, Matteo Migliarini, Matteo Tommaso Ungaro, Egnald Çela, Luca Corda, Andreas Giannis, Ester Mondelli, Fabio Galasso, Simona Giunta

Accurate evaluation of genome assemblies within highly repetitive regions, such as centromeres, remains a major open challenge in genomics. Conventional benchmarking relies on sequence alignment, which becomes problematic in regions of high homogeneity and divergence. Here, we framed centromere assembly evaluation as a comparative distribution problem in a compact centeny representation by computing genomic distances between functional motifs, rather than relying on nucleotide sequence. Our distribution-based metric assesses agreement between a query and a target chromosome by comparing their centromeric inter-motif distances rendered by KL divergence. When applied genome-wide to currently available human telomere-to-telomere (T2T) genomes, this approach yields an accuracy ranking for the entire assembly and for each individual chromosome. Altogether, we present a rapid and robust scoring system based on genomes numerical rendering of inter-motif distances, that provides a quantitative standard of assembly integrity in repetitive DNA regions and establishes a bona fide framework for chromosome-level genome-to-genome comparison.

Subject: Genomics

Publish: 2026-06-09 10:42:49 UTC