Total: 1
To compare the performance of two speech generation sys- tems, one of the most effective approaches is estimating the preference score between their generated speech. This pa- per proposes a novel universal preference-score-based pairwise speech quality assessment (UPPSQA) model, aimed at predict- ing the preference score between paired speech samples to de- termine which one has better quality. The model first predicts the absolute mean opinion score (MOS) for the two speech sam- ples separately, and then aggregates them into a relative prefer- ence score using a preference function. To address the scarcity of preference data, we also construct a new pairwise speech dataset based on a MOS dataset for experiments. Experimental results confirm that, whether in training scenarios with differ- ent data types and label conditions, or in both in-domain and out-of-domain test scenarios, the prediction accuracy of UPP- SQA outperforms that of the baseline models, demonstrating its universality.