chi24@interspeech_2024@ISCA

#1 Characterizing code-switching: Applying Linguistic Principles for Metric Assessment and Development

Authors: Jie Chi ; Electra Wallington ; Peter Bell

As handling code-switching becomes an increasingly important topic in speech technology, driven by the expansion of low-resource and multilingual methodologies, it is vital that we recognize the diversity of code-switching as a phenomenon. We propose a framework that leverages linguistic findings as makeshift ground truths to assess the quality and sufficiency of existing metrics designed to capture datasets' differing code-switching styles. We also introduce a new metric, T-index, which leverages machine translation systems to capture properties of code-switched words in relation to the participating language pair. Through analysis of diverse Hindi-English and Mandarin-English datasets, we systematically explore how well these metrics align with linguistic intuition regarding code-switching richness levels in conversational versus technical domains.