2410.10508

Everyday Speech in the Indian Subcontinent

Author: Utkarsh P

India has 1369 languages, of which 22 are official. About 13 different scripts are used to write these languages. A Common Label Set (CLS), based on phonetics, was developed to reduce the large vocabulary of units that the End-to-End (E2E) framework would otherwise require for multilingual synthesis. Indian language text is first converted to CLS. This approach enables seamless code-switching across 13 Indian languages and English in a given native speaker's voice, which corresponds to everyday speech in the Indian subcontinent, where the population is multilingual.
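
The text-to-CLS step described above can be illustrated with a minimal sketch. It assumes a character-level lookup table; the table `TOY_CLS`, its labels, and the function `to_toy_cls` are hypothetical and are not the actual CLS inventory, which is not given here. The sketch also ignores inherent vowels and other script-specific rules that a real converter would need.

```python
# Hypothetical toy table: a few characters from two scripts mapped to shared
# phonetic labels. These labels are illustrative, not the real CLS inventory.
TOY_CLS = {
    "\u0915": "k",   # Devanagari letter KA
    "\u0932": "l",   # Devanagari letter LA
    "\u093E": "aa",  # Devanagari vowel sign AA
    "\u0B95": "k",   # Tamil letter KA
    "\u0BB2": "l",   # Tamil letter LA
    "\u0BBE": "aa",  # Tamil vowel sign AA
}


def to_toy_cls(text):
    """Map each character to its shared label, skipping whitespace.

    Characters missing from the toy table raise KeyError.
    """
    labels = []
    for ch in text:
        if ch.isspace():
            continue
        labels.append(TOY_CLS[ch])
    return labels


# The same sound sequence written in Devanagari and in Tamil script.
devanagari_word = "\u0915\u093E\u0932\u093E"
tamil_word = "\u0B95\u0BBE\u0BB2\u0BBE"

print(to_toy_cls(devanagari_word))
print(to_toy_cls(tamil_word))
# A mixed-script input yields one label sequence drawn from the same unit set.
print(to_toy_cls(devanagari_word + " " + tamil_word))
```

Because both scripts map onto one label inventory in this sketch, the downstream model would see a single, smaller unit set regardless of the script of the input, which is the property the abstract relies on for code-switching.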

Subjects: Computation and Language, Sound, Audio and Speech Processing

Publish: 2024-10-14 13:48:36 UTC