We explore language-agnostic deep text embeddings for classifying the severity of dysarthria in Amyotrophic Lateral Sclerosis (ALS). Speech recordings are transcribed both by humans and by automatic speech recognition (ASR), and embeddings of the resulting transcripts are used as features. Although speech recognition accuracy has been studied for grading dysarthria severity, no effort has yet been made to utilize text embeddings of the transcripts. We perform severity classification at different granularities (2, 3, and 5 classes) using data obtained from 47 ALS subjects. Experiments with dense neural network based classifiers suggest that, although text features perform nearly as well as baseline speech features, such as statistics of mel-frequency cepstral coefficients (MFCCs), for 2-class classification, speech features perform better as the number of classes increases. Concatenating text embeddings with MFCC statistics attains the best performance, with mean F1 scores of 88%, 68%, and 53% in 2-, 3-, and 5-class classification, respectively.
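As an illustration of the feature-level fusion described above, the following minimal sketch concatenates a transcript embedding with utterance-level MFCC statistics. It is not the authors' implementation: the choice of statistics (per-coefficient mean and standard deviation), the dimensions (13 MFCCs, a 768-dimensional embedding), the random placeholder data, and the helper names `mfcc_statistics` and `fuse_features` are all assumptions made for illustration.

```python
import numpy as np


def mfcc_statistics(mfcc_frames: np.ndarray) -> np.ndarray:
    """Summarize a (num_frames, num_coeffs) MFCC matrix as one fixed-length vector.

    Assumption: mean and standard deviation per coefficient; the abstract
    does not say which statistics are used.
    """
    return np.concatenate([mfcc_frames.mean(axis=0), mfcc_frames.std(axis=0)])


def fuse_features(text_embedding: np.ndarray, mfcc_frames: np.ndarray) -> np.ndarray:
    """Concatenate a transcript embedding with MFCC statistics (hypothetical helper)."""
    return np.concatenate([text_embedding, mfcc_statistics(mfcc_frames)])


# Random placeholders standing in for real features (assumed sizes).
rng = np.random.default_rng(0)
mfcc_frames = rng.normal(size=(200, 13))   # 200 frames, 13 coefficients
text_embedding = rng.normal(size=768)      # one embedding per transcript

fused = fuse_features(text_embedding, mfcc_frames)
print(fused.shape)  # (794,) = 768 embedding dims + 13 means + 13 stds
```

The fused vector would then serve as the input to a classifier such as a dense neural network; because both feature types are reduced to fixed-length vectors, utterances of different durations yield inputs of the same size.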