rXEwxmnGQs@OpenReview

PhonATe: Impact of Type-Written Phonological Features of African American Language on Generative Language Modeling Tasks

Authors: Nicholas Deas ; Jessica A Grieser ; Xinmeng Hou ; Shana Kleiner ; Tajh Martin ; Sreya Nandanampati ; Desmond U. Patton ; Kathleen McKeown

Current large language models perform poorly on African American Language (AAL) texts in tasks like toxicity detection and sentiment analysis. AAL is underrepresented in both pre-training data and existing benchmarks for these tasks, which hinders thorough evaluation and understanding of these biases. We introduce a novel approach for synthetically adding type-written phonological features of AAL to text, a class of AAL features that has been overlooked in prior work. Our goal is to better understand how these features affect generative language models' performance on three tasks: toxicity detection, sentiment analysis, and masked span prediction. We find that fine-tuning with synthetic type-written phonological features lowers perceived biases on downstream tasks, and our ablations reveal which features have particularly large negative impacts on model performance. Our results suggest that phonological features are vital to consider when designing bias mitigation techniques.