
A significant percentage of patients speak a dialect other than Standard English, yet these dialects are underrepresented in datasets used for medical natural language processing. In this study, dialect-specific automatic speech recognition models were trained on African American Vernacular English and Standard English utterances. Using the dialect-specific models improved the accuracy of transcription for both dialects.

Learning Objectives:
1. Understand the biases in common speech corpora towards Standard American English
2. Learn how to improve the performance of patient-centered natural language applications for other dialects


Rachel Dorn (Presenter)
Virginia Commonwealth University

Scott Vrana, Virginia Commonwealth University
Bridget McInnes, Virginia Commonwealth University

