Can AI Understand “Park the Car in Harvard Yard”?
As a Bostonian, I often hear people trying to imitate our native accent with the phrase, “Pahk tha cah in Hahvahd Yahd.” In fact, dropping the “r” sound this way is the mark of a true Bostonian.
In a place like Dallas, by contrast, the “r” sounds are emphasized much more – a Texan might say it like “Har-verd Yar-duh.” It’s important for speech recognition systems in healthcare networks to understand accents in order to give patients the best experience… but to what extent will AI keep up?
Automatic speech recognition systems (ASRs) are only as successful as what they’re trained to understand, and most aren’t trained to recognize accents beyond Standard American English. The number of accents across the US is too high to count, and the problem becomes even harder when callers say proper names. For example, when calling to speak to “Doctor Andrea Kumar,” a Bostonian might ask for “Dahktah Ahndrayah Koomah,” while a Texan might say “Doctur Andreeuh Kummer.” Most ASR systems are poor at recognizing proper names because of a lack of training data.
While these systems are trained on thousands of hours of audio data, the frequency with which any particular proper name appears in that data is extremely low by comparison to the rest of the language.
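To make the problem concrete, here is a minimal sketch of one accent-tolerant matching technique: reducing both the caller’s transcribed request and each directory entry to a coarse, Soundex-style phonetic key that deliberately ignores the letter “r,” so that rhotic and non-rhotic pronunciations collapse to the same code. The provider directory, the consonant classes, and the function names below are illustrative assumptions, not a description of any particular production system.

```python
# A minimal sketch of accent-tolerant name matching (illustrative only).

PROVIDER_DIRECTORY = ["Doctor Andrea Kumar", "Doctor Brian Walsh"]

# Soundex-style consonant classes; "r" is deliberately left out so that
# non-rhotic pronunciations ("Koomah") and rhotic ones ("Kummer") reduce
# to the same key.
CONSONANT_CLASSES = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
}


def accent_key(text: str) -> str:
    """Reduce a spoken or written name to a coarse phonetic key."""
    digits = [CONSONANT_CLASSES[ch] for ch in text.lower() if ch in CONSONANT_CLASSES]
    # Collapse repeated classes so the double "m" in "Kummer" counts once.
    collapsed = [d for i, d in enumerate(digits) if i == 0 or d != digits[i - 1]]
    return "".join(collapsed)


def match_caller_request(transcript: str) -> str | None:
    """Return the directory entry whose phonetic key matches the transcript."""
    wanted = accent_key(transcript)
    for name in PROVIDER_DIRECTORY:
        if accent_key(name) == wanted:
            return name
    return None


# Both the Boston-accented and the Texas-accented transcripts resolve to the
# same provider, even though a literal string comparison would fail.
print(match_caller_request("Dahktah Ahndrayah Koomah"))  # Doctor Andrea Kumar
print(match_caller_request("Doctur Andreeuh Kummer"))    # Doctor Andrea Kumar
```

In this sketch, “Dahktah Ahndrayah Koomah” and “Doctur Andreeuh Kummer” both reduce to the same key as “Doctor Andrea Kumar” – the kind of normalization a real system must perform far more robustly, across many accents and millions of names.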
There are at least 3 million different surnames in the US, and the potential pronunciation variations of each name make recognition even more challenging. The stakes are high – health systems lose money when phone systems can’t understand what callers are saying. Time is wasted, people end up frustrated, and operators become overwhelmed with calls.
Speech recognition systems that can’t recognize accents prevent the business optimization that AI is meant to create. They don’t decrease call volumes to operators, and they don’t ease operational burdens for patient access centers.
When conversational AI technology is not surrounded by tools and applications that improve recognition of local pronunciations, ROI plummets.
Health systems must confront the challenges that accents pose, especially when dealing with proper names. This means implementing intelligent speech solutions that account for the many different accents in regional calling communities. Parlance employs a variety of techniques that produce superior name recognition at over a thousand hospitals and clinics across the nation.
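As a simplified illustration of one generic approach (the variant rule and name list below are assumptions for demonstration only, not a description of Parlance’s proprietary tooling), a directory of provider names can be expanded with regionally common pronunciations – for example, non-rhotic “r”-dropped variants for a Boston-area calling community – before being supplied to a recognizer as hints:

```python
import re

# Illustrative sketch: broaden a provider name list with non-rhotic
# ("r"-dropped) spellings so the speech front end has hints for
# pronunciations common in a regional calling community.
def non_rhotic_variant(name: str) -> str:
    # Replace "r" with "h" when it follows a vowel and does not precede one,
    # roughly mimicking "Kumar" -> "Kumah" and "Doctor" -> "Doctoh".
    return re.sub(r"(?<=[aeiou])r(?![aeiou])", "h", name, flags=re.IGNORECASE)


directory = ["Doctor Andrea Kumar", "Doctor Brian Walsh"]
hints = directory + [non_rhotic_variant(name) for name in directory]
print(hints)
# ['Doctor Andrea Kumar', 'Doctor Brian Walsh',
#  'Doctoh Andrea Kumah', 'Doctoh Brian Walsh']
```

Rules like this only scratch the surface – real calling communities mix many accents, dialects, and naming conventions.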
To get maximum ROI from any speech-driven solution, it’s important to remember that improving performance requires more than just technology.
That’s why Parlance delivers every engagement as a fully managed service, continuously learning and adapting to each health system and its callers – including their accents.
Implementing voice-driven AI solutions that can better understand regional accents, dialects, and pronunciations requires both experience and adaptability. For over 25 years, Parlance has been leading the way, creating new technology and setting trends that keep us at the forefront. Our proprietary applications and time-tested tools have evolved through many generations of speech technology. We ensure that it’s easy for callers to access the people, departments, and resources they need. We know what it takes to meet callers where they are, rather than at the limits of a given ASR engine’s capability.
At Parlance, we have always believed that callers deserve friction-free, voice-driven access to the right resources inside large organizations. This mission drives our continuous innovation and has guided us to pioneer the modernization of caller experiences.
By Annmarie Block