But… Can Your Conversational AI Solution Recognize a Proper Name?
As advanced as most automatic speech recognition (ASR) systems are, they still have difficulty recognizing proper names. The sheer volume of names in any given language presents an issue from the outset. The average English speaker has a vocabulary of roughly 20,000 to 35,000 words, yet there are more than 150,000 surnames in common usage in the U.S. alone, and nearly a million names in total. Proper names thus outnumber the words in a typical vocabulary by roughly ten to one. So, while today’s ASR systems are trained on hundreds of thousands of hours of audio, any individual proper name appears far less often in that audio than common words do.
All of these factors (sparse training data for names, training biases, and the sheer variety of names) can lead to poor performance of an AI solution. For example, if you said, “Wally Shaw, please,” the system might transcribe this as “Wallace Shawn, please” and then act on it, potentially compromising protected health information (PHI). Similarly, doctors’ names can be misinterpreted or confused (recognizing “Mike Daly” as “my daily,” for example), leaving callers frustrated when the system fails to understand them or routes them to an operator queue.
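To see why pairs like these are so easy to confuse, a rough text-similarity check is illustrative. The sketch below uses Python’s standard-library `difflib.SequenceMatcher` as a crude proxy; real ASR confusions happen at the acoustic and phonetic level, which this does not model.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Rough string-similarity score (0.0 to 1.0) between two phrases.
    A crude stand-in for acoustic confusability."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# The confusion pairs mentioned in the article.
pairs = [("Wally Shaw", "Wallace Shawn"), ("Mike Daly", "my daily")]
for spoken, transcribed in pairs:
    print(f"{spoken!r} vs {transcribed!r}: {similarity(spoken, transcribed):.2f}")
```

Both pairs score well above chance similarity, which hints at why a recognizer biased toward common vocabulary falls back on the more frequent phrase instead of the rarer proper name.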
In healthcare, it is essential that ASR systems minimize these kinds of mistakes, but the speech recognition systems used by most healthcare institutions still regularly make them.
Parlance has spent over 25 years building a set of proprietary tools and technologies that improve the performance of ASR and NLP systems and eliminate these problems.
By Will Sadkin