In my last blog post, Understanding Speech Recognition Limitations, I shared some inherent limitations of Automatic Speech Recognition (ASR). Those limitations are real, but in this post I’d like to focus on how companies can maximize the performance of ASR. The following are two best practices that I’ve found to be especially impactful.
Best Practice #1: Include usability testing in your project plan. Once you feel your application is tested and ready for production, ask people outside the project team to test it and provide feedback. Better still, involve your customers if you can. Provide them with scenarios and test data. Solicit their feedback on areas such as simplicity, pace, clarity of self-direction, and recognition (they may respond with something you didn’t anticipate). The feedback you receive from usability testing is extremely valuable, and it beats learning about caller challenges after your application is in production.
Best Practice #2: Introduce ongoing application tuning. Before speech applications, touch-tone (DTMF, or dual-tone multi-frequency) interactive voice response (IVR) applications were largely designed, developed, and implemented just once. Once trained, an electronic agent needed little to no follow-up training. Not so in a speech environment.
Speech applications require periodic tuning and analysis. Initial tunings will yield numerous opportunities to improve performance, and subsequent tunings will continue to produce more. I believe two to three tunings in the first year of a new speech implementation are appropriate. For more mature applications, one to two per year will suffice.
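To make the idea of a tuning pass a little more concrete, here is a minimal Python sketch of one kind of analysis it might include: tallying recognition outcomes per prompt and flagging the prompts where callers are most often not understood. The log format, the prompt names, the outcome labels ("recognized", "no_match", "no_input"), and the 25% threshold are all assumptions made for illustration. Your platform’s logs and sensible thresholds will differ.

```python
from collections import Counter, defaultdict


def prompt_failure_rates(log_entries):
    """Return {prompt: fraction of attempts that were not recognized}.

    log_entries is an iterable of (prompt_name, outcome) pairs. Any outcome
    other than "recognized" is counted as a failure (hypothetical labels).
    """
    outcomes = defaultdict(Counter)
    for prompt, outcome in log_entries:
        outcomes[prompt][outcome] += 1

    rates = {}
    for prompt, counts in outcomes.items():
        total = sum(counts.values())
        failures = total - counts["recognized"]
        rates[prompt] = failures / total
    return rates


def prompts_to_tune(log_entries, threshold=0.25):
    """List prompts whose failure rate exceeds the threshold, worst first."""
    rates = prompt_failure_rates(log_entries)
    flagged = [prompt for prompt, rate in rates.items() if rate > threshold]
    return sorted(flagged, key=lambda prompt: rates[prompt], reverse=True)


# Made-up sample data standing in for a real recognition log.
sample_log = [
    ("main_menu", "recognized"),
    ("main_menu", "no_match"),
    ("main_menu", "no_match"),
    ("main_menu", "recognized"),
    ("account_number", "recognized"),
    ("account_number", "recognized"),
    ("account_number", "no_input"),
    ("account_number", "recognized"),
]

for prompt in prompts_to_tune(sample_log):
    print(prompt, prompt_failure_rates(sample_log)[prompt])
```

A report like this only tells you where to look. The tuning itself still means listening to what callers actually said at the flagged prompts and then adjusting the prompts or grammars accordingly.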
Done well, a speech application can improve the caller experience while delivering savings to your bottom line. I invite you to comment below and/or share your own best practices for deploying speech recognition.