Answering Machine Detection Accuracy – Facts and Myths

I’ve discussed this topic so many times over the years that it reminds me of the stories I would tell my kids, only to have them say "yeah, Dad, we’ve heard this one before". But one thing is for sure: there is a lot of myth and misunderstanding in the world of outbound dialing around Answering Machine Detection (AMD) accuracy.

An automated outbound dialer (sometimes referred to as a "predictive dialer" or just a "dialer") works to keep agents busy by placing multiple calls in parallel and analyzing each call to figure out what, if anything, answered. Once a phone is answered, the two most common results are a live speaker and an answering machine. The dialer uses call analysis to listen to the audio (the same way a human does) and figure out whether a real person answered (e.g. "Hello…") or a machine answered (e.g. "We are not home right now…"). Detecting a real person is called Live Speaker Detection (LSD).

Typically, a person answers the phone with a greeting followed by silence, during which they wait for the caller to respond. In a traditional automated dialer, this post-greeting silence is the indicator that there is a person on the other end and that the call should be connected to an agent ASAP. If it takes too long to connect the call to an agent, the person may say hello again or just hang up, realizing that an automated system is calling. So there is a balancing act:

  • Decide early in the post-greeting silence that this is a person and route the call to an agent for a more rapid and natural greeting response. Risk: more answering machine messages that have gaps of silence in them are routed to agents, consuming agent time.
  • Wait longer during the post-greeting silence to make sure this is a person and not just a longer-than-normal pause between words in an answering machine message. Risk: more people hang up because it takes too long for the agent to respond to the person’s greeting.
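
To make that trade-off concrete, here is a minimal sketch of a silence-threshold decision. It is an illustration only, not how any particular dialer works: the `Segment` type, the `classify_call` function, and every duration below are assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    kind: str         # "speech" or "silence" (hypothetical labels)
    duration_ms: int  # length of this stretch of audio


def classify_call(segments, silence_threshold_ms):
    """Return (label, ms elapsed when the decision was made).

    The call is labeled "live" as soon as a silence that follows speech
    lasts at least silence_threshold_ms. If no silence is long enough,
    the call is labeled "machine" once the audio runs out.
    """
    elapsed_ms = 0
    heard_speech = False
    for seg in segments:
        if seg.kind == "speech":
            heard_speech = True
        elif heard_speech and seg.duration_ms >= silence_threshold_ms:
            return "live", elapsed_ms + silence_threshold_ms
        elapsed_ms += seg.duration_ms
    return "machine", elapsed_ms


# Made-up audio: a person says "Hello" and waits; a machine pauses mid-message.
person = [Segment("speech", 500), Segment("silence", 2000)]
machine = [Segment("speech", 1500), Segment("silence", 600), Segment("speech", 3000)]

for threshold in (500, 1000):
    print(threshold, classify_call(person, threshold), classify_call(machine, threshold))
```

With the short threshold the person is routed sooner, but the machine's 600 ms pause is also taken for a person. With the longer threshold the machine is caught, and the person waits an extra half second before an agent can respond.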

Myths (or, more accurately, misconceptions) around AMD accuracy are typically variations on the following:

  1. Dialer vendor ABC claims to have 98% AMD accuracy.  Can you be that accurate?
  2. Call center management wants NO answering machines going to the floor… only live speakers.  How can we do that?

In a perfect world, every machine sounds like a machine ("We’re not home right now, so please leave a message…") and every person sounds like a person ("Hello <pause>" with no background noise). And in a perfect world, every dialer can achieve 100% AMD accuracy – in other words, only live speakers are routed to agents and only machines are disconnected (or played a message).

But, in case you haven’t noticed, this is not a perfect world. Have you ever started carrying on a conversation with an answering machine whose message began "Hello?! <pause> We aren’t home right now…"? These especially bug me when I sense a grin in the person’s voice… "gotcha". In our test bed of 22,000 recordings from around the world, which we use to test and train our own call analysis, we throw a bunch of imperfect-world curve balls at our system, expecting it to mis-detect them (though recent advancements we have made are starting to catch even these red herrings – but that’s for another blog post).

Detecting an answering machine as a person means that the answering machine goes to an agent. That’s not quite as bad as the inverse: detecting a live person as an answering machine and hanging up on them. If you have ever called a list of friends to invite them to a party or give them a message, you have probably experienced something similar to what an automated dialer sees: more answering machines than live people. It’s not uncommon for a dialer to get 5 or 6 machines for each live speaker. So live speakers are typically less common than answering machines.

Imagine that a dialer places 100 calls, detects 98 as machines and hangs up on them, and detects 2 as people and routes them to agents, but the agents realize these are machines. So the dialer appears to have been right on 98% of the answering machines – it only got 2 wrong – i.e. 98% AMD accuracy… or so it seems. But what if 10 of those 98 "machines" detected by the dialer were really live people who just sounded to the dialer like machines? Then only 88 of the 100 calls were classified correctly, so the accuracy is really 88%. The dialer only routed 2 machines to agents, but it hung up on 10 real people instead of routing them to agents! As we noted above, live speakers are less common than machines, and live speakers are the ones that buy things, promise to pay back debts, contribute funds to causes, etc. So this apparent 98% AMD accuracy may not be so great after all.
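
The arithmetic in that example can be checked with a few lines of code. This is only a sketch: the function name `accuracy_report` and the per-class breakdown are my own labels for illustration, not a standard definition of AMD or LSD accuracy.

```python
def accuracy_report(machines_dropped, machines_to_agents,
                    people_dropped, people_to_agents):
    """Summarize how the dialer's decisions compare with what really answered."""
    total = machines_dropped + machines_to_agents + people_dropped + people_to_agents
    actual_machines = machines_dropped + machines_to_agents
    actual_people = people_dropped + people_to_agents
    return {
        # Share of all calls handled correctly.
        "overall": (machines_dropped + people_to_agents) / total,
        # Share of real machines that were detected as machines.
        "machine_detection": machines_dropped / actual_machines,
        # Share of real people routed to agents (None if no people answered).
        "live_speaker_detection": (
            people_to_agents / actual_people if actual_people else None
        ),
    }


# What the dialer appears to have done: 98 machines dropped, 2 sent to agents.
apparent = accuracy_report(98, 2, 0, 0)

# What really happened if 10 of the 98 "machines" were live people.
actual = accuracy_report(88, 2, 10, 0)

print(apparent)
print(actual)
```

The second report shows the overall figure falling from 0.98 to 0.88. It also shows something the single number hides: in this example every one of the 10 live people was hung up on, so the share of real people who reached an agent is zero.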

What we have found over the years is that there is a balancing point between AMD accuracy and LSD accuracy. AMD accuracy in the 60-70% range means that too many machines are getting routed to agents and wasting their time. As we saw above, 98% accuracy may mean that we are hanging up on live people in order to save agents from dealing with machines. The better the dialer’s call analysis, the higher that balancing point will be – we see customers with that AMD accuracy balancing point in the range of 85 to 93% or higher.

To determine this balancing point for your outbound contact center, don’t get caught up on the AMD accuracy number itself. Instead, adjust AMD accuracy up and down and monitor the results. Focus on maximizing your sales, donations, promises to pay, etc. in a given timeframe. Don’t get stuck on the number of answering machines going to agents – sometimes a few more machines going to agents means a few more people going to them also. And remember, people buy, people donate, people promise to pay… most machines that I know of do not.
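
As a rough sketch of that kind of comparison, the snippet below picks the setting with the most business results per agent hour. The setting names and every number are placeholders made up for illustration, not measurements from any contact center, and `best_setting` is a hypothetical helper.

```python
def best_setting(trials):
    """Return the trial with the highest conversions per agent hour."""
    return max(trials, key=lambda t: t["conversions"] / t["agent_hours"])


# Placeholder results from running each hypothetical AMD setting for a while.
trials = [
    {"setting": "strict", "agent_hours": 40, "conversions": 52, "machines_to_agents": 30},
    {"setting": "balanced", "agent_hours": 40, "conversions": 61, "machines_to_agents": 75},
    {"setting": "lenient", "agent_hours": 40, "conversions": 55, "machines_to_agents": 160},
]

winner = best_setting(trials)
print(winner["setting"], winner["conversions"] / winner["agent_hours"])
```

Note that the count of machines reaching agents is recorded but deliberately plays no part in the choice. In these placeholder numbers, the winning setting sends more machines to the floor than the strictest one and still produces more results.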

Matt Taylor