Dialer Call Progress Detection | Auto Dialer

Issues related to this article:

There is a delay before the audio message is played
The system does not leave a message on the answering machine
The message on the answering machine is being cut off
No popup screen for live answer for some predictive calls

When an outbound call is made, Voicent auto dialer software has to detect whether the call is answered by a live human or by an answering machine (or voicemail). This process is called ‘call progress detection’, and it is based on analyzing the incoming audio. Traditionally, when computers were slow, detection was implemented in hardware, and the limitations of hardware implementations kept accuracy low. Even today, no system can achieve 100% accuracy. The best detector is human perception, and even humans make mistakes from time to time.

Voicent has spent a lot of development time improving the accuracy. Our detection algorithm analyzes the incoming audio and applies best-fit statistical patterns derived from a large amount of collected data. Since our products have been on the market for years, Voicent software is battle-tested in the real world and the accuracy is second to none.

The Most Important Factor Is Clarity Of Phone Audio

For call progress detection to be accurate, the most important factor is the clarity of the phone audio. If the phone audio is clear, the normal delay is about 1-2 seconds. To get better phone audio, you need:

1. A reliable phone service
2. Enough Internet connection bandwidth (128 Kbps both upstream and downstream per line)
3. A good connection between your computer and the Internet router. Avoid using a wireless connection if possible
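
The bandwidth requirement in item 2 scales with the number of simultaneous lines. As a rough sketch of that arithmetic, assuming the 128 Kbps per-line figure above applies independently in each direction (the function names here are illustrative and not part of any Voicent product):

```python
# Per-line figure taken from the list above: 128 Kbps in each direction.
KBPS_PER_LINE = 128


def required_bandwidth_kbps(lines: int) -> int:
    """Bandwidth needed in EACH direction for the given number of lines."""
    if lines < 0:
        raise ValueError("lines must be non-negative")
    return lines * KBPS_PER_LINE


def max_lines(upstream_kbps: float, downstream_kbps: float) -> int:
    """Number of lines the slower of the two directions can carry."""
    return int(min(upstream_kbps, downstream_kbps) // KBPS_PER_LINE)
```

For example, `required_bandwidth_kbps(4)` gives 512, and a connection with 1000 Kbps upstream and 5000 Kbps downstream is limited by its upstream side to `max_lines(1000, 5000)`, which is 7 lines. Other traffic sharing the connection reduces this further.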

A Useful Diagnostic Tool

You can use Record Initial Dialing to capture the audio stream used by the detection software. This audio is essentially what the system hears from the phone line.

What Else Can Affect the Call Progress Detection

When a phone call is answered, an auto dialer system needs to a) play the message immediately for a live human pickup, or b) wait until a beep is detected before playing it for an answering machine or voicemail system. The system decides whether to play or wait based on who answered. Unfortunately, no tone or signal on the phone network indicates who picked up. Hence the system has to analyze the audio stream on the phone line in order to make a decision.
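
The play-or-wait rule described above can be written down in a few lines. This is only an illustrative sketch: the `Party` type and `next_action` function are hypothetical names, and the hard part (deciding who answered and whether a beep was heard) is assumed to have been done elsewhere by the audio analysis.

```python
from enum import Enum


class Party(Enum):
    HUMAN = "human"
    MACHINE = "machine"  # answering machine or voicemail


def next_action(party: Party, beep_detected: bool) -> str:
    """Return "play" or "wait" for the current moment in the call."""
    if party is Party.HUMAN:
        return "play"  # live pickup: start the message right away
    # Machine pickup: hold the message until the beep has been heard.
    return "play" if beep_detected else "wait"
```

A wrong guess about the answering party leads directly to the symptoms listed at the top of this article: a machine mistaken for a human gets the message before the beep, and a human mistaken for a machine hears silence while the system waits.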

Strong background noise can be hard to filter out, especially when it comes from a human voice. A loud TV in the background, a second person talking to someone else, and so on can mislead the system into predicting an answering machine for a human pickup. On the other hand, a very quiet answering machine greeting can be mistaken for background noise, so the system might think it is a live human pickup and play the message too early.

Another factor is that people answer the phone differently. There are no fixed patterns, or even regular patterns, to follow. People from different countries and different ethnic groups answer the phone differently. The system is tuned for the North American population, where people usually answer with a short “hello”.
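
To see why a short “hello” matters, consider a common textbook heuristic for this kind of detection: a brief burst of speech followed by a pause suggests a person, while long continuous speech suggests a recorded greeting. The sketch below is not Voicent's algorithm, which is statistical and not described in this article; the function name, the frame length, and both time limits are assumptions chosen only for illustration.

```python
def classify_greeting(voiced, frame_ms=20, max_human_ms=1500, pause_ms=600):
    """Classify the start of a call from per-frame voice activity.

    voiced: sequence of bools, one per audio frame, True if the frame
            contains speech (voice activity detection is assumed done).
    Returns "human", "machine", or "undecided".
    """
    speech_ms = 0    # total speech heard so far
    silence_ms = 0   # length of the current pause after speech began
    started = False  # becomes True once any speech has been heard

    for is_voice in voiced:
        if is_voice:
            started = True
            speech_ms += frame_ms
            silence_ms = 0
            if speech_ms > max_human_ms:
                return "machine"  # greeting too long for a simple "hello"
        elif started:
            silence_ms += frame_ms
            if silence_ms >= pause_ms:
                return "human"    # short greeting, then waiting for a reply
    return "undecided"            # not enough audio yet, or no voice at all
```

A half-second “hello” followed by a pause comes back as `"human"`, and two seconds of uninterrupted speech as `"machine"`. A person who answers with a long sentence, or in a pattern the limits were not tuned for, is misclassified, which is the problem described above.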

Answering machine messages and voicemail prompts vary widely too. Phone company messages are also complicated: some announcements start with a disconnected beep sound, others with a different beep, and the rest have no beep at all.

You forget to say hello

The system is voice activated. If you do not say anything, the system will spend more time waiting for a human voice.

Special Situation Handling

To make the system more responsive to human pickup, something has to give, and that something is the accuracy of answering machine detection. If you care more about live human pickup and can tolerate more mistakes on answering machines, then you can make the algorithm more responsive to humans.

Make it most responsive for live human pickup

In the extreme case, you can instruct the system not to do any answering machine vs. human analysis. Whenever the system hears a voice, it starts playing the message right away. Of course, the system still needs a short time to recognize the human voice, filter out background noise, and so on. The drawback of this approach is that all answering machines will be treated as humans. The message will be played immediately after an answering machine answers the call, so no message, or only a partial message, will be left on answering machines.

To set this option, please select Setup > Options… from the gateway main menu, then choose the Detection tab. Move the sliding control bar to the position marked “Most aggressive”.

The biggest problem with this approach is that no answering machine will be recognized. For general usage, this setting is not recommended.

You can make the system more aggressive on human pickup without totally losing the ability to detect answering machines. To do so, set the prediction to “more aggressive” on humans. With this setting, the system will make a prediction as soon as the audio is likely to be from a human, but there will be more mistakes on answering machines.
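
One way to picture the trade-off behind these settings is as a confidence threshold: the more aggressive the setting, the less evidence the system needs before it commits to "human". The sketch below is a simplified model under that assumption. The numeric thresholds, the "default" label, and the `decide` function are hypothetical; the actual values behind the slider are not documented in this article.

```python
# Hypothetical thresholds on a 0..1 "likely human" score.
HUMAN_THRESHOLD = {
    "default": 0.8,
    "more aggressive": 0.6,
    "most aggressive": 0.0,  # any detected voice is treated as human
}
MACHINE_THRESHOLD = 0.2      # at or below this, call it a machine


def decide(setting, human_scores):
    """Return (decision, step) for a sequence of running human-likelihood
    scores, one per analysis step after a voice has been heard."""
    threshold = HUMAN_THRESHOLD[setting]
    for step, score in enumerate(human_scores):
        if score >= threshold:
            return "human", step    # commit as soon as the bar is met
        if score <= MACHINE_THRESHOLD:
            return "machine", step
    return "undecided", None


# Scores for a call that is in fact an answering machine (made-up data).
machine_call = [0.5, 0.62, 0.3, 0.1]
```

With the made-up `machine_call` scores, the "default" setting waits until step 3 and returns `"machine"`, while "more aggressive" commits to `"human"` at step 1 and "most aggressive" at step 0. The faster response is bought with a wrong answer on this call.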

This article is written for customers using a VoIP service (such as Callcentric, Skype, or PBX SIP extensions) to make calls. Customers using analog lines should take a look at older posts.