What is Speech Analytics? Call Center and Business Applications

What is Speech Analytics?

Speech analytics is at the forefront of the corporate push to make intelligence gained from Big Data not only valuable but actionable in real time. Speech analytics turns voice data and interaction trends into meaningful information that helps companies improve services, reduce costs, and grow revenue in their contact centers and other business areas.

Originally called audio-mining, in which audio files were converted to text to enable searches for specific words or phrases, speech analytics now involves in-depth, phonetics-based searches, along with the ability to detect certain emotions expressed on a phone call and to identify trends within a call, such as hold times, silent patches, or agents talking over a caller. Old audio-mining techniques offered matching accuracy rates of around 50 percent. Current speech analytics technology boasts accuracy rates of 80 to 90 percent or higher. With accuracy improved, speech analytics vendors have been working to improve the speed at which results are delivered, and intelligence can now be provided to business decision makers in near real time. As a result of these improvements in technology and capabilities, speech analytics is beginning to mature but is still in the early adoption phase within the call center market.

The Market for Speech Analytics

The speech analytics market has been growing at a strong clip since it came on the scene in 2004. Barriers to growth have included low customer awareness and a lack of understanding about quantifiable return on investment (ROI). Speech analytics vendors now are targeting different segments, including small- to mid-market centers and are providing more customized solutions to boost adoption rates. In addition, hosted or SaaS solutions can significantly reduce the initial investment and financial commitment to deploying a speech analytics product.

Significant growth is expected throughout the near future. In fact, speech analytics implementations in contact centers have increased from 200 in 2005 to 1,200 in 2007 to 3,600 in 2012 while the number of seats has grown from 176,825 in 2006 to 2.3 million through 2013. Growth of 18 percent is expected from 2014 through 2016. The advent of real-time speech analytics solutions is helping to drive this growth, especially in the healthcare and account collections segments.
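The implementation counts quoted above imply quite different growth rates in different periods. As a rough illustration, the sketch below applies the standard compound annual growth rate formula to those counts. The function name `cagr` and the variable names are illustrative only, and the calculation assumes the quoted figures are year-end totals.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.5 means 50 percent per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Contact center implementation counts quoted in the text (assumed year-end totals).
implementations = {2005: 200, 2007: 1200, 2012: 3600}

early = cagr(implementations[2005], implementations[2007], 2007 - 2005)
later = cagr(implementations[2007], implementations[2012], 2012 - 2007)
overall = cagr(implementations[2005], implementations[2012], 2012 - 2005)

print(f"2005-2007: {early:.0%} per year")
print(f"2007-2012: {later:.0%} per year")
print(f"2005-2012: {overall:.0%} per year")
```

On these figures, growth from 2007 to 2012 works out to roughly a quarter per year, far slower than the burst between 2005 and 2007, which is consistent with a market that is maturing but still early in adoption.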

Despite these market developments, the flagship segment for these systems continues to be the security, law enforcement, and intelligence gathering communities. In the traditional call center market, only 3.6 percent of all call center seats have the benefit of speech analytics packages. But the ability of speech analytics to convert unstructured data - which represents about 90 percent of all enterprise data - to structured metadata, along with its capability to work throughout the organization, makes speech analytics a viable investment.

With the ability to increase revenue and customer loyalty and to provide direct and relevant feedback to other areas of the enterprise - coupled with the current boom in the Big Data segment - investments in speech analytics are expected to expand by nearly 30 percent in each of the next three years. Now that service providers are offering real-time speech analytics solutions, company interest is increasing because these solutions can influence the outcome of a customer interaction while it is still in progress. As with any technology implementation, however, caveats exist. Processes, good management practices, and other technologies must already be in place to ensure successful deployment.

Some firms that have implemented speech analytics specifically in the contact center are touting return on investment within seven to nine months; however, those organizations that take their time to plan for their speech analytics solution to impact the whole enterprise are boasting investment returns within four months. If that is indeed true, it bodes well for the industry. Early adopters, however, tend to put a lot of resources into ensuring that their investment is beneficial by applying it to gain an advantage over their competition. Later adopters are often more lax in how they train managers and supervisors to apply the technology, and may even neglect to commit dedicated resources to leverage the technology to improve processes and quality delivery.

History of Speech Analytics

Originally, speech analytics was used by government organizations to track the use of key words or phrases to help identify security risks or threats by individuals or entities under surveillance. The earliest versions of this technology were quite simple and known as audio-mining or word spotting. Audio-mining applications indexed the speech from an audio or video file by processing it through a large vocabulary recognizer and by converting it into searchable text files. The words or key phrases were predefined, and an operator was notified only if a match existed. Accuracy rates were generally less than 50 percent. Accuracy rates using speech-to-text systems have increased dramatically in recent years, however.
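The word-spotting step described above can be sketched in a few lines, assuming the audio has already been converted to text by a recognizer. The function name `spot_keywords`, the sample transcript, and the watch list are hypothetical and are not taken from any vendor's product.

```python
import re
from typing import Dict, List


def spot_keywords(transcript: str, phrases: List[str]) -> Dict[str, int]:
    """Count whole-word, case-insensitive occurrences of each predefined phrase."""
    hits = {}
    for phrase in phrases:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        count = len(re.findall(pattern, transcript, flags=re.IGNORECASE))
        if count:
            hits[phrase] = count
    return hits


# Hypothetical recognizer output and predefined watch list.
transcript = "I would like to cancel my reservation. Can you cancel it today?"
watch_list = ["cancel", "complaint", "cancel my reservation"]

matches = spot_keywords(transcript, watch_list)
if matches:
    # Mirrors the behavior described above: the operator hears only about matches.
    print("notify operator:", matches)
```

Because this approach needs a letter-for-letter match against the recognizer's text, any word the recognizer gets wrong is missed, which is one reason early accuracy rates were low.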

As technology improved, organizations demanded search capabilities based on phonetics to improve audio-mining accuracy. With this phonetics-based method, an index of phonetic content, as opposed to the word content requiring letter-for-letter matches, is created. As a result, the search has only to match speech that sounds like the predefined key words or phrases. Phonetic searches offer the flexibility of mining words, phrases, or proper names that are not already listed in the dictionary database. Phonetics-based audio mining tends to deliver results with accuracy from 80 percent up to 98 percent.
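To illustrate the idea of matching on sound rather than spelling, the sketch below uses the classic Soundex code as a stand-in. This is a deliberate simplification: commercial phonetic engines index phoneme sequences taken directly from the audio, and the source does not describe their algorithms. Soundex works on spelled words, so it only approximates the concept, and the function names here are illustrative.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Standard Soundex consonant groups; vowels, H, W, and Y carry no code.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _letter in _letters:
        _CODES[_letter] = _digit


def soundex(word: str) -> str:
    """Return the four-character Soundex code for a word ('' if it has no letters)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0]
    previous = _CODES.get(letters[0], "")
    for c in letters[1:]:
        code = _CODES.get(c, "")
        if code and code != previous:
            encoded += code
        if c not in "HW":  # H and W do not separate letters that share a code
            previous = code
    return (encoded + "000")[:4]


def build_phonetic_index(transcript: str) -> Dict[str, List[Tuple[int, str]]]:
    """Map each sound code to the (word position, word) pairs that produce it."""
    index = defaultdict(list)
    for position, word in enumerate(re.findall(r"[A-Za-z']+", transcript)):
        index[soundex(word)].append((position, word))
    return index


def phonetic_search(index: Dict[str, List[Tuple[int, str]]], term: str):
    """Return every indexed word that sounds like the search term."""
    return index.get(soundex(term), [])


index = build_phonetic_index("Please transfer me to Mr. Smyth in billing")
print(phonetic_search(index, "Smith"))  # finds "Smyth" despite the spelling
```

The search term never has to appear in a dictionary or match the transcript letter for letter, which is the flexibility the paragraph above attributes to phonetic search, including for proper names.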

In the past several years, call centers have become interested in these technologies, although about 43 percent of organizations do not yet know what speech analytics really is or have any idea how it can benefit their business. Many centers have struggled to develop consistent, robust monitoring processes that include formal feedback and coaching sessions with agents. One of the most common reasons for poor performance in this area is that the demand on supervisor time is too great in other operational areas. Some call center managers began to view audio-mining as a way to better use the limited time of their supervisors by having the technology identify which calls should be monitored, such as calls in which a reservation is booked, a complaint is lodged, or a cancellation is requested. Until recently, however, these solutions had not taken hold in the call center market. Enter speech analytics.

Evolution of Speech Analytics Technologies

Speech analytics includes the audio-mining technologies described above but typically refers to a broader range of speech products, such as speaker identification, emotion detection, and talk analysis. Speaker identification in combination with audio-mining highlights specific items call center managers are trying to focus on, such as anytime a customer tries to cancel a reservation, close an account, or file a complaint; when an agent does not offer a greeting or close to the call; or when an agent neglects to cite required phrases for legal compliance purposes.
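Combining speaker identification with phrase matching can be sketched as a simple script-compliance check. The example assumes each turn of the call has already been labeled with its speaker and transcribed. The function name, the speaker labels, and the required phrases are all hypothetical.

```python
from typing import Dict, List, Tuple

Turn = Tuple[str, str]  # (speaker label, transcribed text)


def missing_agent_phrases(turns: List[Turn],
                          required: Dict[str, List[str]]) -> List[str]:
    """Return required items for which the agent used none of the accepted phrasings."""
    agent_text = " ".join(text.lower() for speaker, text in turns
                          if speaker == "agent")
    missing = []
    for item, phrasings in required.items():
        if not any(p.lower() in agent_text for p in phrasings):
            missing.append(item)
    return missing


# Hypothetical call and compliance checklist.
call = [
    ("agent", "Thank you for calling, how can I help you today?"),
    ("customer", "I want to close my account."),
    ("agent", "I can help with that. Is there anything else?"),
    ("customer", "No. This call may be recorded, right?"),
]
required = {
    "greeting": ["thank you for calling"],
    "recording disclosure": ["this call may be recorded"],
    "closing": ["have a great day", "thank you for your business"],
}

print(missing_agent_phrases(call, required))
```

In this sample the customer, not the agent, mentions that the call may be recorded, so the disclosure is still reported as missing. That is the distinction speaker identification adds to plain keyword matching.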

Emotion detection can alert a supervisor or manager if a customer begins to get upset or agitated with the agent. Talk analysis can identify patterns within calls, such as long hold times or periods of silence, as well as the frequency of an agent cutting off a caller. Speech analytics can be used to research positive trends as well, such as when an agent presents a new program effectively or a caller is thrilled with the service they received. For example, Federal Express launched a program to identify instances in which customers provided a "wow" response to the service they received.
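Talk analysis of this kind can be approximated from speaker-labeled time segments alone. The sketch below assumes a hypothetical upstream step has already produced (speaker, start, end) segments in seconds. The `Segment` type, the function names, and the 10-second threshold are illustrative choices, not vendor definitions.

```python
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    speaker: str
    start: float  # seconds from the beginning of the call
    end: float


def long_silences(segments: List[Segment],
                  min_gap: float) -> List[Tuple[float, float]]:
    """Return (start, end) gaps of at least min_gap seconds in which nobody speaks."""
    gaps = []
    latest_end = None
    for seg in sorted(segments, key=lambda s: s.start):
        if latest_end is not None and seg.start - latest_end >= min_gap:
            gaps.append((latest_end, seg.start))
        latest_end = seg.end if latest_end is None else max(latest_end, seg.end)
    return gaps


def agent_interruptions(segments: List[Segment]) -> int:
    """Count agent segments that begin while a customer segment is still in progress."""
    customers = [s for s in segments if s.speaker == "customer"]
    count = 0
    for seg in segments:
        if seg.speaker == "agent" and any(c.start < seg.start < c.end
                                          for c in customers):
            count += 1
    return count


call = [
    Segment("agent", 0.0, 4.0),
    Segment("customer", 4.5, 12.0),
    Segment("agent", 10.0, 15.0),     # starts before the customer has finished
    Segment("customer", 40.0, 45.0),  # follows 25 seconds of silence
]

print(long_silences(call, min_gap=10.0))
print(agent_interruptions(call))
```

Counts like these can then be aggregated per agent or per call type to surface the patterns described above.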

The use of emotion detection in speech analytics tools has been extended to deliver analysis to agents in real time. This new capability enables agents, supervisors, and quality specialists to get a live analysis of the choice of words or phrases the customer uses on a call, alerting them to a growing client sense of irritation, desperation, anger, and other emotions.

Traditional speech analytics solutions allow organizations to search for calls by keyword, phrase, or business category, helping users find relevant conversations quickly to determine the underlying causes of rising call volumes, costs, and customer dissatisfaction. A new generation of speech analytics can now help automatically identify changes in customer behavior using technology such as Customer Behavior Indicators. These next-generation solutions can proactively index every single word and phrase and can create a baseline of all dialogs that occur within customer interactions. This capability automatically surfaces the increases/decreases in the use of terms and phrases that may reflect a new potential trend, without the need to predefine terms in advance.
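The baseline comparison described above can be illustrated with a simple frequency check that needs no predefined term list. This is only a sketch of the general idea and does not represent how Customer Behavior Indicators or any other product works. The function names, thresholds, and sample transcripts are assumptions.

```python
import re
from collections import Counter
from typing import Dict, List


def term_rates(transcripts: List[str]) -> Dict[str, float]:
    """Fraction of transcripts in which each word appears at least once."""
    counts = Counter()
    for text in transcripts:
        counts.update(set(re.findall(r"[a-z']+", text.lower())))
    total = len(transcripts)
    return {term: n / total for term, n in counts.items()}


def emerging_terms(baseline: List[str], current: List[str],
                   min_ratio: float = 2.0, min_rate: float = 0.5,
                   floor: float = 0.01) -> Dict[str, float]:
    """Flag terms whose share of calls rose by min_ratio or more over the baseline."""
    base = term_rates(baseline)
    now = term_rates(current)
    flagged = {}
    for term, rate in now.items():
        # The floor avoids division by zero for terms never seen in the baseline.
        ratio = rate / max(base.get(term, 0.0), floor)
        if rate >= min_rate and ratio >= min_ratio:
            flagged[term] = round(ratio, 1)
    return flagged


# Hypothetical transcripts: an earlier baseline period and the current period.
baseline = [
    "I need help with my bill",
    "please update my address",
    "my bill is wrong",
    "I want to check my balance",
]
current = [
    "my router keeps dropping calls",
    "my bill is wrong and the router is broken",
    "I need a new router",
    "router outage again",
]

print(emerging_terms(baseline, current))
```

Here "router" surfaces on its own because it is common in the current calls and absent from the baseline, while everyday words such as "my" and "bill" are ignored because their share of calls did not rise.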

In conjunction with other call center technologies, such as Interactive Voice Response (IVR) systems, speech analysis tools can help classify call types by determining the root cause of the call, which can reveal trends not readily apparent to supervisors performing simple random call monitoring. Management analysis of, and response to, developing trends can prompt tactical changes that reduce calls and customer complaints in cases of defective products or that drive revenue when competitors are sold out or have increased prices.

Increasingly, speech analytics is being deployed to share the structured data derived from raw, unstructured customer interactions with various business disciplines, such as marketing, sales, product development, and manufacturing, to refine the approach to rectifying core business issues and to target key product improvements. Executives can examine the 80 percent of complaints that come into call centers that are not related to agent performance to facilitate strategic planning processes. As more companies pursue Big Data solutions, they will turn to analytics to get a handle on the meaning and trends in all the data they collect. Speech analytics is expected to play a large role in the Big Data boom.

Speech analytics solutions are now going beyond call data, delving into interactions across multiple channels. Customer contact via email, text, online chat, Skype, Twitter, Facebook, and other social media sites can now be cataloged and analyzed to provide meaningful and actionable intelligence to the business.

Speech analytics solutions have been purported to aid in many business functions, chief among them being root-cause analysis. Other activities supported by speech analytics today include the following:

  • Call classification and trend analysis
  • Customer experience risk analysis
  • Quality assurance automation
  • First-call resolution
  • Call volume reduction
  • Self-service IVR utilization improvement
  • Training needs analysis
  • New product and feature development
  • Consistent product branding and brand management
  • Incremental sales growth
  • Customer retention
  • Collections improvements
  • Identifying operational deficiencies
  • Fraud detection
  • Regulatory or script compliance validation
  • Competitor information gathering

Speech Analytics Vendors

Below is a look at some of the main competitors with speech analytics solutions. Some well-known organizations claim to offer speech analytics; however, they simply represent or repackage solutions by one of the providers below. In addition, many players are now offering hosted, managed services, or Software-as-a-Service (SaaS) solutions, including Uptivity (CallCopy), Castel, and Avaya Aurix. Again, most of these allow you to tap into their speech analytics engine powered by one of the following platforms.


CallMiner

CallMiner, based in Fort Myers, Florida, offers its CallMiner Eureka! Suite, which includes the Search, Report, and Analyze modules in an on-premise or cloud-based solution format. The platform analyzes words used to categorize each call by reason, product, competitor, and other items; measures acoustics like call duration, silence, noise, and stress; creates key performance indicators and statistical indices; and integrates with other call center technologies. CallMiner has a managed service offering for organizations looking for a low initial cost investment that provides access to the same speech analytics capabilities. Currently, the Eureka suite is offered as an open architecture and integration layer package supporting all levels of call centers. CallMiner provides the integration of speech analytics with other business data, generates reports, provides searching tools for key terms, and customizes call reports to meet customers' specific needs.

In 2010 and 2011, CallMiner added functionality that enables Eureka to operate in a cloud environment and to redact sensitive data for confidentiality and full PCI compliance. With the Eureka 8 launch in late 2011, CallMiner added a fully automated quality monitoring solution called AutoQM. In July 2012, CallMiner launched the 9.0 version, adding considerably to its user-personalized portal functionality. The new EurekaLive solution provides real-time call center quality assurance and can be customized for businesses in various industries. CallMiner counts the Cleveland Clinic, GrubHub, Yodle, Google, Carmax, Citrix, Vodafone, Amazon, and British Gas among its customers and strategic partners.

Autonomy etalk

etalk, acquired by search leader Autonomy, entered the speech analytics market with its Qfiniti Explore solution. Qfiniti Explore includes trend spotting, real-time alerts, and ad hoc reporting and searching capabilities. It also touts its Intelligent Data Operating Layer (IDOL) from parent company Autonomy to develop a conceptual understanding of data to refine the matching process by identifying patterns using a patented algorithm. Qfiniti Explore rounds out etalk's suite of performance management, call recording, and e-learning products. In 2010, Explore was overhauled to incorporate multi-channel analytics, enabling companies to access and analyze a comprehensive range of customer interactions, from voice and email to Web and social media to wiki entries, survey responses, internal CRM records, and even storefront transactions through its more than 400 connectors to internal, external, and public data sources. etalk's patented pattern-matching technology enables businesses to retrieve interactions based on concept and context searches in addition to the traditional phoneme-based lookups.

As of mid-2010, Autonomy customers included France Telecom, Boeing, AT&T, GlaxoSmithKline, Intel, Sprint, and Canon USA. In June 2010, Autonomy announced that it would acquire the Information Governance business of CA Technologies, based in Islandia, New York. In early June 2012, Autonomy announced the ability to use the IDOL framework in the cloud. Autonomy has also recently released the HP ExploreCloud solution to enable customers to operate the analytics suite in the cloud.


Nexidia

Nexidia, headquartered in Atlanta, Georgia, and founded in 2000 based on the research conducted at Georgia Tech, offers their Interaction Analytics product, which uses phonetics-based search capabilities to help call centers and government agencies categorize, mine, and analyze calls, surveys, email, and chats in 30 languages. It provides drill-down reporting functionality as well as business intelligence tools such as quality monitoring, compliance, dashboarding, and performance analytics. In February 2007, Nexidia announced that they had improved audio file indexing speeds by 24 percent and search speeds by more than 200 percent. Private investors in Nexidia include Morgan Stanley, BlueCross BlueShield Venture Partners, HIG Ventures, Paladin Capital, and Sandbox Industries. In 2008, Nexidia OnDemand was launched to provide speech analytics offerings using the SaaS format.

In May 2010, Nexidia announced that industry research and consulting firm Frost & Sullivan named it as the recipient of the 2010 North America New Product Innovation Award in the Speech Analytics category. Frost & Sullivan evaluated the company's speech analytics solution and its supporting components, and the award was presented based on evidence of surpassing the competition in five areas of innovation. In 2014, the Nexidia QC 2.1 product earned NewBay Media’s Best of Show Award. Some of Nexidia's high-profile clientele include the SEC, Verizon, Ventura, and Time Warner Cable. Nexidia holds more than a dozen patents in the speech analytics arena.

NICE Systems

NICE Systems, based in Israel, combines audio-mining, emotion detection, and talk analysis with text analysis into its NICE Cross-Channel Interaction Analytics platform to analyze customer interactions from the phone, email, chat, social media, and Web. The platform can provide contact center management with a unified view of speech and other channel communications, call flows, surveys, and agent desktop activity. NICE comes from a heritage of providing recording compliance to organizations and security solutions to government agencies, such as analyzing audio, video, and web content, to proactively identify security threats. NICE has tailored its solutions to benefit the call center industry over the past several years, as evidenced by its recent acquisitions of IEX, a workforce management solutions provider, and Performix Technologies, a leader in the performance management segment of the call center market. The NICE Perform solution has been deployed at key clients such as Federal Express. NICE Systems finished fiscal year 2013 with more than $949 million in revenue, up nearly 8 percent from fiscal year 2012. They recently added SaaS-based and hosted speech analytics offerings to their product and service suite. 

NICE Systems is the worldwide leader in speech analytics with an estimated 29 percent market share. During an intensive third-party review, NICE received perfect scores in customer satisfaction in the innovation and speech analytics categories as well as the top performance rating for ease of configuration, flexibility, root cause analysis, and discovery. NICE announced in June 2010 plans to acquire eglue, a leading provider of real-time decisioning and guidance solutions, for approximately $29 million. The combination of eglue's solutions and NICE's SmartCenter suite of intent-based solutions - now marketed as the Real Time Guidance product - allows contact centers to turn data from customer interactions into real-time business impact by providing agents with a next-best action recommendation during a call or chat session, based on real-time analytics of the interaction up to that point.

In August of 2013, NICE initiated the acquisition of Causata, a provider of real-time Big Data analytics technologies. This acquisition will help NICE to incorporate real-time analytics into their product offering.

Genesys Labs

Genesys Labs, a leader in the workforce optimization space, acquired UTOPY in 2013. UTOPY offered one of the leading speech mining and analytics solutions as part of the SpeechMiner suite, which allowed users to process and index audio files, translate and categorize speech data, conduct ad hoc and drill-down reporting to better analyze event information, and segment data into user-defined categories based on business need. SpeechMiner also offered a customizable workflow feature that established executive-level dashboards for core initiatives.

Genesys Labs has rolled the SpeechMiner capabilities into the workforce optimization suite. They are marketing it now as their patented “Speech-to-Phrase” Recognition engine.

Verint Systems

Based in Melville, New York, Verint Systems offers a host of contact center and performance management solutions, including its speech analytics products Impact 360 Speech Analytics and Impact 360 Speech Analytics Essentials. Impact 360 Speech Analytics is a full suite of call mining, analysis, and reporting solutions. The Essentials product includes the basic speech analytics capabilities needed by, and targeted to, smaller call center operations. Verint also offers the Impact 360 Advanced Speech Analytics solution for enterprise customers. These offerings combine the historical expertise of both Verint and Witness Systems, which Verint acquired in 2007. In 2011, Verint acquired Global Management Technology (GMT) Corporation. In 2013, Verint bought out majority owner Comverse Technology. In 2014, Verint acquired KANA Software, a provider of on-premise and cloud-based customer engagement optimization solutions.

More than 10,000 organizations in more than 150 countries, including more than 85 percent of the Fortune 100, use Verint solutions to capture, distill, and analyze complex and underused information sources, such as voice, video, and unstructured text. For small and medium companies, Verint offers the Impact 360 Speech Analytics Essentials product, an out-of-the-box solution.

Best Practices for Speech Analytics Implementation

Before selecting and deploying a speech analytics solution, a call center manager needs to assess, and perhaps modify, current processes. For example, does a mechanism exist today that provides a flow of critical information and customer feedback from the call center to other operating areas such as marketing, engineering, or executive management? What type of call monitoring process exists, and how effectively is coaching provided to each agent? What information is vital to supporting clients and improving operations? How will new information be used throughout the organization? Answering these and similar questions can help an organization better prepare for a successful speech analytics deployment.

In addition, if a call recording solution is already in place, managers must know whether the quality of the recordings is good enough to index and search and whether the speech analytics solution is compatible with the recorder. When interviewing potential vendors, evaluate how compatible a solution is with the entire technological environment, whether the product is flexible and easy to use, and whether changes can be made without having to hire a technical team.

Call center managers should consider current and future operations when implementing speech analytics. Solutions should be scalable to meet changing needs. For example, ensure that search terms, phrases, or categories are not limited; verify that the solution supports multiple languages; and review standard and ad hoc reporting capabilities to validate that they meet your growing business requirements. Some vendors are resellers of speech analytics solutions. A manager who prefers to purchase a system directly from a manufacturer must research who the players are and check references with partners and with other organizations that have deployed the solution.

Finally, call center managers must consider how they will staff the speech analytics team. Speech analytics is not a plug-and-play system. It inherently requires customization, and the solution must be "trained" to meet your operational needs. In addition, dedicated resources are required to review and take action on the intelligence these solutions provide. Without those dedicated resources, companies are playing with fire when it comes to the solution's ability to deliver the intended return on investment.

Web Links

Autonomy etalk: http://www.autonomy.com/products/qfiniti3
Avaya Aurix: http://www.avaya.com/usa/product/speech-analytics/
CallMiner, Inc.: http://www.callminer.com/
Castel Communications: http://www.castel.com/

CRM Magazine: http://www.destinationCRM.com/
Genesys: http://www.genesyslab.com/
Nexidia, Inc.: http://www.nexidia.com/
NICE Systems: http://www.nice.com/
Smart Customer Service: http://smartcustomerservice.com/

Speech Technology magazine: http://www.speechtechmag.com/

Uptivity, Inc.: http://www.uptivity.com/
Verint Systems, Inc.: http://www.verint.com/

About the Author

Sheree Van Vreede is a technical resume writer and brand strategist with ITtechExec and an independent consultant who has worked with the IEEE in the fields of telecommunications, information technology, and various scientific- and engineering-related issues. She works with the IEEE Standards as well, helping to publish guidelines for various technological processes. Sheree is a regular contributor to Faulkner Information Services.

This article is based on a comprehensive report published by Faulkner Information Services, a division of Information Today, Inc., that provides a wide range of reports in the IT, telecommunications, and security fields. For more information, visit www.faulkner.com and www.infotoday.com.

Copyright 2014, Faulkner Information Services. All Rights Reserved.