How Smart Will Chatbots Be in 2024?

I worked on interactive voice response systems in the 1990s. I remember the hype around speech recognition and the promise that we would be talking to our phones and they would answer all of our questions. Instead, we got "Press or say 1," which was no better than a touch-tone IVR. Twenty years later, the joke was that speech recognition had been five years away from the mainstream for 20 years. Alexa, Siri, and Hey Google are cool for keeping lists or quickly settling arguments about who was president of the United States 100 years ago. I love to check the weather on mine, but it hasn't changed my life yet, and I doubt your life has changed either. Conversations with computers are just not there yet.

Contact centers are cost centers that need to manage the care and feeding of human agents. Humans are expensive. This is why IVR became such a big deal in the 1990s. I remember back then we used to tell our prospects that a call with an agent is going to cost $2.50, and a call handled by an IVR will cost 25 cents. The numbers were averages, and no contact center is average, but when you are managing several million calls a year, that math makes for an easy business case. This is why there has been a lot of interest in chatbots, intelligent virtual agents, and conversational artificial intelligence in the contact center.
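
To make that math concrete, here is a rough back-of-the-envelope sketch in Python. The $2.50 and 25-cent figures come from the paragraph above; the 3 million calls a year and the 30 percent of calls handled by the IVR are assumptions picked purely for illustration, not data from any real contact center.

```python
# Per-call costs in cents, taken from the averages quoted above.
AGENT_COST_CENTS = 250
IVR_COST_CENTS = 25


def annual_savings_cents(calls_per_year, ivr_percent):
    """Savings (in cents) when ivr_percent of calls move from agents to the IVR."""
    ivr_calls = calls_per_year * ivr_percent // 100
    return ivr_calls * (AGENT_COST_CENTS - IVR_COST_CENTS)


# Assumed volume and IVR share; both are hypothetical.
savings_cents = annual_savings_cents(3_000_000, 30)
print(f"${savings_cents / 100:,.0f}")  # prints $2,025,000
```

Even with a modest share of calls moving to automation, the savings land in the millions, which is why the business case was so easy to make.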

Enter ChatGPT.

And now we see generative AI, built on large language models (LLMs) and popularized by ChatGPT. If ChatGPT can write a college essay or a haiku for me, why can't it replace my contact center agents? There are several reasons, but for the moment let's keep it to one: It lies. Google AI researchers coined the perfect term for this in 2018: hallucinations. A simple example of a lying large language model (LLLM) is how information on two people with the same name can be merged into a single entity (a hallucination, if you will) with a truly amazing resume. Imagine merging the accomplishments of Brian Wilson, the former closer for the San Francisco Giants, and Brian Wilson, the creative force from the Beach Boys; now there is a resume! ChatGPT could deliver that with complete authority and absolutely no idea that it was incorrect.

The magic behind ChatGPT is its insanely large LLM. Estimates are that ChatGPT has around a trillion parameters. So, why is an LLM helpful to a chatbot? How does an LLM help conversational AI? The answer is a lot. Today, if you want to program a chatbot, you need to anticipate what people will say and explicitly train your chatbot to understand what is being asked (the intent) and all of the different ways that this question can be asked. Then you need to explicitly answer those questions with the appropriate information. With LLMs, the system just knows so much. My favorite example is from an insurance application demo where the person making a claim on a stolen cellphone explained that his car window was smashed, and his phone was taken from the glove box. Without any explicit training, the LLM knew that something in a glove box is not in plain sight. This matters if the insurance company won't pay for a phone stolen from plain sight. It also matters if you don't need to waste the customer's time asking the plain-sight question for something stolen from a glove box. This is a great example of computers starting to understand all of the sorts of things that we humans automatically connect in our heads. Think of all the step-by-step work that goes into replicating that explicitly in a computer program. Now think about a program that just does that for you. If we can harness that level of intelligence, the work it will take to create a powerful, helpful, and very useful chatbot will be much less than what we've dared to imagine to date.
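
For a feel of what "anticipating what people will say" looks like in practice, here is a deliberately tiny Python sketch of an intent matcher. It is a toy under stated assumptions: the intent names, the training phrases, and the word-overlap scoring are all invented for illustration, and real chatbot platforms use trained language-understanding models rather than anything this crude. The point is only that an utterance nobody anticipated falls straight through.

```python
import re

# Hypothetical intents and training phrases, made up for this example.
INTENTS = {
    "report_stolen_phone": [
        "my phone was stolen",
        "someone took my phone",
        "phone theft",
    ],
    "check_claim_status": [
        "where is my claim",
        "claim status",
        "status of my claim",
    ],
}


def tokens(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def classify(utterance, min_overlap=2):
    """Return the intent whose training phrase shares the most words
    with the utterance, or None if nothing overlaps enough."""
    words = tokens(utterance)
    best_intent, best_score = None, 0
    for intent, phrases in INTENTS.items():
        for phrase in phrases:
            score = len(words & tokens(phrase))
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent if best_score >= min_overlap else None


# A phrasing the designer anticipated is recognized.
anticipated = classify("My phone was stolen from my car")
# The glove-box story never mentions "phone" or "stolen", so it is missed.
unanticipated = classify("They smashed my window and grabbed it from the glove box")
print(anticipated, unanticipated)
```

Closing that gap the traditional way means adding more phrases and more rules by hand, which is exactly the step-by-step work an LLM promises to reduce.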

Don't get me wrong, we are NOT there; but the potential to deliver this is tantalizingly close. Or is it? Remember, speech recognition was five years away from living up to its potential for over 15 years.

So, what can we expect?

I just poked around a few sources internally at Forrester and with some of the vendors with whom I work in the space, and I can't currently find any use of ChatGPT that is actually in production or even beta. There are lots of theories and some cool demos out there, but nothing I can find in the real world. So, when will we see something?

There are some things that will likely start showing up in the next few months:

  • Using ChatGPT to help agents with answers is already here, in some cases formally, and in other cases just from agents using it informally on their own. No great software investment is required to ask ChatGPT a question and fine-tune the answer before sharing.
  • Chatbot dialogue design. Generative AI can design chatbot flows. These can't just be released into the wild unreviewed, but they can eliminate much of the drudge programming required to build a flow.
  • Human-reviewed answers are something I expect to see this year. Let ChatGPT generate an answer; just make sure someone proofs it before it goes out. An expensive solution, but it will find a niche.
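
The human-reviewed pattern in the last bullet can be sketched in a few lines of Python. This is a minimal illustration, not a description of any vendor's product: the Draft and ReviewQueue names are hypothetical, and fake_generate is a stand-in for whatever generative model would actually produce the draft.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Draft:
    question: str
    text: str
    approved: bool = False


@dataclass
class ReviewQueue:
    # generate is a placeholder for a call to a generative model.
    generate: Callable[[str], str]
    pending: List[Draft] = field(default_factory=list)

    def submit(self, question: str) -> Draft:
        """Create a draft answer and hold it for human review."""
        draft = Draft(question, self.generate(question))
        self.pending.append(draft)
        return draft

    def approve(self, draft: Draft, edited_text: Optional[str] = None) -> str:
        """A human signs off, optionally correcting the text first.
        Only the approved text is returned for sending."""
        self.pending.remove(draft)
        if edited_text is not None:
            draft.text = edited_text
        draft.approved = True
        return draft.text


def fake_generate(question: str) -> str:
    # Hypothetical stand-in; a real system would call a model here.
    return "Draft answer to: " + question


queue = ReviewQueue(generate=fake_generate)
draft = queue.submit("Is a phone stolen from a glove box covered?")
reply = queue.approve(draft, edited_text="Reviewed answer from a human agent.")
```

Nothing leaves the queue until a person approves it, which is where the expense comes from: every answer still costs some agent time.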

And what about 2025?

Expect to see some mixed application capabilities, using generative AI for general context and information and a more traditional AI to deliver transactional and company-specific information that is currently beyond the reach of generative AI systems.

The long-term use cases are exciting and fun, but you can bet some companies will move too fast and there will be some large scandals next year involving accidentally shared customer data or inappropriate or incorrect responses with huge consequences. Fasten your safety belts; it's gonna be a wild ride.


Max Ball is a principal analyst at Forrester Research.