Or: why “can we disable generation?” might be the smartest question in generative AI
Not long ago, I found myself in a meeting with the technical leadership of a large enterprise. We were discussing Parlant as a solution for building fluent yet tightly controlled conversational agents. The conversation was going well – until someone asked a question that completely caught me off guard:
“Can we use Parlant while disabling the generation part?”
At first, I honestly thought it was a misunderstanding. A generative AI agent… without generation? It seemed paradoxical.
But I paused. And the more I thought about it, the more the question started to make sense.
The high stakes of customer-facing AI
These teams weren’t playing with demos. Their AI agents were headed for production, interacting directly with millions of users per month. In that kind of environment, even a 0.01% error rate is unacceptable. One bad interaction in ten thousand is one too many when the result could be a compliance failure, legal exposure, or brand damage.
At that scale, “pretty good” isn’t good enough. And while LLMs have come a long way, their free-form generation still introduces uncertainty: hallucinations, unintended tone, and factual drift.
So no, the question was not absurd. It was actually crucial.
A shift in perspective
Later that evening, I kept thinking about it. The question made more sense than I had given it credit for at first, because these organizations didn’t lack resources or expertise.
In fact, they had full-time conversation designers on staff: professionals trained in designing agent behavior, crafting interactions, and writing responses that align precisely with the brand’s voice and legal requirements – and that get customers to genuinely engage with the AI, which turns out to be no easy task in practice!
So they weren’t asking to disable generation out of fear – they were asking to turn it off because they wanted, and were equipped, to take control into their own hands.
That’s when it struck me: we’ve been misframing what “generative AI agents” really are.
They’re not necessarily about open-ended token generation. They’re about being adaptive: responding to inputs in context, with intelligence. Whether those responses come directly, token by token, from an LLM or from a curated response bank is beside the point. What matters is whether they’re appropriate: compliant, contextual, clear, and useful.
The hidden key to the hallucination problem
Everyone is looking for a solution to hallucinations. Here’s a radical thought: we think it already exists.
Conversation designers.
With conversation designers on your team – as many companies already have – you’re not just mitigating output hallucinations, you’re positioned to eliminate them entirely.
They also bring clarity to customer interactions. Intentionality. An engaging voice. And they create more effective interactions than foundation LLMs can on their own, because LLMs (alone) still don’t quite sound right in customer-facing scenarios.
So instead of trying to retrofit generative systems with band-aids, I realized: why not bake this into Parlant from the ground up? After all, Parlant is all about authority and design control. It’s about giving the right people the tools to shape how AI behaves in the world. It was a perfect match, especially for these enterprise use cases that had so much to gain from adaptive conversations, if only they could trust them with real customers.
From insight to product: utterance matching
That was the breakthrough moment that led us to build utterance templates in Parlant.
Utterance templates let designers supply fluid, context-aware templates for agent responses: responses that feel natural but are fully vetted, approved, and governed. It’s a structured way to retain LLM-like adaptability while keeping a grip on what is actually said.
Under the hood, utterance templates work in a three-step process (a simplified sketch follows the list below):
- The agent drafts a fluid message based on its current situational awareness (interaction, guidelines, tool results, etc.)
- Based on that draft message, it matches the closest utterance template found in your utterance store
- The engine renders the matched utterance template (which is in Jinja2 format), using tool-provided variable substitutions where applicable
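To make the flow concrete, here is a minimal, self-contained sketch of those three steps in Python. It is not Parlant’s actual SDK: the utterance store, the difflib-based matcher, and the variable names are hypothetical stand-ins (real matching would presumably be semantic rather than string-based), but the final rendering step uses Jinja2, just as the templates themselves do.

```python
# Illustrative sketch only: a hypothetical utterance store, a naive matcher,
# and a Jinja2 rendering step. Not Parlant's real API.
from difflib import SequenceMatcher
from jinja2 import Template

# The utterance store: templates hand-authored by conversation designers.
UTTERANCE_STORE = [
    "Your order {{ order_id }} is on its way and should arrive by {{ eta }}.",
    "I'm sorry, I can't help with that. Would you like to talk to a human agent?",
    "Your current account balance is {{ balance }}.",
]

def match_utterance(draft: str, store: list[str]) -> str:
    """Step 2: pick the stored template closest to the agent's draft message.
    (difflib keeps this example dependency-light; real matching would be semantic.)"""
    return max(store, key=lambda t: SequenceMatcher(None, draft.lower(), t.lower()).ratio())

def render_utterance(template_text: str, variables: dict) -> str:
    """Step 3: render the matched Jinja2 template with tool-provided variables."""
    return Template(template_text).render(**variables)

if __name__ == "__main__":
    # Step 1: the agent drafts a fluid message from its situational awareness.
    draft = "Good news, your order 58412 shipped and should reach you by Friday."
    tool_results = {"order_id": "58412", "eta": "Friday"}

    template = match_utterance(draft, UTTERANCE_STORE)
    print(render_utterance(template, tool_results))
    # -> Your order 58412 is on its way and should arrive by Friday.
```

The point of the design is that only pre-approved text ever reaches the customer: the LLM’s draft steers the choice of template, but the words actually sent are the ones the conversation designers wrote.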
We immediately knew it would fit perfectly with Parlant’s hybrid model: one that gives software developers the tools to build reliable agents, while letting business and interaction experts define how those agents behave. And the folks at this particular company immediately knew it would work for them, too.

Conclusion: empower the right people
The future of conversational AI isn’t about removing people from the loop. It’s about empowering the right people to shape and continuously improve what the AI says and how it says it.
With Parlant, the answer can be: the people who know your brand, your customers, and your responsibilities.
And so the only thing that turned out to be absurd was my initial reaction. Disabling generation – or at least tightly controlling it – in customer-facing interactions: that wasn’t absurd. It’s probably how it should be. At least in our opinion!
Disclaimer: The views and opinions expressed in this guest article are those of the author and do not necessarily reflect the official policy or position of Marktechpost.
Yam Marcovitz is the tech lead of Parlant and CEO at Emcie. An experienced software builder with extensive experience in mission-critical software and systems architecture, Yam’s background informs his distinctive approach to developing controllable, predictable, and aligned AI systems.
