Conversation Intelligence
Intent Detection
By Vadim Kouznetsov, Founder of BubblyPhone · Last updated April 5, 2026
Intent detection is the classification of what a user is trying to accomplish from their utterance — mapping an arbitrary phrase like “I want to change my flight” to a predefined intent like reschedule_booking so a downstream system can act on it. For twenty years it was the core problem of natural language understanding. Then LLMs arrived and the field quietly reshaped itself.
Classical intent detection
In a traditional NLU system, an intent detector was a supervised classifier trained on labelled examples. The developer defined a fixed set of intents the system could recognise (usually between 20 and 200), collected or wrote example phrases for each, and trained a model to map new inputs to the closest intent. The model might be a shallow neural network, a support vector machine, or more recently a fine-tuned BERT variant.
Alongside the intent, the classifier usually extracted entities — the specific slots of information the intent needed. For reschedule_booking you would also extract the booking reference, the new date, and the traveller name. The combination of intent plus entities produced a structured representation the downstream system could act on.
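A toy version of that classical pipeline can be sketched in a few lines of Python. This is illustrative only: production systems used SVMs or fine-tuned BERT rather than bag-of-words cosine similarity, and the training phrases and booking-reference pattern here are made up.

```python
import re
from collections import Counter
from math import sqrt

# Tiny labelled dataset: a real system would have tens to hundreds
# of example phrases per intent.
TRAINING = {
    "reschedule_booking": ["change my flight", "move my booking to another day"],
    "cancel_booking": ["cancel my flight", "I no longer need my booking"],
}

def _bow(text):
    """Bag-of-words vector as a Counter of lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def _cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(utterance):
    """Return (intent, score) for the nearest training phrase by cosine similarity."""
    vec = _bow(utterance)
    return max(
        ((intent, _cosine(vec, _bow(p)))
         for intent, phrases in TRAINING.items() for p in phrases),
        key=lambda pair: pair[1],
    )

def extract_entities(utterance):
    """Pull out a slot the intent needs, e.g. a booking reference like 'ABC123'."""
    ref = re.search(r"\b[A-Z]{2,3}\d{3,4}\b", utterance)
    return {"booking_reference": ref.group() if ref else None}

intent, score = classify("I want to change my flight ABC123")
entities = extract_entities("I want to change my flight ABC123")
```

Together, `intent` plus `entities` form the structured representation the downstream booking system would act on.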
Why it was hard
Intent detection looks easy on paper and is painful in practice for reasons most introductions skip over:
- Overlapping intents. “I want to change my flight” and “I need to cancel my flight” are different intents. “I need to move my flight” could be either. Classifiers trained on small datasets confuse them.
- Out-of-scope detection. Every classifier is forced to pick one of its known classes. A caller asking about something you did not anticipate still gets mapped to the nearest intent, which might be wildly wrong. Rejecting out-of-scope inputs requires a separate step.
- Data hunger. Classical models need tens to hundreds of labelled examples per intent to perform well, and writing good training phrases is surprisingly hard. You never think of the ways real users will phrase things until you see them in production.
- Domain shift. A classifier trained on one product’s call data performs worse when you deploy it to a new product, even if the intents are similar. Retraining is a constant cost.
What LLMs changed
Large language models collapsed most of the classical intent detection stack into a single prompt. Instead of training a classifier on labelled data, you describe your intents in natural language and ask the model to return one of them. The model works zero-shot for simple cases, few-shot for harder ones, and rarely needs training data at all.
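The prompt-based approach can be sketched as follows. The intent descriptions are made up, and `call_llm` stands in for whichever chat-completion client you actually use; only the prompt construction and label parsing are shown.

```python
# Zero-shot intent detection: describe the intents in natural language
# and ask the model to return exactly one label.
INTENTS = {
    "reschedule_booking": "the caller wants to move an existing booking to a new date",
    "cancel_booking": "the caller wants to cancel an existing booking",
    "out_of_scope": "anything that does not match the intents above",
}

def build_prompt(utterance):
    described = "\n".join(f"- {name}: {desc}" for name, desc in INTENTS.items())
    return (
        "Classify the caller's utterance into exactly one of these intents:\n"
        f"{described}\n\n"
        f'Utterance: "{utterance}"\n'
        "Answer with the intent name only."
    )

def parse_label(response):
    """Normalise the model's reply; anything unrecognised falls back to out_of_scope."""
    label = response.strip().lower()
    return label if label in INTENTS else "out_of_scope"

# Usage with a hypothetical LLM client:
# label = parse_label(call_llm(build_prompt("I need to move my flight")))
```

Note that the fallback in `parse_label` gives you out-of-scope handling for free, something classical classifiers needed a separate rejection step for.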
More importantly, in an LLM-driven AI phone agent, explicit intent detection often disappears entirely. The model reads what the caller said and decides what to do directly, without a named intent as an intermediate step. If the caller wants to reschedule a flight, the model calls the reschedule_booking tool. If the caller wants something the model did not know about, it asks a clarifying question in plain English instead of returning out_of_scope.
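In the tool-calling style, what replaces the intent label is a tool schema. A sketch of what a reschedule_booking definition might look like, in the JSON-Schema style most chat-completion APIs use (field names and requirements here are illustrative, not any specific vendor's format):

```python
# The model sees this schema and, when the caller asks to reschedule,
# emits a call to it with the slots filled in -- no intent label involved.
RESCHEDULE_TOOL = {
    "name": "reschedule_booking",
    "description": "Move an existing booking to a new date.",
    "parameters": {
        "type": "object",
        "properties": {
            "booking_reference": {"type": "string"},
            "new_date": {"type": "string", "description": "ISO 8601 date"},
            "traveller_name": {"type": "string"},
        },
        "required": ["booking_reference", "new_date"],
    },
}
```

The schema's `parameters` play the role the extracted entities played in the classical stack: they are the slots the action needs before it can run.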
Where intent detection still earns its keep
Despite the shift, explicit intent detection has not disappeared. It remains the better tool in several specific contexts:
- High-volume routing. If you need to route a million calls a day into one of five queues, running a small dedicated classifier is faster and cheaper than sending every call to a large LLM.
- Deterministic compliance flows. When you absolutely have to know what the caller asked for — for legal, financial, or regulatory reasons — a classifier with a confidence threshold and a rejection path is more auditable than an LLM’s natural language decision.
- Offline analytics. Classifying thousands of past calls by intent for trend analysis is cheaper with a small model than with an LLM pass on every transcript.
- Privacy-constrained deployments. When calls cannot leave your own infrastructure, a locally-run classifier is practical where a large hosted LLM is not.
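The confidence threshold and rejection path from the compliance bullet above can be sketched in a few lines. The threshold value is illustrative; in practice you would tune it against a held-out set of out-of-scope inputs.

```python
REJECT_THRESHOLD = 0.6  # illustrative; tune against real out-of-scope traffic

def route(intent, confidence):
    """Return (decision, reason) so every routing choice can be logged and audited."""
    if confidence >= REJECT_THRESHOLD:
        return intent, f"accepted at confidence {confidence:.2f}"
    return "human_review", (
        f"rejected: confidence {confidence:.2f} below {REJECT_THRESHOLD}"
    )
```

The point is not the two-line branch but the `reason` string: a deterministic, loggable record of why each call went where it did, which is exactly what an LLM's free-form decision does not give you.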
Intent detection in BubblyPhone Agents
BubblyPhone Agents does not ship an explicit intent detector because the LLM handling the call is already doing the job. Define your tools, write your system prompt, and the model picks the right action based on what the caller says — no intent labels, no training data. If you need explicit classification for analytics, do it as a field in the post-call analysis step, which runs a single LLM pass per transcript and returns structured data.
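One way to picture that post-call classification field is as a structured-output schema the single LLM pass must conform to. The field names and enum values below are illustrative, not BubblyPhone's actual API.

```python
# Hypothetical schema for a post-call analysis field: one LLM pass per
# transcript returns JSON matching this shape, giving you intent labels
# for trend analysis without an intent detector in the live call path.
ANALYSIS_SCHEMA = {
    "type": "object",
    "properties": {
        "intent": {
            "type": "string",
            "enum": ["reschedule_booking", "cancel_booking", "other"],
        },
        "resolved": {"type": "boolean"},
    },
    "required": ["intent"],
}
```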
Further reading
- Jurafsky & Martin, Chatbots & Dialogue Systems: Intent Classification — textbook treatment of classical intent detection and its transition to LLM-based approaches.