Beyond single-label classification
The first thing most teams do when they add AI to their support workflow is classify ticket type. A single call, a single label: billing, technical_bug, account_access. Ticket goes to the right queue. Done.
That's genuinely useful. But it's only one dimension of the information sitting inside every inbound ticket. The same message that tells you what the ticket is about also tells you how urgent it is, how the customer is feeling, and sometimes whether you're about to lose them. A single classifier misses all of that.
Stacking classification layers means running multiple focused classifiers against the same ticket and combining the results into a complete routing and prioritization decision. Each layer answers one specific question. Together, they give you a structured picture of every ticket that no human triage process can match for consistency or speed.
The three core layers
Most support teams need three classifiers at minimum. Each one is a separate schema - narrow, well-defined categories that answer exactly one question about the ticket.
Layer 1: Ticket type
This is the routing layer. It determines which team or queue the ticket belongs in. Categories should map directly to a destination:
{
"name": "ticket_type",
"categories": [
"billing",
"technical_bug",
"account_access",
"feature_request",
"onboarding",
"general_inquiry"
]
}
Keep these categories mutually exclusive. If a ticket could reasonably be technical_bug or account_access, you'll get low confidence scores and inconsistent routing. If that overlap happens frequently, add a more specific category that captures the intersection - or let the confidence threshold catch the edge cases and send them to a human.
Layer 2: Urgency
Urgency is a completely separate dimension from type. A billing ticket can be critical. A feature request is almost never urgent. Running them as a single combined classifier forces compromises in both. Keep them separate:
{
"name": "ticket_urgency",
"categories": [
"critical",
"high",
"normal",
"low"
]
}
A few signals the urgency classifier picks up well: mentions of outages or data loss (critical), phrases like "blocking our whole team" or "I have a call in an hour" (high), general frustration without time pressure (normal), and speculative questions or enhancement ideas (low).
The output maps directly to your help desk's priority field. No human judgment call required.
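That mapping can live as a plain lookup table in your automation layer. A minimal sketch — the help desk priority values here are illustrative, so substitute your own enum:

```python
# Hypothetical mapping from urgency labels to a help desk priority field.
# The right-hand values depend on your help desk's own priority enum.
URGENCY_TO_PRIORITY = {
    "critical": "urgent",
    "high": "high",
    "normal": "normal",
    "low": "low",
}

def priority_for(urgency_label: str) -> str:
    # Fall back to "normal" for anything unexpected rather than
    # failing the ticket on an unknown label.
    return URGENCY_TO_PRIORITY.get(urgency_label, "normal")
```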
Layer 3: Customer sentiment
Sentiment tells you how the customer feels, independent of what the problem is or how urgent it is. A customer who is frustrated on a normal priority ticket still needs a faster, more careful response than a neutral customer on a high priority one:
{
"name": "customer_sentiment",
"categories": [
"frustrated",
"neutral",
"satisfied",
"at_risk_of_churn"
]
}
at_risk_of_churn is worth calling out separately from frustrated. Churn risk signals are specific - explicit mentions of cancellation, comparisons to a competitor, or phrases like "I'm reconsidering whether this is the right tool." That's different from a customer who's annoyed but clearly expects to stay. Routing these to a senior agent or customer success immediately can meaningfully reduce churn.
Running all three efficiently
The naive approach is three sequential API calls. That works, but you're adding latency unnecessarily. The better approach is a single batch call that runs all three classifiers against the same input at once:
curl -X POST https://api.classifaily.com/v1/batch \
-H "Authorization: Bearer cai_live_your_key_here" \
-H "Content-Type: application/json" \
-d '{
"items": [
{
"input": "Hi, I have been locked out of my account for two days. The reset link just spins. I have a client presentation tomorrow and I need access now. This is unacceptable.",
"type": "text",
"schema_id": "ticket_type"
},
{
"input": "Hi, I have been locked out of my account for two days. The reset link just spins. I have a client presentation tomorrow and I need access now. This is unacceptable.",
"type": "text",
"schema_id": "ticket_urgency"
},
{
"input": "Hi, I have been locked out of my account for two days. The reset link just spins. I have a client presentation tomorrow and I need access now. This is unacceptable.",
"type": "text",
"schema_id": "customer_sentiment"
}
]
}'
Response:
{
"results": [
{ "label": "account_access", "confidence": 0.95 },
{ "label": "critical", "confidence": 0.91 },
{ "label": "frustrated", "confidence": 0.88 }
]
}
One call. Three structured answers. The ticket routes to the account access queue, gets flagged as critical priority, and the assigned agent sees a sentiment tag before they open it. All of that happened before any human read a word of the ticket.
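The same request is less repetitive in code, where one helper can fan a single ticket body out across every schema in the stack. A stdlib-only Python sketch — the endpoint and headers mirror the curl call above; the function names and structure are just one way to arrange it:

```python
import json
from urllib.request import Request, urlopen

API_URL = "https://api.classifaily.com/v1/batch"  # same endpoint as the curl example

def build_batch_payload(ticket_body: str, schema_ids: list[str]) -> dict:
    # One input, repeated once per classifier in the stack.
    return {
        "items": [
            {"input": ticket_body, "type": "text", "schema_id": sid}
            for sid in schema_ids
        ]
    }

def classify(ticket_body: str, schema_ids: list[str], api_key: str) -> list[dict]:
    req = Request(
        API_URL,
        data=json.dumps(build_batch_payload(ticket_body, schema_ids)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    # Results come back in the same order as the submitted items.
    with urlopen(req) as resp:
        return json.loads(resp.read())["results"]
```

Calling `classify(body, ["ticket_type", "ticket_urgency", "customer_sentiment"], key)` returns the three label/confidence pairs in schema order.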
Building routing logic from multiple labels
With three labels in hand, your automation can make much more nuanced decisions than a single-label system allows. A few useful patterns:
Priority escalation on combined signals. A normal urgency ticket from a frustrated customer might still warrant bumping to high in your help desk. You can apply that rule in your automation layer without any model change - the labels give you the raw signals, the logic is yours.
Churn risk fast-track. Any ticket with at_risk_of_churn sentiment, regardless of type or urgency, triggers an immediate Slack alert to customer success and skips the normal queue. The type and urgency labels still route it correctly - the sentiment layer just adds an additional action on top.
Auto-response scoping. general_inquiry tickets with neutral sentiment and low urgency are strong candidates for an automated first response linking to documentation. technical_bug tickets with frustrated sentiment should never get an auto-response - a human reply is warranted even if resolution takes time.
SLA assignment. Map the urgency label directly to SLA tiers. critical gets a 1-hour response SLA. high gets 4 hours. normal gets the standard next-business-day. No manual SLA assignment needed.
Advanced layers worth adding
Once the three core layers are running cleanly, two additional classifiers tend to deliver high value with minimal setup cost.
Self-service resolution potential
Some tickets are asking a question that your documentation already answers. Classifying for self-service potential lets you automatically attach a help article link in the first response for tickets where the answer is clearly documented:
{
"name": "self_service_potential",
"categories": [
"high",
"partial",
"requires_agent"
]
}
high means the ticket is a known how-to question with a clear doc. partial means documentation exists but the customer's specific situation may need follow-up. requires_agent means there's no self-service path - a human has to handle it.
Pairing this with neutral sentiment reduces average handle time considerably. You're not deflecting frustrated customers - you're just giving well-disposed customers a faster path to resolution.
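That pairing is a one-line guard in the automation layer. A sketch, assuming the `self_service_potential` and `customer_sentiment` labels above:

```python
def should_attach_doc_link(self_service: str, sentiment: str) -> bool:
    # Only deflect to docs when the answer is clearly documented and the
    # customer isn't frustrated; "partial" still gets a human first reply.
    return self_service == "high" and sentiment in ("neutral", "satisfied")
```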
Language detection
If your team handles tickets in multiple languages, a language classifier routes non-English tickets to the right agent before anyone wastes time trying to respond in the wrong language:
{
"name": "ticket_language",
"categories": [
"english",
"spanish",
"french",
"german",
"portuguese",
"other"
]
}
Add this as a fifth batch item and you have full coverage. If the label is other, route to a triage agent who can identify the language and reassign.
What to avoid
A few patterns that make stacked classification harder to maintain than it needs to be:
Too many categories per layer. If a single schema has more than eight or nine categories, the classifier starts making fine-grained distinctions that the routing logic downstream doesn't actually use. Prune aggressively. If two categories always get handled the same way, merge them.
Overlapping categories across layers. If your type schema has urgent_bug and your urgency schema has critical, you're encoding urgency in two places. Keep each layer focused on one dimension and let the combination of labels do the nuanced work.
Auto-routing everything regardless of confidence. Set a threshold per schema. Below 0.70 confidence on ticket type, send to a catch-all queue. Below 0.65 on urgency, default to normal and let an agent override. High-confidence results automate cleanly - low-confidence results get a human in the loop. In practice, this catches fewer than 10% of tickets.
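Per-schema thresholds are a small amount of code. A sketch — the 0.70 and 0.65 values and the fallbacks mirror the numbers above; the sentiment threshold is an assumed placeholder you'd tune against your own traffic:

```python
# Per-schema confidence thresholds and below-threshold fallbacks.
# ticket_type and ticket_urgency values are from the text above;
# the customer_sentiment threshold is an assumed starting point.
THRESHOLDS = {"ticket_type": 0.70, "ticket_urgency": 0.65, "customer_sentiment": 0.60}
FALLBACKS = {"ticket_type": "catch_all", "ticket_urgency": "normal"}

def accept(schema_id: str, label: str, confidence: float) -> tuple[str, bool]:
    """Return (label_to_use, needs_human_review)."""
    if confidence >= THRESHOLDS.get(schema_id, 0.70):
        return label, False
    # Below threshold: apply the fallback (or keep the label if no
    # fallback is defined) and flag the ticket for human review.
    return FALLBACKS.get(schema_id, label), True
```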
Running the full stack on every ticket source. Not every channel needs every layer. Internal tickets from employees probably don't need churn risk detection. Batch requests from a data pipeline don't need urgency classification. Build the layer stack appropriate to each ticket source.
What it looks like in production
A production support triage pipeline using stacked classification ends up looking roughly like this:
- Ticket arrives via Zendesk trigger, Jira automation, or a webhook from your web form
- A single batch call sends the ticket body to three to five classifiers simultaneously
- Results come back in under 400ms total
- Automation logic reads the label combination and sets queue, priority, SLA tier, and any escalation flags
- If any layer returns below-threshold confidence, the ticket goes to a catch-all with a needs-review tag
- The agent opens their queue and sees every ticket already categorized, prioritized, and tagged - no triage step remaining
The result isn't just faster routing. It's a structured dataset of every ticket your team has ever received, labeled consistently, that you can actually analyze. Which ticket types come in most on Mondays? What's the average urgency level of tickets from customers in their first 30 days? What percentage of your frustrated tickets resolve without a follow-up? None of that was visible when triage was done manually and inconsistently. It becomes visible the moment classification happens in a consistent, automated layer.
Free plan. No credit card. Create your schemas and start classifying in minutes.
Get started free