Engineering · Product
March 27, 2026
4 min read

Cereby Mini: Why We Moved to a Two-Intent Model

From many labeled intents to one simple decision: change the doc or answer in chat

How Cereby Mini used to decide what to do

Early versions of Cereby Mini treated understanding the user like filling out a form. We maintained a set of specific, named intents: things like writing structured content, running grammar checks, replacing or rewriting passages, extracting definitions, and similar operations. Each intent had clear boundaries in the product—great for demos, documentation, and test cases.

That model is intuitive for engineers: you map natural language to a small taxonomy of behaviors, then run the right pipeline. If the user said “fix the grammar here,” we aimed for the grammar intent. If they said “replace this paragraph with a shorter version,” we aimed for replace or rewrite.

Where fine-grained intents break down

The problem is not that users are vague—it is that real language does not respect your enum.

Someone might ask: “Make this document feel friendlier,” or “Add more emojis,” or “Turn this into bullet points that sound less stiff.” None of those requests are wrong. But if emoji density or tone tweaks are not first-class intents, the classifier faces a bad choice:

  • Force the request into the closest labeled bucket and risk the wrong tool or empty parameters
  • Fall through to a generic path and feel like a misfire
  • Reject the nuance and give a text-only answer when the user clearly wanted the document to change

We call the painful cases intent misfires: the system technically “understood” something, but not what the user actually wanted for their page. The more intents you add, the more edge cases you discover—and the more brittle the boundaries become between them.

The two-intent model

We simplified the first routing step dramatically. Instead of asking “which of N product intents is this?”, Cereby Mini now asks one question:

  1. Should we edit the document? — The user wants the canvas to change: insertions, rewrites, restructuring, style-level requests that should land in the doc, or anything that is best fulfilled as a proposed patch they can review.
  2. Should we respond in text? — The user wants an explanation, a plan, feedback, or an answer that belongs in the chat thread without automatically mutating the document.

That is it at the routing layer: document edit versus text response.
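In sketch form (the cue list is a stand-in; the shipped router is a model call, not keyword matching):

```python
from typing import Literal

Route = Literal["edit_document", "respond_in_text"]

# Illustrative cues only -- the real decision is made by a model
EDIT_CUES = ("rewrite", "replace", "make", "turn", "merge", "shorten", "fix")


def route(utterance: str) -> Route:
    """The one routing question: should the canvas change?"""
    lowered = utterance.lower()
    if any(cue in lowered for cue in EDIT_CUES):
        return "edit_document"
    return "respond_in_text"
```

Everything else that used to live in the taxonomy becomes a downstream concern of whichever branch wins.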

Downstream, we still run rich logic—diffs, safety checks, templates, grammar pipelines, and so on—but the initial fork is no longer a game of matching thousands of phrasings to dozens of labels.

Why two buckets cover more ground

A binary split is not a limitation; it is alignment.

  • Emergent requests (“more emojis,” “punchier openings,” “merge these sections”) naturally land on edit if the outcome should show up on the page, without needing a dedicated intent name in advance.
  • Clarifying and analytical requests (“why did you change this,” “what is wrong with my thesis statement,” “outline my next steps”) land on text response, where a conversational answer is appropriate.
  • You avoid a class of bugs where the model guesses the wrong label among near neighbors (grammar vs rewrite vs replace) and ships the wrong UX.

The goal is not to dumb down the assistant—it is to stop pretending that a long list of intents is the same thing as user intent. Users rarely think in our internal vocabulary; they think in outcomes.

Tradeoffs we accept

A coarser first step means we sometimes need second-pass refinement: after we know “edit vs chat,” we still decide how to edit or how deep to go. That is acceptable because those steps can use more context—selection, full document, attachments—without fighting the wrong top-level label.

We also invest in a clear preview with accept/reject controls for edits, so when interpretation is ambiguous, the user stays in control.
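The contract that control rests on can be sketched as follows (the `ProposedPatch` type is hypothetical; our real patches are structured diffs, not string replacements):

```python
from dataclasses import dataclass


@dataclass
class ProposedPatch:
    before: str   # exact text the edit targets
    after: str    # replacement shown in the preview


def apply_patch(document: str, patch: ProposedPatch, accepted: bool) -> str:
    """The canvas only changes when the user accepts the previewed diff."""
    if accepted and patch.before in document:
        return document.replace(patch.before, patch.after, 1)
    return document  # rejected or stale patch: document is untouched
```

Rejecting a patch is always a no-op, which is what makes an occasionally ambiguous interpretation acceptable.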

Takeaway

Cereby Mini became more reliable when we stopped trying to name every kind of ask up front. The two-intent model (edit the document or answer in text) reduces intent misfires, respects how people actually speak, and leaves room for new behaviors without adding a new label to the taxonomy every week.

If you are building document AI, ask whether your routing layer is solving the user’s problem or just maintaining a catalog. Often, the simpler fork is the more powerful one.