The canon-drift problem

Why AI keeps forgetting your D&D campaign.

Every "Claude invented a new pantheon" or "ChatGPT got my NPC's backstory wrong" frustration comes from the same architectural problem. Here's what's actually going on, what doesn't fix it, and what does.

By Zach Townes, GM and founder of Grimoire · Last updated June 8, 2026

Free tier, no credit card. Bring your own AI client.

The problem

  1. AI
  2. "search for war god"
  3. fuzzy text search
  4. several pages mention "war"
  5. Guesses Tempus. You catch it. AI apologizes. You move on.

The fix

  1. AI
  2. get_world_rule("war_domain")
  3. typed query
  4. one canonical answer
  5. Tells you Helm.

The canon-drift problem

Three failure modes, one architectural cause.

Canon drift is when an AI assistant confidently states things that aren't true in your campaign: a god you never created, an NPC relationship that doesn't exist, a backstory invented whole. AI forgets your campaign because it's searching text, not querying structured state.

GMs using AI for campaign prep keep hitting the same wall. The model writes confidently, sometimes brilliantly, but periodically asserts things that aren't true in their world. A god the GM didn't put in their pantheon. A relationship between NPCs that doesn't exist. A backstory invented from whole cloth, in a voice that sounds exactly like the times the model got it right.

This isn't a model-quality problem. The newest, smartest model in the world still does this when its only window into your world is a fuzzy text search over a wiki. The failure happens in three predictable shapes.

1. The context ceiling

The model can only "see" what fits in its context window. You can paste your pantheon, your NPC list, your faction map, but every campaign session adds more lore than fits. Eventually you're choosing what to paste, and the model is working from a subset of your canon.

2. Text-blob retrieval

Even when the model has search access to your wiki via a tool, what it gets back is text excerpts ranked by fuzzy text matching. It can't tell which page is canonical and which is a stub. It can't tell which fact is current and which was superseded three sessions ago. It's reading a search excerpt and guessing.

3. The hallucination floor

When the model can't find a canonical answer, it invents one, with the same confident voice it uses when it does know the answer. You can't easily distinguish "the model remembered this from my wiki" from "the model made this up because it didn't find anything."

These aren't bugs. They are the structural consequences of asking a language model to be your campaign's memory using nothing more than search-over-text retrieval.

What doesn't fix it

The fixes that sound right but don't address the cause.

"Use a bigger context window."

Larger context windows let you paste more lore, but you still hit the ceiling on long campaigns. And bigger contexts don't fix the retrieval problem; they make it worse by giving the model more text to wade through.

The model still has to guess which paragraph is canonical when two of them conflict.

"Use a smarter model."

GPT-5, Claude Opus, Gemini Ultra: they all do the same thing when handed a fuzzy-search-over-text retrieval interface. The smartness of the model doesn't change the shape of the retrieval.

A genius searching a haystack is still searching a haystack.

"Organize your wiki better."

Tags, backlinks, structured headers, "CANON" labels: none of these reach the model through text-search retrieval. The search returns matched excerpts; whatever you put in the page headers gets clipped out of most excerpts.

You're organizing for yourself, not for the model.

"Use ChatGPT Projects" or pinned context.

This is a better wrapper around the same shape. You pin documents; the model still reads them as text and searches them with fuzzy matching.

Same failure modes, slightly later in the conversation arc.

"Use a bundled-AI campaign tool."

Tools that generate images and transcribe audio still tend to use text search over a wiki under the hood for canon questions. The AI-generated outputs are useful for their own jobs (a portrait, a session recap).

But they don't solve canon enforcement when you ask "who is the priestess of Helm in Verdantshire."

"Add a knowledge graph MCP server on top."

A real attempt: an open-source knowledge graph MCP server lets the model walk typed entity edges instead of fuzzy-searching text. It works for about a week. Then the graph drifts out of sync with your wiki, because the two are separate sources of truth you keep aligned by hand. The model starts reasoning correctly over stale data, which is worse than reasoning fuzzily over current data.

The retrieval shape improved; the sync problem multiplied.

"Model relationships as junction tables in a separate database."

A few wiki tools (Notion included) let you create a separate "relationships" or "junction" table for complex bidirectional links. Architecturally clean, until you realize the AI consistently fails to recognize the junction table exists. It finds the canonical entity page and stops.

The relationship-table rows become invisible content the AI never sees.

These are all reasonable things to try. None of them change the architectural cause.

What does fix it

Structured state. Typed entities. Queryable canon.

The retrieval problem goes away when the model can ask structured questions of typed data instead of fuzzy-searching a text blob. That means three architectural shifts.

Aspect | Text wiki | Structured state
Data shape | Pages with prose | Typed entities with typed fields and explicit relationships
Retrieval | Search excerpts ranked by relevance | Direct query: give me the canonical answer
Canon | Implicit. Pages argue with each other via prose | Explicit. Constitution and World Rules are marked authoritative
Hallucination | Inevitable when search returns nothing | Bounded. The model can ask "do you have this?" and get a clean yes/no

In a structured campaign state, "who holds the war domain in this campaign" is a typed query against a typed pantheon. The model gets back a literal answer (Helm), not a search excerpt that mentions both Helm and Tempus.
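A minimal Python sketch of that difference, under stated assumptions: the page texts, the PANTHEON mapping, and both function names are hypothetical illustrations, not Grimoire's actual data or API.

```python
from typing import List, Optional

# Hypothetical wiki pages; the text is invented for illustration.
WIKI_PAGES = {
    "Gods of the Realm": "Helm holds the war domain and guards the northern passes.",
    "Old Notes (draft)": "Maybe Tempus as war god? Undecided.",
    "Session 12 Recap": "The party fled the war camp at dawn.",
}

# Hypothetical typed pantheon: each domain maps to exactly one deity.
PANTHEON = {"war_domain": "Helm"}


def fuzzy_search(term: str) -> List[str]:
    """Return every page title whose text mentions the term (case-insensitive)."""
    needle = term.lower()
    return [title for title, text in WIKI_PAGES.items() if needle in text.lower()]


def get_domain_holder(domain: str) -> Optional[str]:
    """Return the single canonical holder of a domain, or None if none is set."""
    return PANTHEON.get(domain)


if __name__ == "__main__":
    print(fuzzy_search("war"))              # three candidate pages, no ranking by canon
    print(get_domain_holder("war_domain"))  # Helm
    print(get_domain_holder("sea_domain"))  # None: nothing to guess from
```

The text search hands back three pages that all mention "war" and leaves the model to pick one. The typed lookup returns either one name or nothing.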

"Who is the high priestess of the Temple of Helm in Verdantshire" is a relationship query against the NPC table filtered by location and faction. The model gets back the canonical NPC if she exists, or a clean "no such NPC exists" if she doesn't. No invention.

The Constitution and World Rules sit above the entity layer and act as canonical truth that overrides anything else. The model is told, before any other context, what's authoritative in this world.
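One way to picture that precedence is a lookup that always consults the authoritative layer first. This is a sketch only: the two dictionaries, the keys, and the resolve function are assumptions made for illustration, not how Grimoire stores or resolves facts.

```python
from typing import Optional, Tuple

# Hypothetical authoritative layer (Constitution and World Rules).
CONSTITUTION = {"war_domain": "Helm"}

# Hypothetical entity-level notes, one of which contradicts the Constitution.
ENTITY_NOTES = {"war_domain": "Tempus", "high_temple_town": "Verdantshire"}


def resolve(key: str) -> Tuple[Optional[str], str]:
    """Return (value, source). The Constitution wins whenever it has an entry."""
    if key in CONSTITUTION:
        return CONSTITUTION[key], "constitution"
    if key in ENTITY_NOTES:
        return ENTITY_NOTES[key], "entity"
    return None, "unknown"


if __name__ == "__main__":
    print(resolve("war_domain"))        # ('Helm', 'constitution')
    print(resolve("high_temple_town"))  # ('Verdantshire', 'entity')
    print(resolve("sea_domain"))        # (None, 'unknown')
```

Returning the source alongside the value lets the caller tell an authoritative fact from an ordinary note, and an explicit "unknown" from both.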

This is the architectural fix. It's not magic. It's not a smarter model. It's the same retrieval problem you'd solve in any database-backed application: give the consumer structured queries instead of full-text search.
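As an illustration of that database-style lookup, the high-priestess question above could be sketched like this. The NPC names, the field layout, and the find_npc and answer helpers are all hypothetical; the point is only that an exact-match query has a well-defined empty result.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NPC:
    name: str
    role: str
    faction: str
    location: str


# Hypothetical campaign data; both NPCs are invented for illustration.
NPCS = [
    NPC("Sister Maren", "high priestess", "Temple of Helm", "Verdantshire"),
    NPC("Captain Orlo", "guard captain", "Town Watch", "Verdantshire"),
]


def find_npc(role: str, faction: str, location: str) -> Optional[NPC]:
    """Exact-match lookup: returns the canonical NPC or None, never a guess."""
    for npc in NPCS:
        if (npc.role, npc.faction, npc.location) == (role, faction, location):
            return npc
    return None


def answer(role: str, faction: str, location: str) -> str:
    """Turn the lookup into the text the model would receive."""
    npc = find_npc(role, faction, location)
    return npc.name if npc else "no such NPC exists"


if __name__ == "__main__":
    print(answer("high priestess", "Temple of Helm", "Verdantshire"))
    print(answer("high priestess", "Temple of Tempus", "Verdantshire"))
```

The second call returns "no such NPC exists" instead of a plausible-sounding name, which is the behavior a text search cannot guarantee.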

How Grimoire implements this

Constitution + 14 typed entities + MCP.

Grimoire is a campaign manager built around the structured-state architecture above. The pieces:

The Constitution

A canonical set of facts and rules that anchors your world: pantheon, cosmology, geography, core rules. Each fact ships with two layers: the canonical truth itself, and an aiGuidance field that tells the AI how to use the fact in its outputs (tone, what to lean into, what to avoid). The AI gets this first, before anything else.

For example, a "Resonance Magic" world rule returns something like this to the AI:

{
  "name": "Resonance Magic",
  "importance": 5,
  "description": "Magic no longer flows from stable divine sources. Casters must resonate with scattered power fragments. Magic is regional, unpredictable, and intensely personal.",
  "aiGuidance": "Magic is not reliable. Describe spells as humming, flickering, or straining. Counterspelling disrupts resonance, not formulas. Players should feel the instability."
}

The aiGuidance field is the architectural difference between a wiki and a campaign state machine. The AI is told how to use the fact, not just what the fact is. No wiki retrieval can match this, because wikis store prose for human readers, not structured guidance for AI consumers.
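To make the two-layer idea concrete, here is a small Python sketch of how a client might keep the canonical fact and the guidance apart when building context. The field names come from the example above; the render_rule function and its output format are assumptions, not Grimoire's behavior.

```python
# World rule mirroring the example above (fields as shown in the JSON).
RULE = {
    "name": "Resonance Magic",
    "importance": 5,
    "description": (
        "Magic no longer flows from stable divine sources. "
        "Casters must resonate with scattered power fragments. "
        "Magic is regional, unpredictable, and intensely personal."
    ),
    "aiGuidance": (
        "Magic is not reliable. Describe spells as humming, flickering, "
        "or straining. Counterspelling disrupts resonance, not formulas. "
        "Players should feel the instability."
    ),
}


def render_rule(rule: dict) -> str:
    """Format a rule so that canon and usage guidance stay on separate lines."""
    return "\n".join(
        [
            f"WORLD RULE: {rule['name']} (importance {rule['importance']})",
            f"CANON: {rule['description']}",
            f"GUIDANCE: {rule['aiGuidance']}",
        ]
    )


if __name__ == "__main__":
    print(render_rule(RULE))
```

Keeping the two fields on separate labeled lines means the model never has to infer tone from the prose of the fact itself.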

14 typed entities with first-class relationships

NPC, Location, Faction, Quest, Item, PlayerCharacter, Creature, Vehicle, LoreEntry, WorldRule, PlanarForce, SessionRecap, SessionPrep, CustomMechanic, plus knowledge graphs that connect them. Each entity has typed fields (an NPC has a name, a description, a faction relationship, a location relationship). Relationships are first-class queryable objects in the same database as the entities: not a separate junction table you maintain by hand, not a shadow knowledge graph that drifts out of sync. One source of truth.
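A rough sketch of what "relationships in the same store as the entities" can look like. The CampaignStore class, its method names, and the sample data are hypothetical; Grimoire's real schema is not shown in this article.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Relationship:
    source: str
    kind: str
    target: str


class CampaignStore:
    """Hypothetical single store holding both entities and their edges."""

    def __init__(self) -> None:
        self.entities: Dict[str, str] = {}  # entity name -> entity type
        self.relationships: List[Relationship] = []

    def add_entity(self, name: str, entity_type: str) -> None:
        self.entities[name] = entity_type

    def relate(self, source: str, kind: str, target: str) -> None:
        # Refuse edges to unknown entities so that no edge can dangle.
        for name in (source, target):
            if name not in self.entities:
                raise KeyError(name)
        self.relationships.append(Relationship(source, kind, target))

    def related(self, name: str) -> List[Relationship]:
        """Every edge touching the entity, whichever side it sits on."""
        return [r for r in self.relationships if name in (r.source, r.target)]


if __name__ == "__main__":
    store = CampaignStore()
    store.add_entity("Sister Maren", "NPC")
    store.add_entity("Temple of Helm", "Faction")
    store.relate("Sister Maren", "member_of", "Temple of Helm")
    print(store.related("Temple of Helm"))
```

Because the edge is validated against the entity table at write time, there is no second dataset to keep aligned by hand.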

The MCP server

An open Model Context Protocol server that any MCP-compatible AI client (Claude.ai, Claude Desktop, ChatGPT via connector, Cursor, others) can connect to. The server exposes structured queries: get_entity, get_constitution, get_narrative_state, search_campaign, and around 40 others. The AI asks structured questions and gets structured answers.
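For readers who have not seen MCP traffic, tool calls travel as JSON-RPC 2.0 requests with the method "tools/call". The sketch below builds one such request in Python. The tool name get_entity comes from the list above, but the argument names ("type", "name") are assumptions, since this article does not document the tool's parameters.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Argument names here are hypothetical, chosen only for illustration.
    print(build_tool_call(1, "get_entity", {"type": "NPC", "name": "Sister Maren"}))
```

In practice the AI client builds and sends these messages for you; the sketch only shows that the request names a tool and passes typed arguments rather than a free-text search string.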

Visibility tiers

Four levels (common knowledge, player knowledge, DM secret, inherit) so the AI only sees what you've authorized. Players get filtered views; the AI gets the layer you assign it.
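A sketch of how tier filtering with an "inherit" level might resolve. Two things here are assumptions: that "inherit" means "use the parent entity's tier", and that an entity with nothing to inherit from is treated as a DM secret. The entity names are invented.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Tier(Enum):
    INHERIT = 0
    COMMON = 1
    PLAYER = 2
    DM_SECRET = 3


# Hypothetical entities: name -> (own tier, parent entity or None).
ENTITIES: Dict[str, Tuple[Tier, Optional[str]]] = {
    "Verdantshire": (Tier.COMMON, None),
    "Temple of Helm": (Tier.INHERIT, "Verdantshire"),
    "Hidden Crypt": (Tier.DM_SECRET, "Temple of Helm"),
    "Crypt Warden": (Tier.INHERIT, "Hidden Crypt"),
}


def effective_tier(name: str) -> Tier:
    """Walk up the parent chain until a concrete tier is found."""
    tier, parent = ENTITIES[name]
    while tier is Tier.INHERIT:
        if parent is None:
            return Tier.DM_SECRET  # assumed default: fail closed
        tier, parent = ENTITIES[parent]
    return tier


def visible_to(clearance: Tier) -> List[str]:
    """Names of all entities at or below the given clearance."""
    return [n for n in ENTITIES if effective_tier(n).value <= clearance.value]


if __name__ == "__main__":
    print(visible_to(Tier.PLAYER))     # the town and the temple only
    print(visible_to(Tier.DM_SECRET))  # everything
```

The same filter can serve a player view and an AI client: each is just a caller with a clearance.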

This stack means when Claude (or ChatGPT, or any other MCP client) asks about your world, it's calling structured tool functions against your campaign's typed state. Not fuzzy-searching a wiki. Not guessing. The model can still write creatively (voices, scenes, ambient detail) but the canon underneath those scenes comes from your structured state, not the model's generic priors.

By the numbers

Measured comparison, same understanding goal.

A working campaign-state understanding (world rules, current narrative state, all entities, key relationships) takes a measurably different number of tokens depending on the underlying architecture.

The figures below come from a real Notion-plus-MCP setup measured over six months of prep, assembling that equivalent working state. Notion is the example here; any wiki-and-search tool would likely show the same inefficiency, because the bottleneck is the retrieval shape, not the product.

Measure | Wiki + AI | Grimoire + AI
Tokens to load equivalent working state | ~50,000 across 30+ fetches | ~6,800 across 5 calls
Per-fact AI guidance | Inferred from prose, if surfaced | Structured aiGuidance field
Typed relationships | Stored as prose links | First-class typed edges
Attention filtering | None | GM-curated subset ("what matters this session")
Tier loading | All-or-nothing fuzzy search | Layered: load only what the current question needs

Roughly a 7x token ratio for equivalent coverage. The ratio matters because many GMs are not paying for Claude Pro or ChatGPT Plus. Grimoire was deliberately architected to fit inside free-tier message budgets, so AI-assisted prep stays sustainable for GMs running on Claude.ai's or ChatGPT's free plan.

If your campaign manager takes 50,000 tokens of context to understand your world before answering a single question, you're going to run out of free-tier capacity by lunchtime. If it takes 6,800, you can prep an entire session on the free plan without rate-limiting yourself out.
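The arithmetic behind that comparison, as a short Python sketch. The two token figures are the ones quoted above. The 100,000-token budget is a made-up number used only to make the division concrete; it is not a published limit for any provider.

```python
# Figures from the comparison above.
WIKI_TOKENS = 50_000
STRUCTURED_TOKENS = 6_800

# Hypothetical budget, chosen only to make the arithmetic concrete.
ASSUMED_TOKEN_BUDGET = 100_000


def full_loads(budget: int, cost_per_load: int) -> int:
    """How many complete working-state loads fit inside the budget."""
    return budget // cost_per_load


ratio = WIKI_TOKENS / STRUCTURED_TOKENS

if __name__ == "__main__":
    print(f"token ratio: {ratio:.2f}x")
    print("wiki loads:", full_loads(ASSUMED_TOKEN_BUDGET, WIKI_TOKENS))
    print("structured loads:", full_loads(ASSUMED_TOKEN_BUDGET, STRUCTURED_TOKENS))
```

Under that assumed budget, the wiki setup fits 2 full loads and the structured setup fits 14, which is where the 7x figure turns into a practical difference.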

And because MCP is client-agnostic by design, the failover is trivial: hit Claude's free-tier ceiling mid-prep, close the tab, open ChatGPT, and pick up where you left off. Your campaign canon lives in Grimoire, not in the AI client. Any MCP-compatible AI can read it. End-to-end free across two free tiers, not just one.

This is the part of the architecture that doesn't show up in feature comparisons but determines whether the AI assistance actually fits your workflow.

Useful even if you don't use AI

Structure is the value. AI is a layer you can add.

A lot of GMs read this far and want to say: I don't use AI for prep, and I don't want to. Is Grimoire useful for me? Yes, for different reasons.

The same structure that makes AI queries exact also makes your own queries exact. In a typed-entity campaign manager, "every NPC in House Vale" or "every quest in the second act" or "every location my players haven't visited yet" is a query you can run in two seconds. In a text wiki, it's a search-and-pray.
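Those two example questions, sketched as plain Python filters over typed records. The records, field names, and helper functions are hypothetical stand-ins for whatever query surface a typed campaign manager exposes.

```python
from typing import Dict, List

# Hypothetical records; names other than House Vale are invented.
NPCS: List[Dict[str, str]] = [
    {"name": "Lady Irena Vale", "faction": "House Vale"},
    {"name": "Steward Bram", "faction": "House Vale"},
    {"name": "Captain Orlo", "faction": "Town Watch"},
]

LOCATIONS: List[Dict[str, object]] = [
    {"name": "Verdantshire", "visited": True},
    {"name": "Hidden Crypt", "visited": False},
]


def npcs_in_faction(faction: str) -> List[str]:
    """Every NPC whose faction field matches exactly."""
    return [npc["name"] for npc in NPCS if npc["faction"] == faction]


def unvisited_locations() -> List[str]:
    """Every location the players have not visited yet."""
    return [str(loc["name"]) for loc in LOCATIONS if not loc["visited"]]


if __name__ == "__main__":
    print(npcs_in_faction("House Vale"))
    print(unvisited_locations())
```

Neither question needs AI at all. The answers are complete by construction, because they come from a field comparison and not from whichever pages happen to mention the words.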

The visibility tier system means players can have a live view of your world filtered to what they know, without you maintaining two wikis. The Constitution-and-World-Rules layer means your homebrew rules live in one canonical place you can reference at the table without paging through Notion.

If you connect AI later, the same structure that makes the wiki useful makes the AI integration tight. If you never do, you've still built a queryable lore bible that doesn't lose threads. The structure is the value. AI is a thing you can choose to layer on.

When this matters most

A short, honest list.

The canon-drift problem hits some GMs harder than others. This page is most useful for you if:

  • You run a heavily homebrewed campaign (custom pantheon, custom geography, custom history)
  • Your campaign has been running long enough that you can't fit all your lore in a single prompt
  • You're using or considering using AI for prep
  • You've been frustrated by the model getting things wrong in ways you can't easily prevent

If you run published modules, the canon-drift problem is smaller: the model has likely seen those modules during training. If you're running a one-shot, you can paste everything you need. The structural fix matters most for long, homebrewed, AI-assisted campaigns where the canon stack is too big to keep in your head and too specific for the model to know.

Questions, answered

Canon drift, in detail.

Why does AI forget my D&D campaign?

Because most AI setups read your campaign through fuzzy text search over a wiki. The model gets ranked text excerpts, not canonical answers, so it cannot tell which page is authoritative, which fact is current, or whether an answer exists at all. When the search returns nothing clean, the model invents an answer in the same confident voice it uses when it actually knows. It is an architectural problem, not a model-quality problem.

Does a bigger context window fix this?

No. A larger context window lets you paste more lore, but long campaigns still exceed it, and a bigger context actually makes retrieval worse by giving the model more text to wade through and more conflicting paragraphs to guess between. The fix is changing the retrieval shape, not the size of the window.

What is the architectural fix to AI canon drift?

Let the model ask structured questions of typed data instead of fuzzy-searching a text blob. In a structured campaign state, "who holds the war domain" is a typed query that returns a literal answer (Helm), and "who is the high priestess of the Temple of Helm" returns the canonical NPC or a clean "no such NPC exists." A Constitution and World Rules layer marks what is authoritative, so the model is grounded before it answers anything.

Do I need to use AI to benefit from Grimoire?

No. The same structure that makes AI queries exact also makes your own queries exact: "every NPC in House Vale" or "every location my players have not visited" is a two-second query instead of a search-and-pray. Visibility tiers give players a filtered live view without a second wiki. If you connect AI later, the structure makes the integration tight. If you never do, you still have a queryable lore bible that does not lose threads.

Does using a smarter AI model fix it?

No. Handed a fuzzy-search-over-text interface, GPT-5, Claude Opus, and Gemini Ultra all behave the same way. The intelligence of the model does not change the shape of the retrieval. A genius searching a haystack is still searching a haystack. The fix is to stop searching text and start querying typed data.

Can a knowledge graph fix AI canon drift?

Partly, then it backfires. An open-source knowledge-graph MCP lets the model walk typed edges instead of fuzzy-searching text, which is a real improvement, but it becomes a second source of truth you sync with your wiki by hand. Once they drift apart, the model reasons confidently over stale data, which is worse than reasoning fuzzily over current data. The durable fix is a single structured source of truth the AI queries directly.

The canon-drift problem is real. So is the fix.

Grimoire is one implementation of that fix. Free to start, bring your own AI client.