There’s a lot of buzz around the Model Context Protocol (MCP) right now — and for good reason. It represents a shift from traditional static architectures like retrieval-augmented generation (RAG) toward something more dynamic and reactive. While RAG has served us well in grounding LLMs with factual context, it’s time to rethink how models interact with knowledge, especially in environments where truth itself is in motion.
What if, instead of front-loading the context and hoping the model gets it right, we let the model decide when something needs to be looked up? Even better: what if the model noticed when something it believed was no longer true?
Welcome to the emerging world of MCP and "truth drift".
RAG is a Batch Operation
Most RAG systems today look like this:
User → Embed → Vector Search → Context Injection → Prompt → LLM Response
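In code, the whole flow might look something like the sketch below, where embed(), vector_search(), and llm() are hypothetical stand-ins for your embedding model, vector store, and LLM client:

def answer(question: str) -> str:
    # Retrieve once, up front: the context is frozen before generation starts.
    query_vec = embed(question)
    docs = vector_search(query_vec, top_k=5)
    prompt = "\n".join(docs) + "\n\nQuestion: " + question
    # Generate: if the world changes mid-response, nothing here can react.
    return llm(prompt)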
That works fine when:
- You know what to ask up front
- The knowledge base is stable
- Latency isn’t critical
But real-time systems don’t work like that. News changes. Markets fluctuate. Systems emit telemetry. Once generation begins, RAG lacks the ability to adapt.
It’s a static mechanism in a dynamic environment.
Enter MCP: Model Context Protocol
MCP is a new pattern that wraps LLMs in an event-driven shell. Models stream responses, emit tool calls mid-thought, wait for results, and resume reasoning. Like this:
User → LLM (streaming...) → [Tool Call: lookup_entity] → Router → Vector Search → Return → Resume → Final Answer
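Sketched as an event loop, assuming a hypothetical stream_llm() client that yields text chunks and tool-call events, with a run_tool() router on the other side:

def mcp_loop(messages: list) -> str:
    chunks = []
    while True:
        for event in stream_llm(messages):  # hypothetical streaming client
            if event.type == "tool_call":
                # Pause generation, route the call (e.g. lookup_entity -> vector search),
                # feed the result back, and let the model resume reasoning.
                result = run_tool(event.name, event.args)
                messages.append({"role": "tool", "content": result})
                break
            chunks.append(event.text)  # ordinary streamed text
        else:
            return "".join(chunks)  # stream ended with no pending tool call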
It’s RAG, but reactive. It’s tools, but event-driven. It’s agentic, but with structure.
And because it’s streaming and aware, MCP makes it possible to track truth as it changes.
What is Truth Drift?
Truth drift is what happens when a belief held by the model—explicitly or implicitly—starts to diverge from current reality.
- A company thought to be "financially healthy" suddenly posts record losses
- A peaceful region becomes a hotspot
- A person listed as CEO is replaced
The problem isn’t that the LLM doesn’t know. It’s that its understanding hasn’t been updated.
Traditionally, you'd need to re-run a query. But with an MCP-style architecture, you can watch for drift:
- Define a hypothesis (e.g. "Company X is financially stable")
- Feed a real-time stream (e.g. news headlines)
- Use vector or LLM judgment to determine if that belief is still valid
- Trigger a tool or response only when the truth materially shifts
Now the model isn’t just responding. It’s watching.
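Put together, the watch loop is small. In the sketch below, headline_stream(), drift_check(), and on_drift() are hypothetical; drift_check() can be either of the two judgment strategies covered in the next sections.

HYPOTHESIS = "Company X is financially stable"

def watch(stream):
    for headline in stream:  # e.g. a real-time news feed
        # drift_check() returns True while the belief still appears to hold.
        if not drift_check(HYPOTHESIS, headline):
            on_drift(HYPOTHESIS, headline)  # update the store, alert, or call a tool

watch(headline_stream())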
Tracking Truth Drift with Vector Distance
You can measure semantic drift quantitatively using vector distance. By embedding a baseline hypothesis — for instance, "Company X is financially stable" — you checkpoint its position in vector space. Then, as new information streams in, such as real-time news updates, you embed each new snippet and compare it against the baseline.
A simple cosine distance calculation (one minus cosine similarity) gives you a measure of how far the meaning has shifted:
from scipy.spatial.distance import cosine

# embed() is assumed: any function mapping text to a vector with your
# preferred embedding model. trigger_alert() is your own handler.
baseline = embed("Company X is financially stable")
latest = embed("Company X shares fall 40% after earnings miss")

# scipy's cosine() returns cosine distance (1 - similarity),
# so larger values mean the meaning has drifted further.
distance = cosine(baseline, latest)
if distance > 0.3:  # the threshold is a tunable choice
    trigger_alert()
This approach is lightweight, avoids full context reloads, and enables continuous monitoring. You can even maintain a rolling average or use time-weighted embeddings to manage volatility over time.
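As one sketch of that smoothing idea, you could keep an exponentially weighted moving average of the distance, so a single noisy headline can't fire the alert on its own (the alpha and threshold defaults here are arbitrary, and embed() is assumed as before):

from scipy.spatial.distance import cosine

class DriftMonitor:
    def __init__(self, baseline_text: str, alpha: float = 0.2, threshold: float = 0.3):
        self.baseline = embed(baseline_text)
        self.alpha = alpha          # smoothing factor: higher reacts faster
        self.threshold = threshold  # smoothed distance that counts as drift
        self.ewma = 0.0

    def update(self, snippet: str) -> bool:
        distance = cosine(self.baseline, embed(snippet))
        # The moving average damps one-off spikes in any single snippet.
        self.ewma = self.alpha * distance + (1 - self.alpha) * self.ewma
        return self.ewma > self.threshold  # True once drift is sustained

Each incoming snippet then goes through update(), and only sustained movement away from the baseline crosses the threshold.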
A Model That Reflects on Truth
Here’s where it gets interesting. Rather than computing distances, you can also ask the model directly:
“Given the following new information, has this statement materially changed in truth?”
This isn’t embedding math. It’s meta-cognition.
The model uses its latent associations to check whether its beliefs hold up. It’s not just generating text — it’s performing a semantic integrity check.
And in systems like LangGraph, you can route based on that result:
- True → no action
- False → update vector store, trigger alert, call new tool
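Here is a sketch of that routing with LangGraph conditional edges (exact APIs shift between versions, and ask_llm() and trigger_alert() are hypothetical helpers):

from typing import TypedDict
from langgraph.graph import StateGraph, END

class TruthState(TypedDict):
    hypothesis: str
    evidence: str
    still_true: bool

def check_truth(state: TruthState) -> dict:
    # The meta-cognition step: ask the model whether the belief survives.
    verdict = ask_llm(
        f"Given this new information:\n{state['evidence']}\n\n"
        f"Is the following statement still materially true?\n{state['hypothesis']}\n"
        "Answer TRUE or FALSE."
    )
    return {"still_true": verdict.strip().upper().startswith("TRUE")}

def handle_drift(state: TruthState) -> dict:
    trigger_alert()  # update the vector store, notify, or call a new tool
    return {"still_true": False}

graph = StateGraph(TruthState)
graph.add_node("check_truth", check_truth)
graph.add_node("handle_drift", handle_drift)
graph.set_entry_point("check_truth")
graph.add_conditional_edges(
    "check_truth",
    lambda s: "stable" if s["still_true"] else "drift",
    {"stable": END, "drift": "handle_drift"},
)
graph.add_edge("handle_drift", END)
app = graph.compile()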
Truth becomes state. Drift becomes a signal.
Truth-Aware Architectures Are Coming
Real-time systems need more than prompt stuffing and pre-queries. They need models that:
- Monitor the world
- Judge semantic change
- Trigger actions
- Evolve statefully
MCP makes that possible by enabling models to interact with dynamic information sources over time, rather than relying on static context baked in at the beginning of generation. This architectural shift turns models from passive responders into active monitors — capable of detecting change, reasoning about it, and taking action.
Where RAG integrates retrieval, MCP introduces a full protocol for communication between the model and its tools. It gives the model contextual awareness, the ability to evaluate when prior assumptions no longer hold, and the means to respond with appropriate tools or strategies. This elevates the model from fact-fetcher to reasoning agent, capable of managing evolving truth states over long-lived sessions.