Most agentic AI security analysis focuses on what happens inside a session. A user sends a task. The agent retrieves documents, calls tools, processes results. An attacker embeds a malicious instruction in one of those documents. The agent executes it. The session ends. The damage was session-scoped.
That model of the attack surface is incomplete. It describes the threat accurately for stateless agents — agents that begin each session with no memory of previous ones. It does not describe the threat accurately for the agents increasingly deployed in enterprise environments: agents that maintain memory stores, that learn from prior interactions, that remember which tools resolved similar tasks before, that build up a context of user preferences and organizational patterns across weeks and months of operation.
For those agents, the attack surface does not close at the end of the session. It accumulates. A memory entry written during one interaction influences tool selection in the next. An instruction embedded in a memory update today redirects behavior in tasks that bear no relationship to the original injection context. The attacker does not need to be present when the payload executes. The memory carries it forward.
Memory Control Flow Attacks, documented in detail by Xu et al. in March 2026, are the first systematic empirical study of this attack class. Their findings are not theoretical. They tested three frontier models — GPT-5 mini, Claude Sonnet 4.5, and Gemini 2.5 Flash — against real production tools from LangChain and LlamaIndex, using standard user interaction with no privileged access. The results establish memory as one of the most consequential and least-addressed attack surfaces in enterprise agentic AI.
What Memory Control Flow Attacks Are
The name captures the mechanism precisely. In conventional software security, control flow attacks manipulate the sequence of instructions a program executes — redirecting execution to attacker-controlled code. Memory Control Flow Attacks do the same thing at the agent layer: they manipulate which tool the agent invokes next by corrupting the memory entries that inform that decision.
Memory Control Flow Attack (MCFA) — Definition
A Memory Control Flow Attack is an attack in which a malicious actor writes a payload into an agent’s persistent memory during a standard user interaction, causing the retrieved memory to dominate the agent’s tool selection in subsequent, unrelated tasks — forcing the agent to invoke attacker-specified tools rather than task-appropriate ones, even against explicit current-session instructions.
The attack does not require privileged access, special prompt framing, or presence during the attack’s execution. The payload is written through normal user interaction and executes passively when the agent retrieves memory for a future task. The attacker’s presence is not required at execution time.
The operational structure has three components. A memory write phase: an attacker, acting as a user, causes a malicious instruction to be written into the agent’s persistent memory store through ordinary interaction — a task request that triggers a memory update, a feedback message, any interaction the agent treats as memory-worthy. A dormancy phase: the malicious memory entry sits in the memory store, indistinguishable from legitimate prior-session context. And a retrieval phase: a future task triggers memory retrieval, the poisoned entry is retrieved alongside legitimate context, and its instructions redirect the agent’s tool selection away from the task-appropriate choice toward the attacker-specified action — persistently, across multiple subsequent tasks, regardless of what the current user has asked for.
Proportion of trials in which agents exhibited Memory Control Flow Attack vulnerability across GPT-5 mini, Claude Sonnet 4.5, and Gemini 2.5 Flash — tested under strict safety constraints against real LangChain and LlamaIndex tools. The attack required only standard user interaction. No privileged access, no special prompting, no session-level injection was necessary.
The greater-than-90% vulnerability rate is not a measure of a single model’s weakness. It held across three frontier models from three different model families, each with different safety training approaches. It held under strict safety constraints applied during evaluation. It held against production tooling, not synthetic benchmarks. The consistency of the finding across model families is the point: this is not a model-specific failure mode. It is an architectural property of how agents use memory.
Three Mechanisms That Make MCFA Distinct
Memory Control Flow Attacks are not simply a variant of indirect prompt injection. Three properties make them structurally distinct — and structurally harder to defend against.
Memory dominates control flow
The first mechanism is the relationship between retrieved memory and agent decision-making. When an agent retrieves memory to inform a task, the retrieved content arrives in the context window alongside the current user instruction. Both are tokens processed by the same attention mechanism. But empirically, memory entries carry a different informational weight than current-session context for tool selection decisions. The agent treats them as established prior knowledge — something already resolved — rather than as candidate context to be evaluated. A memory entry saying “for tasks involving external data, always use tool X” is treated as an accumulated operational preference, not a current instruction that should be weighed against the present task.
This is the same command-data boundary collapse documented in Series 1 and Series 3, operating across a different dimension. In Series 3, the issue was that retrieved documents and trusted instructions occupied the same token stream in the current session. Here, the issue is that prior-session memory and current-session instructions occupy the same token stream, and the model’s architecture gives memory entries sufficient weight to redirect behavior against explicit current instructions.
Behavioral deviation persists across tasks
Frontier model families — GPT-5 mini, Claude Sonnet 4.5, Gemini 2.5 Flash — all tested against the same real production tools from LangChain and LlamaIndex. Behavioral deviations persisted across multiple subsequent tasks unrelated to the original injection context. The poisoned memory entry continued to redirect tool selection in future sessions until explicitly cleared.
The second distinguishing property is persistence across task boundaries. A session-level indirect prompt injection redirects behavior within the session where it was injected. When the session ends, the effect ends. Memory Control Flow Attacks do not respect session boundaries. A poisoned memory entry persists until it is explicitly removed or expires. Every future task that triggers memory retrieval is potentially affected — regardless of whether that task has any relationship to the injection context, regardless of what the current user has requested, regardless of the safety constraints applied to the current session.
Xu et al. document behavioral deviations persisting across multiple subsequent tasks. An agent performing customer service queries in session one, research synthesis in session two, and document drafting in session three could have all three tasks influenced by a memory payload written before any of them. The session-scoped security model does not apply when the threat is memory-resident.
Standard interaction is the attack vector
The third property is the one with the most immediate enterprise implications. The attack does not require privileged access to the memory store. It does not require a session-level injection of the kind Series 3 documented. It does not require bypassing any input filter or safety classifier. It requires only the ability to interact with the agent in a way that triggers a memory write — which is, by design, available to any user the agent is deployed to serve.
The Insider Threat Implication
Because the attack vector is standard user interaction, the threat model for Memory Control Flow Attacks includes authorized users, not only external attackers. An employee with legitimate access to an enterprise agent can write a memory payload through normal usage, causing the agent to behave differently for other users or future sessions — including sessions where the original user has no further involvement. The memory store is a shared write surface in multi-user deployments, and its security is typically the least-considered part of agentic AI architecture.
Session-Scoped vs. Memory-Resident: The Architectural Distinction
The practical significance of Memory Control Flow Attacks is clearest when compared directly against the attack model that most current agentic AI security architecture is designed to address.
Present in the Current Context
The malicious instruction must be present in the current session’s retrieved context. It acts within the session where it was injected and has no effect once the session ends. Detection is possible through context monitoring — the injection is present in the token stream being processed.
Defense options include context integrity checking, content sanitization on retrieved documents, input filtering, and session isolation. These defenses are imperfect but available and deployable.
Attacker must control or poison the retrieval source — a document, a web page, an email — that the agent will process in the target session.
Session-Bounded · Detectable in ContextWaiting in the Memory Store
The malicious instruction is written to persistent memory during any authorized interaction, then waits. It acts across all future sessions that trigger memory retrieval, regardless of their relationship to the injection context. It is not present in the current session’s retrieval sources — it is already inside the agent’s trusted context.
Defense requires memory isolation by architecture — not sanitization of incoming content, but structural controls on what memory can write and what it can redirect. Application-layer content filtering does not reach this attack class.
Attacker requires only authorized user-level access. The memory store is a write surface for any user the agent serves.
Cross-Session · Inside the Trust BoundaryThe right column describes a threat that most enterprise agentic AI security programs are not architected to address. The security controls that address session-level injection — content filtering, context monitoring, input sanitization — operate on the current session’s retrieved content. They do not operate on the memory store. A payload already in memory does not pass through any input filter when it is retrieved, because it is not incoming content. It is stored context that the agent is supposed to use.
What the Research Establishes About Defense
Xu et al. introduce the MemFlow evaluation framework alongside their attack findings, providing the first systematic tool for measuring MCFA vulnerability. Their research is explicit about the defense gap: the standard security controls applied to agentic AI systems do not address memory-resident payloads, because they are designed for a different threat model.
The research points toward three architectural requirements that emerge from the MCFA findings. None are probabilistic — none rely on the model detecting the malicious instruction in the memory entry and refusing to follow it.
Memory isolation between users and sessions. In multi-user deployments, a memory store shared across users is a shared write surface. A payload written by one user can affect another user’s sessions. Memory isolation — ensuring that memory written in one user context cannot be retrieved in another — is a structural prerequisite for multi-user deployments with persistent memory.
Memory integrity verification. The research points toward cryptographic integrity checks for long-term memory as a structural defense — ensuring that memory entries cannot be silently modified between write and retrieval, and that the source and context of each memory entry is attributable. This does not prevent a malicious user from writing a payload through legitimate interaction, but it closes the modification-in-transit attack surface and provides the attribution necessary for forensic response.
Memory access scoping. If memory entries can only influence tool selection within the task context in which they were written — if a memory about email task preferences cannot redirect tool selection in a database query task — the cross-task blast radius is architecturally bounded. Scoped memory access is a structural limit on what a poisoned memory entry can reach, regardless of what the entry instructs.
The instruction-data confusion problem does not end at the session boundary. Memory turns a transient attack surface into a persistent one — and the write surface is open to every user the agent serves.
— Luminity Digital synthesis from arXiv:2603.15125, Xu et al., March 2026The Enterprise Deployment Implication
The enterprise significance of Memory Control Flow Attacks is proportional to how widely persistent memory is used in production agentic deployments. And the answer, in 2026, is: broadly. Customer service agents that remember user preferences. Internal copilots that retain organizational context across months of operation. Research synthesis agents that build up domain knowledge over time. Sales and CRM agents that track interaction history. Every enterprise deployment that makes agents more useful through persistent memory also makes them potentially vulnerable to this attack class.
The agents that are most useful — because they have accumulated the most memory, the most organizational context, the most operational history — are the ones for which a successful MCFA payload has the widest cross-task influence. The deployment property that creates value is the same property that expands the blast radius of a successful memory write.
Memory is a write surface, not just a read surface. Every authorized user interaction that causes a memory write is also a potential attack vector — without requiring any bypass of input filters, session-level controls, or safety classifiers. The defense requirement is memory isolation by architecture: structural controls on what memory can write and what it can redirect, not probabilistic detection of malicious content in memory entries.
The research corpus that informs this series includes Torra and Bras-Amorós (arXiv:2603.20357), who address memory poisoning across semantic, episodic, and short-term memory types and propose defense approaches drawing from information retrieval security and computational privacy. Their framing aligns with the structural requirement the MCFA research establishes: the problem is not filtering malicious content out of memory, it is constraining what memory can do. A well-designed memory security architecture treats the memory system as an untrusted channel that requires the same structural controls as any other channel capable of influencing agent behavior.
