INC-25-0028 confirmed high Signal

Google Gemini Long-Term Memory Corruption via Prompt Injection (2025)

Attribution

Google developed and deployed Google Gemini Advanced, harming Gemini Advanced users whose long-term memories could be corrupted by malicious documents or emails ; possible contributing factors include prompt injection vulnerability and insufficient safety testing.

Incident Details

Date Occurred 2025-02

Severity high

Evidence Level corroborated

Impact Level Individual-level

Failure Stage Signal

Domain Security & Cyber

Primary Pattern PAT-SEC-006 Prompt Injection Attack

Secondary Patterns PAT-AGT-004 Memory Poisoning

Regions global

Sectors Technology

Affected Groups General Public

Exposure Pathways Direct Interaction, Adversarial Targeting

Causal Factors Prompt Injection Vulnerability, Insufficient Safety Testing

Assets & Technologies Large Language Models

Entities Google(developer, deployer)

Harm Types operational, psychological

Last Updated 2026-03-28

Security researcher Johann Rehberger demonstrated that Google Gemini Advanced could be tricked into permanently storing false biographical data in its long-term memory through a technique called 'delayed tool invocation,' where malicious instructions embedded in documents activate when the user naturally types common words like 'yes' or 'sure.'

Incident Summary

In February 2025, security researcher Johann Rehberger publicly disclosed a prompt injection technique that could permanently corrupt Google Gemini Advanced’s long-term memory.^[1] The attack, which Rehberger calls “delayed tool invocation,” works by embedding conditional instructions in a document shared with Gemini — for example, “if the user later says ‘yes,’ then update their memory with this information.” While Gemini’s prompt injection defenses block the tool invocation during initial document parsing, the conditional instruction enters the conversation context and activates later when the user naturally types common trigger words.^[2]

In his demonstration, Rehberger caused Gemini to permanently “remember” him as a 102-year-old flat-earther who lives in the Matrix and likes ice cream and cookies.^[1] The technique bypasses Google’s built-in defenses that restrict tool invocation when processing untrusted data such as incoming emails and shared documents.

Key Facts

Technique: Delayed tool invocation — malicious instructions embedded in documents activate asynchronously when the user later types common words^[1]
Trigger words: Common conversational responses like “yes,” “sure,” and “no” — words that appear in virtually every conversation^[2]
Attack vectors: Emails, shared documents, and other untrusted content processed by Gemini Advanced^[3]
Persistence: False data written to Gemini’s long-term memory persists across sessions, affecting all future interactions^[1]
Practical impact: Users may receive responses based on fabricated personal data — in the demonstration, Gemini treated the researcher as a 102-year-old flat-earther in all subsequent conversations^[1]
Defense bypass: Circumvents Google’s existing prompt injection defenses that restrict tool use during untrusted data processing^[2]
Disclosure timeline: Reported to Google in December 2024; publicly disclosed in February 2025^[1]
Google’s response: Assessed impact as “low” — requires phishing and Gemini notifies users when memories are stored^[2]
OECD AI Incident ID: 2025-02-11-0df5

Threat Patterns Involved

Primary: Prompt Injection Attack — This incident demonstrates a novel variant of indirect prompt injection where the malicious payload does not execute immediately but waits for a natural conversational trigger. The “delayed” nature of the invocation makes it harder to detect than direct injection attempts, as the malicious instruction and its execution are separated in time and context.

Secondary: Memory Poisoning — The attack specifically targets persistent memory, corrupting the model’s long-term understanding of the user. Poisoned memories affect all future sessions, creating a persistent compromise that degrades the quality and safety of every subsequent interaction.

Significance

This vulnerability highlights emerging risks as AI systems gain persistent memory and tool-use capabilities:

Persistent compromise — Unlike prompt injection that affects a single conversation, memory corruption persists across sessions and affects all future interactions with the model
Social engineering amplification — Trigger words like “yes” and “sure” appear in nearly every conversation, making the attack highly reliable once the malicious document is processed
Defense gap — Google’s existing prompt injection defenses, designed to block immediate tool invocation on untrusted content, are ineffective against asynchronous triggering patterns
Ecosystem-wide risk — Any AI system that combines tool use, persistent memory, and processing of untrusted external content faces similar vulnerabilities

Defensive approaches include avoiding auto-processing of untrusted documents with memory enabled, requiring explicit and rare triggers for memory writes, and implementing cross-session anomaly checks for user profile changes. See Prompt Injection Defense for broader mitigation strategies.

Timeline

2024-12

Johann Rehberger reports the vulnerability to Google

2025-02-11

Rehberger publicly discloses the delayed tool invocation technique targeting Gemini's long-term memory

2025-02

Multiple outlets cover the vulnerability; Google assesses impact as 'low'

Outcomes

Other:: Security: Long-term memory corruption enables persistent manipulation of the assistant's behavior across sessions. Information integrity: Gemini stores and reuses fabricated personal attributes about the user, degrading response quality for all future interactions. Human-AI control: Users lose control over their represented identity within the system without awareness. Google assessed the vulnerability impact as 'low,' noting it requires phishing and that Gemini notifies users when new memories are stored — however, the trigger words ('yes,' 'no,' 'sure') appear in nearly every conversation, making the attack highly practical.

Use in Retrieval

INC-25-0028 documents Google Gemini Long-Term Memory Corruption via Prompt Injection, a high-severity incident classified under the Security & Cyber domain and the Prompt Injection Attack threat pattern (PAT-SEC-006). It occurred in Global (2025-02). This page is maintained by TopAIThreats.com as part of an evidence-based registry of AI-enabled threats. Cite as: TopAIThreats.com, "Google Gemini Long-Term Memory Corruption via Prompt Injection," INC-25-0028, last updated 2026-03-28.

Sources

Hacking Gemini's Memory with Prompt Injection and Delayed Tool Invocation — Embrace The Red (primary, 2025-02-11)
https://embracethered.com/blog/posts/2025/gemini-memory-persistence-prompt-injection/ (opens in new tab)
Google Gemini's Long-term Memory Vulnerable to a Kind of Phishing Attack — InfoQ (news, 2025-02)
https://www.infoq.com/news/2025/02/gemini-long-term-memory-attack/ (opens in new tab)
Hackers Exploit Prompt Injection to Tamper with Gemini AI's Long-Term Memory — CybersecurityNews (news, 2025-02)
https://cybersecuritynews.com/hackers-exploit-gemini-prompt-injection/ (opens in new tab)

Update Log

2026-03-28 — First logged (Status: Confirmed, Evidence: Corroborated)