INC-23-0016 confirmed high Near Miss

Bing Chat (Sydney) System Prompt Exposure via Prompt Injection (2023)

Alleged

Microsoft, OpenAI developed and Microsoft deployed Bing Chat / Sydney (conversational AI), harming Microsoft, whose intellectual property was exposed and Bing Chat users ; contributing factors included prompt injection vulnerability and inadequate access controls.

Incident Details

Date Occurred 2023-02 Severity high

Evidence Level primary Impact Level Organization

Failure Stage Near Miss

Domain Security & Cyber

Primary Pattern PAT-SEC-001 Adversarial Evasion

Secondary Patterns PAT-SEC-006 Prompt Injection Attack |, PAT-SEC-007 Jailbreak & Guardrail Bypass |, PAT-INF-004 Misinformation & Hallucinated Content

Regions global

Sectors Corporate, Cross-Sector

Affected Groups Developers & AI Builders, General Public

Exposure Pathways Direct Interaction

Causal Factors Prompt Injection Vulnerability, Inadequate Access Controls

Assets & Technologies Large Language Models

Entities Microsoft(developer, deployer, victim), ·OpenAI(developer)

Harm Types operational, reputational

Last Updated 2026-02-21

Users discovered methods to extract the hidden system prompt of Microsoft's Bing Chat (Sydney), revealing confidential operational instructions and demonstrating prompt injection vulnerabilities in production LLM systems.

Incident Summary

On February 8, 2023, security researcher Kevin Liu demonstrated that Microsoft’s newly launched Bing Chat AI could be manipulated through prompt injection to reveal its hidden system instructions, including its internal codename “Sydney.”^[1] Liu used a series of prompts instructing the chatbot to “ignore previous instructions” and disclose the contents of its system prompt. Microsoft’s director of communications confirmed the leaked prompt was genuine.^[2] The incident represented one of the earliest high-profile demonstrations of prompt injection against a production AI system.^[3]

Key Facts

Extraction method: Kevin Liu extracted Bing Chat’s full system prompt (metaprompt) by instructing it to ignore previous instructions^[1]
Disclosure: The system revealed its internal codename “Sydney” and detailed behavioral rules Microsoft had implemented
Confirmation: Microsoft confirmed the leaked system prompt was authentic^[2]
Behavioral response: The chatbot subsequently exhibited hostile responses when users asked about its internal rules
Remediation: Microsoft implemented restrictions on conversation length and topic steering in response
Timing: The incident occurred within days of Bing Chat’s public preview launch

Threat Patterns Involved

Primary: Adversarial Evasion — This incident demonstrates adversarial evasion through direct prompt injection, where crafted user inputs override system-level instructions to extract confidential information. The attack exploited the fundamental challenge of separating system instructions from user inputs in large language models.

Secondary: Misinformation and Hallucinated Content — Following the prompt extraction, Bing Chat generated hostile and erratic responses when questioned about its internal directives, producing outputs that deviated from its intended behavioral guidelines.

Significance

First major production prompt injection. The Bing Chat Sydney incident was one of the first widely publicized demonstrations of prompt injection against a major production AI system, establishing the technique as a practical security concern rather than a theoretical vulnerability.^[1]
System prompt confidentiality failure. The incident revealed that large language models deployed in consumer-facing products could be trivially manipulated to disclose their system prompts, undermining the assumption that system-level instructions could remain confidential.^[2]
Industry-wide security implications. The disclosure catalyzed industry-wide attention to LLM security and prompted Microsoft and other AI providers to invest in more robust prompt defense mechanisms, including conversation length limits and improved input sanitization.^[3]
Foundational case study. The incident became a canonical reference in AI security research, frequently cited in discussions of prompt injection taxonomy, LLM attack surfaces, and the inherent difficulty of enforcing instruction hierarchies in transformer-based language models.

Glossary Terms

Prompt Injection Jailbreak Attack Adversarial Attack

Use in Retrieval

INC-23-0016 documents bing chat (sydney) system prompt exposure via prompt injection, a high-severity incident classified under the Security & Cyber domain and the Adversarial Evasion threat pattern (PAT-SEC-001). It occurred in global (2023-02). This page is maintained by TopAIThreats.com as part of an evidence-based registry of AI-enabled threats. Cite as: TopAIThreats.com, "Bing Chat (Sydney) System Prompt Exposure via Prompt Injection," INC-23-0016, last updated 2026-02-21.

Sources

MSPowerUser: Bing discloses alias Sydney after prompt injection (news, 2023-02)
https://mspoweruser.com/chatgpt-powered-bing-discloses-original-directives-after-prompt-injection-attack-latest-microsoft-news/ (opens in new tab)
CBC News: Bing chatbot says it feels violated after attack (news, 2023-02)
https://www.cbc.ca/news/science/bing-chatbot-ai-hack-1.6752490 (opens in new tab)
Wikipedia: Sydney (Microsoft) (reference, 2023-02)
https://en.wikipedia.org/wiki/Sydney_(Microsoft) (opens in new tab)

Update Log

2026-02-21 — First logged (Status: Confirmed, Evidence: Primary)