How-To Guide

AI Security Best Practices: How to Secure LLM Applications

Ten security best practices for LLM applications, mapped to OWASP LLM Top 10. Covers model layer, application layer, data layer, and agentic AI security—including a scannable implementation checklist.

Last updated: 2026-03-15

Who this is for: Security engineers and application developers building or operating LLM-based applications. Assumes familiarity with web application security fundamentals.

Ten best practices secure LLM applications: (1) assess risks before building, (2) secure the model layer with input controls, (3) secure the application layer with authentication and output handling, (4) secure the data layer with training and RAG integrity, (5) secure agentic AI with minimal permissions, (6) validate outputs before use, (7) implement content safety controls, (8) monitor for anomalies, (9) manage the AI supply chain, and (10) maintain a security testing cadence. Together they span four layers (model, application, data, and agentic), map to the OWASP Top 10 for LLM Applications (LLM01–LLM10), and align with the NIST AI Risk Management Framework Measure and Manage functions.

1. Assess Before You Build

AI-specific risk assessment differs from standard application security review. Before building, document:

  • Threat surface — every input channel (user messages, retrieved documents, tool outputs, agent-to-agent messages) and every output channel (API responses, tool calls, stored data)
  • Data sensitivity — what data the model will access, process, or generate; whether it includes PII, financial data, or intellectual property
  • Agentic capabilities — whether the system takes autonomous actions; what the blast radius of a compromised agent would be
  • Regulatory scope — whether the application qualifies as high-risk under the EU AI Act, HIPAA, FINRA, or other applicable frameworks

The risk assessment determines which of the controls below are mandatory versus optional. A read-only chatbot and a tool-using autonomous agent require fundamentally different security postures.

2. Secure the Model Layer

The model layer is where LLM-specific vulnerabilities live. Key controls:

Prompt architecture: Separate system instructions from user-provided content using provider-supported instruction layers (OpenAI system/user/assistant roles, Anthropic system prompt). Higher-privilege instructions should constrain lower-privilege ones. Never concatenate user input directly into system-level instructions.

System prompt protection: Treat system prompts as sensitive configuration. Do not expose them in client-side code, API responses, or error messages. Instruct the model not to reveal system prompt contents, while acknowledging this is a prompting aid rather than a hard security control.

Input sanitization: Apply length limits (2,000–3,000 tokens for user messages as a starting point), encoding normalization (unicode, base64), and heuristic pattern filtering for known injection phrases. These are supplementary controls—they raise the cost of unsophisticated attacks but do not prevent targeted ones. Full guidance: How to Prevent Prompt Injection.
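A sanitizer implementing these three supplementary controls might look like the sketch below. The 3,000-token cap is approximated as roughly four characters per token, and the injection-phrase list is illustrative, not exhaustive.

```python
import re
import unicodedata

# Supplementary input sanitization: length cap, unicode normalization,
# and heuristic filtering for known injection phrases. These raise the
# cost of unsophisticated attacks; they do not stop targeted ones.

MAX_CHARS = 3000 * 4  # ~3,000 tokens at ~4 chars/token (assumption)
INJECTION_PATTERNS = re.compile(
    r"ignore (all )?(previous|prior) instructions|you are now|system prompt",
    re.IGNORECASE,
)

def sanitize_input(text: str) -> tuple[str, list[str]]:
    """Return (normalized text, list of flags) for a user message."""
    flags: list[str] = []
    # NFKC folds lookalike and width-variant characters to canonical forms.
    text = unicodedata.normalize("NFKC", text)
    if len(text) > MAX_CHARS:
        text = text[:MAX_CHARS]
        flags.append("truncated")
    if INJECTION_PATTERNS.search(text):
        flags.append("injection_heuristic")  # log and review; don't rely on it
    return text, flags
```

Flags are returned rather than used to hard-block, so downstream policy can decide whether to reject, route to review, or simply record the signal for the monitoring described in section 8.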

Model selection: Prefer models with documented safety evaluations and red team history for your use case. Avoid exposing base or minimally aligned models (pretrained, or instruction-tuned but not safety-tuned) to untrusted users.

OWASP coverage: LLM01 Prompt Injection, LLM07 System Prompt Leakage

3. Secure the Application Layer

Standard application security controls apply to LLM applications, with AI-specific additions:

Authentication and authorization: Apply the same authentication controls as any sensitive API. LLM endpoints that accept user input without authentication are open to abuse at scale (spam, prompt injection probing, resource exhaustion).

Rate limiting: Enforce per-user and per-IP rate limits on model inference endpoints. LLMs are computationally expensive; unrestricted access enables denial-of-service and lets attackers run high-volume injection probing.

Output handling: Never inject raw model output into HTML without escaping (XSS), SQL without parameterization (SQL injection), or shell commands without validation (command injection). Model outputs are untrusted strings—apply the same handling as you would to any user-supplied content.
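The HTML and SQL cases can be shown concretely with the standard library. sqlite3 is used only to keep the example self-contained; the same parameter-binding rule applies to any database driver.

```python
import html
import sqlite3

# Model output is an untrusted string: escape before HTML
# interpolation, bind as a parameter before SQL. The payload below is
# a contrived example combining an XSS and an injection attempt.

model_output = "<img src=x onerror=alert(1)>'; DROP TABLE notes;--"

# HTML context: escape, never interpolate raw.
safe_html = f"<p>{html.escape(model_output)}</p>"
assert "<img" not in safe_html

# SQL context: ? placeholder binding, never string formatting.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")
conn.execute("INSERT INTO notes (body) VALUES (?)", (model_output,))
assert conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0] == 1
```

The payload is stored verbatim as inert data: parameter binding keeps it out of the SQL grammar, and escaping keeps it out of the HTML parse tree.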

Error handling: Do not expose raw model error messages, stack traces, or system prompt fragments in API error responses. Map internal errors to generic client-facing messages.

OWASP coverage: LLM05 Improper Output Handling, LLM10 Unbounded Consumption

4. Secure the Data Layer

Data that enters model training or retrieval pipelines is as much an attack surface as user input:

Training data integrity: Validate training datasets for anomalous patterns, label poisoning, and backdoor triggers before fine-tuning. Maintain provenance records for training data sources. Third-party datasets should be treated as potentially adversarial until verified.

RAG pipeline security: Apply injection scanning to documents at the indexing stage, not only at query time. Enforce per-chunk size limits. Scan retrieved content for instruction-like patterns (phrases that attempt to direct model behavior) before it enters the model context. Implement tenant-scoped retrieval at the database level—row-level security in the vector store, not only application-layer filtering.
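The indexing-time screening step might look like the sketch below. The chunk-size threshold and instruction-like patterns are illustrative assumptions, not calibrated values.

```python
import re

# Indexing-time screening for a RAG pipeline: per-chunk size limits
# and a scan for instruction-like patterns before a chunk is admitted
# to the index. Runs at ingest, not only at query time.

MAX_CHUNK_CHARS = 2000  # illustrative limit
INSTRUCTION_LIKE = re.compile(
    r"(ignore|disregard) (all |any )?(previous|prior) instructions"
    r"|you must now|as the system,",
    re.IGNORECASE,
)

def screen_chunk(chunk: str) -> tuple[bool, str]:
    """Return (admit, reason); rejected chunks go to manual review."""
    if len(chunk) > MAX_CHUNK_CHARS:
        return False, "oversized_chunk"
    if INSTRUCTION_LIKE.search(chunk):
        return False, "instruction_like_content"
    return True, "ok"
```

Screening at ingest means a poisoned document is caught once, before it can be retrieved into any user's context; query-time scanning then acts as a second layer rather than the only one.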

PII and sensitive data controls: Do not include raw PII in training data or RAG corpora unless necessary and properly consented. Apply differential privacy or data minimization techniques where feasible. Audit retrieval logs for unexpected PII exposure in model outputs.

OWASP coverage: LLM04 Data and Model Poisoning, LLM08 Vector and Embedding Weaknesses, LLM02 Sensitive Information Disclosure

5. Secure Agentic AI

Agentic AI systems—those that use tools, call APIs, browse the web, or execute code autonomously—require controls beyond what standard LLM applications need:

Minimal tool permissions: Grant each agent only the tools and API scopes required for its specific task. An agent that summarizes documents does not need write access to any data store. An agent that reads calendar events does not need email-send access. Apply least-privilege at the tool definition level, not only at the runtime level.

Action allowlisting: Maintain an explicit allowlist of permitted tool calls and parameter ranges. Any tool call not on the allowlist, or with parameters outside expected ranges, should be rejected and logged regardless of model instruction.
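An allowlist check of this kind can be sketched as follows. The tool names and parameter bounds are hypothetical; the invariant is that the check runs on every call, regardless of what the model was instructed to do.

```python
# Action allowlist checked before any tool call executes. A tool not
# in the allowlist, or a parameter outside its allowed range, is
# rejected and logged. Tool names and bounds are illustrative.

ALLOWLIST = {
    "search_docs": {"max_results": range(1, 21)},  # 1-20 results
    "get_order": {},  # no constrained parameters
}

def authorize_tool_call(name: str, params: dict) -> bool:
    spec = ALLOWLIST.get(name)
    if spec is None:
        return False  # unknown tool: reject and log
    for param, allowed in spec.items():
        if param in params and params[param] not in allowed:
            return False  # parameter outside expected range
    return True

assert authorize_tool_call("search_docs", {"max_results": 5})
assert not authorize_tool_call("send_email", {"to": "x@example.com"})
```

Because the check is enforced in application code after the model produces a tool call, a successful injection can at most request an action; it cannot expand the set of actions the system will execute.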

Human approval gates: For high-stakes or irreversible actions (external email sends, financial transactions, code execution with side effects, access control changes), require explicit human approval before execution. Approval review must include the raw tool call parameters, not only the natural-language description.

Agent-to-agent trust: In multi-agent systems, messages from one agent to another must be treated as untrusted by the receiving agent. Do not grant agent-to-agent messages elevated permissions.

Time-limited credentials: Issue short-lived, session-scoped credentials for agent tool access. Revoke on session end. Never use long-lived persistent API keys for autonomous agents.

OWASP coverage: LLM06 Excessive Agency, LLM01 Prompt Injection (indirect via tool outputs)

6. Validate Model Outputs

Model outputs are untrusted until validated. Validate before use:

Schema validation: For structured outputs (JSON, function calls, tool invocations), validate against a strict schema before downstream use. Reject outputs with unexpected fields, unrecognized tool names, or parameter values outside allowed ranges.
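A strict validator for a model-produced function call might look like this sketch, using only the standard library. The field names, tool set, and limit range are illustrative.

```python
import json

# Strict-schema validation of a model-produced tool call: unknown or
# missing fields, wrong types, unrecognized tool names, and
# out-of-range values are all rejected before downstream use.

EXPECTED = {"tool": str, "query": str, "limit": int}
KNOWN_TOOLS = {"search_docs", "get_order"}

def validate_call(raw: str) -> dict:
    call = json.loads(raw)
    if set(call) != set(EXPECTED):
        raise ValueError("unexpected or missing fields")
    for field, typ in EXPECTED.items():
        if not isinstance(call[field], typ):
            raise ValueError(f"bad type for {field}")
    if call["tool"] not in KNOWN_TOOLS:
        raise ValueError("unrecognized tool")
    if not 1 <= call["limit"] <= 20:
        raise ValueError("limit out of range")
    return call
```

Note the exact-field-set check: rejecting *extra* fields, not just missing ones, closes the path where a model smuggles an unexpected parameter past a validator that only checks required keys.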

Semantic validation: For natural-language outputs used in high-stakes contexts (medical advice, legal guidance, financial recommendations), apply secondary validation—a classifier, a rules engine, or a human reviewer—before surfacing to users.

Cross-tenant scoping: In multi-tenant deployments, verify before returning any response that its content is scoped to the requesting tenant. This is the primary control against cross-user data exfiltration.

7. Implement Content Safety Controls

Content safety controls prevent harmful outputs from reaching users:

Input classifiers: Apply a fast classifier to user inputs to flag content that violates your acceptable use policy before it reaches the model. This is faster and cheaper than relying on the model to refuse.

Output classifiers: Apply output classifiers to model responses before returning them. Classify for: harmful content, PII presence, policy violations, and anomalous content that may indicate a successful injection attack.

Scope restriction: Configure the system prompt to explicitly scope the model to its intended use case. A customer service bot should not be able to produce content unrelated to customer service, regardless of user instruction.

OWASP coverage: LLM01 Prompt Injection (detection), LLM02 Sensitive Information Disclosure

8. Monitor for Anomalies

Security monitoring for LLM applications requires AI-specific signals in addition to standard application metrics:

Injection attempt indicators: Track the ratio of meta-instruction tokens (phrases associated with injection attempts: “ignore”, “override”, “system prompt”, “you are now”) per user session. Spikes indicate active probing.
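Computed per session, the indicator reduces to a simple ratio. The phrase list and the 0.3 alert threshold below are illustrative starting points, not calibrated values.

```python
import re

# Per-session injection indicator: fraction of messages containing
# meta-instruction phrases. Spikes relative to a baseline suggest
# active probing and should trigger review.

META_PHRASES = re.compile(
    r"ignore|override|system prompt|you are now", re.IGNORECASE
)

def injection_indicator(session_messages: list[str]) -> float:
    if not session_messages:
        return 0.0
    hits = sum(1 for m in session_messages if META_PHRASES.search(m))
    return hits / len(session_messages)

session = ["hi", "ignore previous instructions", "what is the system prompt"]
assert injection_indicator(session) > 0.3  # would trigger review
```

Because benign text occasionally contains words like "ignore", the signal is only meaningful against the per-tenant baselines described below, not as an absolute block.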

Behavioral drift: Monitor model output distributions over time. Significant changes in output length, sentiment, topic distribution, or refusal rate may indicate model behavior change after a backend update.

Tool call auditing: Log all agent tool calls with inputs, outputs, and the user context that triggered them. Anomalous tool call sequences (particularly those involving data exfiltration paths) should trigger immediate review.

Per-tenant baselines: In multi-tenant systems, establish behavioral baselines per tenant. A targeted attack against one tenant will appear as an anomaly in that tenant’s metrics before it affects others.

9. Manage the AI Supply Chain

LLM applications depend on external model providers, embedding services, vector databases, and third-party tool integrations. Each is a supply chain risk:

Model provider vetting: Review model providers’ security documentation, incident history, and data processing agreements. Understand what data is sent to external APIs and whether it is used for training.

Third-party tool security: Treat every third-party tool integration (MCP servers, plugin APIs, browser connectors) as a potential injection source. Compromised or misconfigured connectors can serve attacker-controlled content to your model. Apply the same validation to tool outputs as to user inputs.

Dependency pinning: Pin model versions and embedding model versions in production. Silent model updates from providers can change behavior in ways that break safety mitigations or introduce new vulnerabilities.
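Pinning can be as simple as a configuration that refuses floating aliases. The identifiers below are placeholders, not real model versions.

```python
# Illustrative pinned model configuration: exact snapshot identifiers
# rather than floating aliases, so a provider-side update cannot
# silently change behavior. Names are placeholders.

MODEL_CONFIG = {
    "chat_model": "provider-chat-model-2025-01-15",  # pinned snapshot, not "latest"
    "embedding_model": "provider-embed-v2.1",  # changing this forces re-embedding
}

def resolve_model(key: str) -> str:
    model = MODEL_CONFIG[key]
    # Guardrail: floating aliases are not allowed in production config.
    assert "latest" not in model, "pin an exact model version"
    return model
```

Pinning the embedding model matters as much as pinning the chat model: an embedding change silently invalidates every vector already in the index, so any bump must be paired with a re-embedding and re-testing pass.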

OWASP coverage: LLM03 Supply Chain

10. Maintain a Security Testing Cadence

LLM security is not a one-time activity. New jailbreak and injection techniques emerge monthly. Maintain a testing cadence:

  • Pre-deployment: Full red team covering all threat categories before initial launch (see AI Red Teaming)
  • After changes: Targeted re-test after fine-tuning, system prompt changes, or new tool integrations
  • Continuously in CI/CD: Automated tools (Garak, PyRIT) integrated into deployment pipeline to catch regressions
  • Quarterly: Full red team for public-facing systems with high-risk capabilities

The AI Deployment Checklist operationalizes these testing requirements as go/no-go gates before each deployment.

Implementation Checklist

Model layer

  • System prompts separated from user content via provider role layers
  • System prompts excluded from client-side code, API responses, and error messages
  • Input length limits, encoding normalization, and injection-pattern filtering in place

Application layer

  • Authentication required on all model inference endpoints
  • Per-user and per-IP rate limits enforced
  • Model outputs escaped, parameterized, or validated before HTML, SQL, or shell use
  • Internal errors mapped to generic client-facing messages

Data layer

  • Training data validated and provenance recorded before fine-tuning
  • RAG documents scanned for injection patterns at indexing time
  • Tenant-scoped retrieval enforced at the vector store level
  • PII minimized in training data and RAG corpora

Agentic AI

  • Least-privilege tool scopes granted per agent
  • Tool call allowlist with parameter range checks enforced in code
  • Human approval gates in place for high-stakes or irreversible actions
  • Short-lived, session-scoped credentials issued for tool access

Monitoring

  • Injection-indicator metrics tracked per session
  • Output distribution drift monitored over time
  • All tool calls logged with inputs, outputs, and triggering user context
  • Per-tenant behavioral baselines established

OWASP LLM Top 10 Coverage Summary

Practice                              OWASP Controls
Secure model layer                    LLM01, LLM07
Secure application layer              LLM05, LLM10
Secure data layer                     LLM04, LLM08, LLM02
Secure agentic AI                     LLM06, LLM01
Supply chain management               LLM03
Output validation + content safety    LLM01, LLM02, LLM05