AI Defensive Methods

Neutral, evidence-based reference pages documenting detection, prevention, and enterprise monitoring methods for AI-enabled threats.

17 methods across 3 categories

Detection Methods

Adversarial Input Detection

Detection

Techniques for identifying inputs crafted to cause AI model misclassification or misbehavior, including perturbation analysis, input validation, certified defenses, and adversarial example detection.

2 threat patterns

AI Bias & Fairness Auditing

Detection

Frameworks and tools for evaluating AI systems for discriminatory outcomes, including statistical parity testing, disparate impact analysis, intersectional auditing, and algorithmic accountability methodologies.

5 threat patterns
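
As an illustration of the statistical parity testing and disparate impact analysis named above, the following is a minimal sketch rather than a full audit methodology. The function names, the sample records, and the group labels `"a"` and `"b"` are hypothetical. The 0.8 cutoff reflects the commonly cited "four-fifths" rule of thumb and is an assumption here, not a threshold stated on this page.

```python
from collections import defaultdict


def selection_rates(records):
    """Return the favorable-outcome rate per group from (group, outcome) pairs."""
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        if outcome:
            favorable[group] += 1
    return {group: favorable[group] / totals[group] for group in totals}


def statistical_parity_difference(rates, protected, reference):
    """Difference in selection rates; 0 means parity between the two groups."""
    return rates[protected] - rates[reference]


def disparate_impact_ratio(rates, protected, reference):
    """Ratio of selection rates; values well below 1 suggest adverse impact."""
    if rates[reference] == 0:
        raise ValueError("reference group has no favorable outcomes")
    return rates[protected] / rates[reference]


# Hypothetical decisions: group "a" is approved 6 of 10 times, group "b" 3 of 10.
records = [("a", 1)] * 6 + [("a", 0)] * 4 + [("b", 1)] * 3 + [("b", 0)] * 7
rates = selection_rates(records)

spd = statistical_parity_difference(rates, protected="b", reference="a")
dir_ratio = disparate_impact_ratio(rates, protected="b", reference="a")

# Assumed rule-of-thumb cutoff (four-fifths rule); adjust to the applicable policy.
FOUR_FIFTHS_THRESHOLD = 0.8
flagged = dir_ratio < FOUR_FIFTHS_THRESHOLD
```

A single pairwise ratio like this does not cover intersectional auditing, which would repeat the comparison over combinations of attributes.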

AI Phishing Detection Methods

Detection

Approaches for detecting AI-generated phishing campaigns, including LLM-output classifiers, behavioral email analysis, and AI-enhanced threat intelligence, alongside organizational controls.

2 threat patterns

AI-Generated Text Detection Methods

Detection

Technical approaches for identifying text produced by large language models, including statistical classifiers, watermark detection, and stylometric analysis, along with their documented limitations.

2 threat patterns

Data Poisoning Detection Methods

Detection

Technical approaches for identifying malicious modifications to AI training data, including statistical outlier detection, provenance tracking, dataset integrity verification, and model behavior analysis.

2 threat patterns
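
To make the statistical outlier detection mentioned above concrete, here is a minimal sketch using the modified z-score based on the median absolute deviation (MAD). It assumes each training sample has already been reduced to a single numeric score; the function name `mad_outliers` and the sample values are hypothetical. The constant 0.6745 and the 3.5 cutoff are the conventional choices for this statistic, not values taken from this page.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold.

    The modified z-score is 0.6745 * |x - median| / MAD, which is less
    sensitive to the outliers themselves than a mean/stddev z-score.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half the values are identical; the score is undefined,
        # so this sketch reports nothing rather than dividing by zero.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical per-sample scores (for example, feature-vector norms);
# the last sample is far from the rest.
feature_norms = [1.0, 1.1, 0.9, 1.05, 0.95, 1.02, 9.0]
suspect_indices = mad_outliers(feature_norms)
```

Flagged samples are candidates for review, not confirmed poisoning; a check like this would typically be combined with the provenance and integrity methods listed in the same entry.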

Deepfake Detection Methods

Detection

Technical approaches for identifying AI-generated or AI-manipulated visual and audio media, including forensic analysis, neural network classifiers, and provenance verification.

2 threat patterns

Voice Cloning Detection Methods

Detection

Technical approaches for identifying AI-generated or cloned speech audio, including spectral analysis, liveness detection, neural network classifiers, and procedural verification.

2 threat patterns