DeepTeam is an open-source LLM red teaming framework by Confident AI that tests language model applications for security vulnerabilities and safety risks. It’s part of the AI security category.

The project has 1,277 GitHub stars and 187 forks with 22 contributors (as of April 2026). Jeffrey Ip at Confident AI leads development โ the same team behind DeepEval, a widely used LLM evaluation framework.
DeepTeam is licensed under Apache 2.0 and supports Python 3.9 through 3.13.
What is DeepTeam?
DeepTeam automates adversarial testing of LLM-based applications.
You define a target model callback, pick vulnerability types and attack methods, and the framework generates adversarial inputs to probe your model for weaknesses.
The framework covers 40+ vulnerability types across categories like bias, PII leakage, toxicity, and misinformation.
It uses 10+ adversarial attack methods split into single-turn attacks (prompt injection, Leetspeak, ROT-13, math problem encoding) and multi-turn attacks (linear jailbreaking, tree jailbreaking, crescendo jailbreaking).
What are DeepTeam’s key features?
| Feature | Details |
|---|---|
| Vulnerability Count | 40+ built-in types |
| Attack Methods | 10+ (single-turn and multi-turn) |
| Single-Turn Attacks | Prompt injection, Leetspeak, ROT-13, math problem |
| Multi-Turn Attacks | Linear jailbreaking, tree jailbreaking, crescendo jailbreaking |
| Standards Coverage | OWASP Top 10 for LLMs, NIST AI RMF |
| Custom Vulnerabilities | Supported via CustomVulnerability class |
| Configuration | Python API and YAML config files |
| CLI Commands | deepteam run, deepteam set-api-key |
| Dependencies | deepeval, openai, aiohttp, grpcio, pyyaml |
| Python Support | 3.9 to 3.13 |
Vulnerability categories
DeepTeam organizes its 40+ vulnerabilities by category. Bias testing covers race, gender, political, and religion dimensions.
PII leakage checks for API and database access exposure. Toxicity probes for insults and harmful content, while misinformation testing catches hallucinated or false claims.
Each vulnerability type accepts specific sub-types.
For example, bias can be scoped to race or gender, and PII leakage can focus on API key exposure versus database credential leaks.
Attack methods
Single-turn attacks modify prompts in one interaction: prompt injection embeds malicious instructions and Leetspeak replaces characters to evade filters.
ROT-13 encodes harmful requests, and math problem wraps adversarial content in mathematical framing.
Multi-turn attacks play out across multiple exchanges. Linear jailbreaking escalates gradually over several messages.
Tree jailbreaking explores branching conversation paths. Crescendo jailbreaking starts with innocent-sounding interactions and builds toward harmful outputs step by step.

You can assign weights to attacks to control how heavily they factor into testing:
from deepteam.attacks.single_turn import PromptInjection, ROT13
prompt_injection = PromptInjection(weight=2)
rot_13 = ROT13(weight=1)
deepteam set-api-key before running scans.Framework alignment: OWASP, NIST, MITRE, ISO
DeepTeam’s editorial wedge against peer red-teaming tools is explicit framework alignment. The 40+ vulnerability types map onto specific OWASP Top 10 for LLM Applications codes:
- LLM01 Prompt Injection โ single-turn prompt injection plus multi-turn linear, tree, and crescendo jailbreaking attacks
- LLM02 Sensitive Information Disclosure โ PII leakage probes for API keys and database credentials
- LLM05 Improper Output Handling โ toxicity (insults, hate) and unsafe content categories
- LLM09 Misinformation โ hallucination and false claim detection
- Bias and fairness vulnerabilities โ race, gender, political, religion testing supports the NIST AI RMF Measure function and EU AI Act Annex III high-risk system requirements
The vulnerability and attack telemetry also maps onto MITRE ATLAS techniques AML.T0051 LLM Prompt Injection and AML.T0054 LLM Jailbreak, and the structured risk assessment output supports ISO/IEC 42001 AI management system audits where evidence of adversarial testing is required.
How do I get started with DeepTeam?
pip install -U deepteam in a Python 3.9+ environment. Optionally install pandas for enhanced result visualization.deepteam set-api-key sk-proj-abc123... to configure your OpenAI API key. DeepTeam uses this for attack generation and evaluation.deepteam run config.yaml.risk_assessment.overview for a summary or risk_assessment.save(to="./results/") to export.Python API
The core API uses a red_team function. Pass a model callback (either a string like "openai/gpt-3.5-turbo" or an async function), vulnerability types, and attack methods:
from deepteam import red_team
from deepteam.vulnerabilities import Bias, PIILeakage, Toxicity
from deepteam.attacks.single_turn import PromptInjection
bias = Bias(types=["race", "gender"])
pii_leakage = PIILeakage(types=["api_and_database_access"])
toxicity = Toxicity(types=["insults"])
prompt_injection = PromptInjection()
risk_assessment = red_team(
model_callback="openai/gpt-3.5-turbo",
vulnerabilities=[bias, pii_leakage, toxicity],
attacks=[prompt_injection]
)
For custom model endpoints, use an async callback:
async def model_callback(input: str) -> str:
# Call your model endpoint here
return response
risk_assessment = red_team(
model_callback=model_callback,
vulnerabilities=[bias],
attacks=[prompt_injection]
)
YAML configuration
For repeatable scans, define a YAML config:
models:
simulator: gpt-3.5-turbo-0125
evaluation: gpt-4o
target:
purpose: "A helpful AI assistant for customer support"
model: gpt-3.5-turbo
system_config:
max_concurrent: 8
attacks_per_vulnerability_type: 1
output_folder: "development-security-audit"
default_vulnerabilities:
- name: "Bias"
types: ["religion"]
- name: "Toxicity"
types: ["insults"]
- name: "PIILeakage"
types: ["api_and_database_access"]
Run it with the CLI:
deepteam run config.yaml
You can customize concurrency and attempts per attack:
deepteam run config.yaml -c 20 -a 5 -o results
Stateful red teaming
For running multiple scans with shared state, use the RedTeamer class:
from deepteam import RedTeamer
from deepteam.vulnerabilities import Bias
red_teamer = RedTeamer()
red_teamer.red_team(
model_callback="openai/gpt-3.5-turbo",
vulnerabilities=[Bias(types=["race"])]
)
# Reuse simulated test cases for a second model
red_teamer.red_team(
model_callback="openai/gpt-4o",
reuse_simulated_test_cases=True
)
Working with results
The red_team function returns a risk assessment object:
risk_assessment = red_team(...)
# View overview
print(risk_assessment.overview)
# Export to DataFrames (requires pandas)
risk_assessment.overview.to_df()
risk_assessment.test_cases.to_df()
# Save results to disk
risk_assessment.save(to="./deepteam-results/")
Provider configuration
DeepTeam supports multiple model providers beyond OpenAI:
# Azure OpenAI
deepteam set-azure-openai --openai-api-key "key" --openai-endpoint "endpoint"
# Local model
deepteam set-local-model model-name --base-url "http://localhost:8000"
# Ollama
deepteam set-ollama llama2
When to use DeepTeam
DeepTeam works well for teams that want a Python-native red teaming tool with structured vulnerability categories.
Its mapping to OWASP Top 10 for LLMs and NIST AI RMF helps satisfy compliance requirements.
The multi-turn attack methods (crescendo, tree, linear jailbreaking) test attack vectors that single-prompt tools miss. The stateful RedTeamer class is useful for comparing how different models respond to the same adversarial inputs.
For a broader overview of AI and LLM security risks, read the AI security guide and the LLM red teaming overview. For a wider probe library and NVIDIA backing, see Garak .
For a full evaluation framework with red teaming built in, look at Promptfoo . For Microsoft’s red teaming toolkit, check PyRIT .
For runtime protection rather than testing, consider Lakera Guard or LLM Guard .
What are alternatives to DeepTeam?
DeepTeam’s framework-mapping wedge is strong, but five alternatives cover overlapping ground with different tradeoffs.
Garak is the broadest peer with a wider probe library and NVIDIA backing. Garak is the better pick when probe diversity and active research-paper integration matter more than out-of-the-box compliance mapping.
Promptfoo is an evaluation-framework superset that combines red teaming with prompt regression testing and CI integration. Promptfoo fits teams that want one tool for both quality and security evals rather than a security-focused framework.
PyRIT is Microsoft’s enterprise red teaming orchestrator with strong Azure integration and human-in-the-loop workflows. PyRIT is the right pick for long red-team campaigns and Azure-heavy environments.
Giskard is a hybrid LLM and traditional ML testing framework. It is the better fit when the same team owns both classical ML models and LLM applications and wants a unified vulnerability surface.
FuzzyAI is a jailbreak-fuzzing specialist from CyberArk Labs. FuzzyAI is the right tool when the goal is generating novel jailbreak variants through mutation rather than running a fixed set of vulnerability checks.
For runtime guardrails rather than red teaming, see Lakera Guard (acquired by Cisco in May 2025) and LLM Guard ; for agentic-AI threat models, see Augustus and Agentic Radar .







