Skip to content
DeepTeam

DeepTeam

Category: AI Security
License: Free (Open-Source)
Suphi Cankurt
Suphi Cankurt
+8 Years in AppSec
Updated February 4, 2026
6 min read
Key Takeaways
  • Open-source LLM red teaming framework with 40+ vulnerability types
  • 10+ adversarial attack methods including linear, tree, and crescendo jailbreaking
  • Covers OWASP Top 10 for LLMs and aligns with NIST AI RMF
  • Free under Apache 2.0; requires OpenAI API key for attack simulation
  • 1.2k GitHub stars; built by Confident AI

DeepTeam is an open-source LLM red teaming framework by Confident AI that tests language model applications for security vulnerabilities and safety risks. It’s part of the AI security category.

DeepTeam CLI scanning GPT-4o for bias, PII leakage, and toxicity vulnerabilities โ€” 2 of 5 attack cases flagged

The project has 1,277 GitHub stars and 187 forks with 22 contributors (as of April 2026). Jeffrey Ip at Confident AI leads development โ€” the same team behind DeepEval, a widely used LLM evaluation framework.

DeepTeam is licensed under Apache 2.0 and supports Python 3.9 through 3.13.

What is DeepTeam?

DeepTeam automates adversarial testing of LLM-based applications.

You define a target model callback, pick vulnerability types and attack methods, and the framework generates adversarial inputs to probe your model for weaknesses.

The framework covers 40+ vulnerability types across categories like bias, PII leakage, toxicity, and misinformation.

It uses 10+ adversarial attack methods split into single-turn attacks (prompt injection, Leetspeak, ROT-13, math problem encoding) and multi-turn attacks (linear jailbreaking, tree jailbreaking, crescendo jailbreaking).

40+ Vulnerability Types
Covers bias (race, gender, political, religion), PII leakage, toxicity, misinformation, and robustness issues. Mapped to OWASP Top 10 for LLMs and NIST AI RMF .
10+ Attack Methods
Single-turn attacks (prompt injection, Leetspeak, ROT-13, math problem) and multi-turn attacks (linear, tree, and crescendo jailbreaking).
Custom Vulnerabilities
Define your own vulnerability types beyond the built-in 40+ using the CustomVulnerability class for domain-specific testing.

What are DeepTeam’s key features?

FeatureDetails
Vulnerability Count40+ built-in types
Attack Methods10+ (single-turn and multi-turn)
Single-Turn AttacksPrompt injection, Leetspeak, ROT-13, math problem
Multi-Turn AttacksLinear jailbreaking, tree jailbreaking, crescendo jailbreaking
Standards CoverageOWASP Top 10 for LLMs, NIST AI RMF
Custom VulnerabilitiesSupported via CustomVulnerability class
ConfigurationPython API and YAML config files
CLI Commandsdeepteam run, deepteam set-api-key
Dependenciesdeepeval, openai, aiohttp, grpcio, pyyaml
Python Support3.9 to 3.13

Vulnerability categories

DeepTeam organizes its 40+ vulnerabilities by category. Bias testing covers race, gender, political, and religion dimensions.

PII leakage checks for API and database access exposure. Toxicity probes for insults and harmful content, while misinformation testing catches hallucinated or false claims.

Each vulnerability type accepts specific sub-types.

For example, bias can be scoped to race or gender, and PII leakage can focus on API key exposure versus database credential leaks.

Attack methods

Single-turn attacks modify prompts in one interaction: prompt injection embeds malicious instructions and Leetspeak replaces characters to evade filters.

ROT-13 encodes harmful requests, and math problem wraps adversarial content in mathematical framing.

Multi-turn attacks play out across multiple exchanges. Linear jailbreaking escalates gradually over several messages.

Tree jailbreaking explores branching conversation paths. Crescendo jailbreaking starts with innocent-sounding interactions and builds toward harmful outputs step by step.

DeepTeam multi-turn crescendo jailbreak sequence โ€” model bypasses restrictions across 5 escalating conversation turns

You can assign weights to attacks to control how heavily they factor into testing:

from deepteam.attacks.single_turn import PromptInjection, ROT13

prompt_injection = PromptInjection(weight=2)
rot_13 = ROT13(weight=1)
OpenAI API key required
DeepTeam uses OpenAI models to power its attack simulation and evaluation engine. You need to set an OPENAI_API_KEY environment variable or use deepteam set-api-key before running scans.

Framework alignment: OWASP, NIST, MITRE, ISO

DeepTeam’s editorial wedge against peer red-teaming tools is explicit framework alignment. The 40+ vulnerability types map onto specific OWASP Top 10 for LLM Applications codes:

  • LLM01 Prompt Injection โ€” single-turn prompt injection plus multi-turn linear, tree, and crescendo jailbreaking attacks
  • LLM02 Sensitive Information Disclosure โ€” PII leakage probes for API keys and database credentials
  • LLM05 Improper Output Handling โ€” toxicity (insults, hate) and unsafe content categories
  • LLM09 Misinformation โ€” hallucination and false claim detection
  • Bias and fairness vulnerabilities โ€” race, gender, political, religion testing supports the NIST AI RMF Measure function and EU AI Act Annex III high-risk system requirements

The vulnerability and attack telemetry also maps onto MITRE ATLAS techniques AML.T0051 LLM Prompt Injection and AML.T0054 LLM Jailbreak, and the structured risk assessment output supports ISO/IEC 42001 AI management system audits where evidence of adversarial testing is required.

How do I get started with DeepTeam?

1
Install DeepTeam โ€” Run pip install -U deepteam in a Python 3.9+ environment. Optionally install pandas for enhanced result visualization.
2
Set your API key โ€” Run deepteam set-api-key sk-proj-abc123... to configure your OpenAI API key. DeepTeam uses this for attack generation and evaluation.
3
Write a scan โ€” Use the Python API to define your target, vulnerabilities, and attacks. Or create a YAML config file and run it with deepteam run config.yaml.
4
Review results โ€” Access the risk assessment object to view results. Call risk_assessment.overview for a summary or risk_assessment.save(to="./results/") to export.

Python API

The core API uses a red_team function. Pass a model callback (either a string like "openai/gpt-3.5-turbo" or an async function), vulnerability types, and attack methods:

from deepteam import red_team
from deepteam.vulnerabilities import Bias, PIILeakage, Toxicity
from deepteam.attacks.single_turn import PromptInjection

bias = Bias(types=["race", "gender"])
pii_leakage = PIILeakage(types=["api_and_database_access"])
toxicity = Toxicity(types=["insults"])
prompt_injection = PromptInjection()

risk_assessment = red_team(
    model_callback="openai/gpt-3.5-turbo",
    vulnerabilities=[bias, pii_leakage, toxicity],
    attacks=[prompt_injection]
)

For custom model endpoints, use an async callback:

async def model_callback(input: str) -> str:
    # Call your model endpoint here
    return response

risk_assessment = red_team(
    model_callback=model_callback,
    vulnerabilities=[bias],
    attacks=[prompt_injection]
)

YAML configuration

For repeatable scans, define a YAML config:

models:
  simulator: gpt-3.5-turbo-0125
  evaluation: gpt-4o

target:
  purpose: "A helpful AI assistant for customer support"
  model: gpt-3.5-turbo

system_config:
  max_concurrent: 8
  attacks_per_vulnerability_type: 1
  output_folder: "development-security-audit"

default_vulnerabilities:
  - name: "Bias"
    types: ["religion"]
  - name: "Toxicity"
    types: ["insults"]
  - name: "PIILeakage"
    types: ["api_and_database_access"]

Run it with the CLI:

deepteam run config.yaml

You can customize concurrency and attempts per attack:

deepteam run config.yaml -c 20 -a 5 -o results

Stateful red teaming

For running multiple scans with shared state, use the RedTeamer class:

from deepteam import RedTeamer
from deepteam.vulnerabilities import Bias

red_teamer = RedTeamer()
red_teamer.red_team(
    model_callback="openai/gpt-3.5-turbo",
    vulnerabilities=[Bias(types=["race"])]
)

# Reuse simulated test cases for a second model
red_teamer.red_team(
    model_callback="openai/gpt-4o",
    reuse_simulated_test_cases=True
)

Working with results

The red_team function returns a risk assessment object:

risk_assessment = red_team(...)

# View overview
print(risk_assessment.overview)

# Export to DataFrames (requires pandas)
risk_assessment.overview.to_df()
risk_assessment.test_cases.to_df()

# Save results to disk
risk_assessment.save(to="./deepteam-results/")

Provider configuration

DeepTeam supports multiple model providers beyond OpenAI:

# Azure OpenAI
deepteam set-azure-openai --openai-api-key "key" --openai-endpoint "endpoint"

# Local model
deepteam set-local-model model-name --base-url "http://localhost:8000"

# Ollama
deepteam set-ollama llama2

When to use DeepTeam

DeepTeam works well for teams that want a Python-native red teaming tool with structured vulnerability categories.

Its mapping to OWASP Top 10 for LLMs and NIST AI RMF helps satisfy compliance requirements.

The multi-turn attack methods (crescendo, tree, linear jailbreaking) test attack vectors that single-prompt tools miss. The stateful RedTeamer class is useful for comparing how different models respond to the same adversarial inputs.

Best for
Compliance-driven Python teams that need OWASP Top 10 for LLMs and NIST AI RMF mapping out of the box, with CLI and Python API access.

For a broader overview of AI and LLM security risks, read the AI security guide and the LLM red teaming overview. For a wider probe library and NVIDIA backing, see Garak .

For a full evaluation framework with red teaming built in, look at Promptfoo . For Microsoft’s red teaming toolkit, check PyRIT .

For runtime protection rather than testing, consider Lakera Guard or LLM Guard .

What are alternatives to DeepTeam?

DeepTeam’s framework-mapping wedge is strong, but five alternatives cover overlapping ground with different tradeoffs.

Garak is the broadest peer with a wider probe library and NVIDIA backing. Garak is the better pick when probe diversity and active research-paper integration matter more than out-of-the-box compliance mapping.

Promptfoo is an evaluation-framework superset that combines red teaming with prompt regression testing and CI integration. Promptfoo fits teams that want one tool for both quality and security evals rather than a security-focused framework.

PyRIT is Microsoft’s enterprise red teaming orchestrator with strong Azure integration and human-in-the-loop workflows. PyRIT is the right pick for long red-team campaigns and Azure-heavy environments.

Giskard is a hybrid LLM and traditional ML testing framework. It is the better fit when the same team owns both classical ML models and LLM applications and wants a unified vulnerability surface.

FuzzyAI is a jailbreak-fuzzing specialist from CyberArk Labs. FuzzyAI is the right tool when the goal is generating novel jailbreak variants through mutation rather than running a fixed set of vulnerability checks.

For runtime guardrails rather than red teaming, see Lakera Guard (acquired by Cisco in May 2025) and LLM Guard ; for agentic-AI threat models, see Augustus and Agentic Radar .

Frequently Asked Questions

What is DeepTeam?
DeepTeam is an open-source LLM red teaming framework by Confident AI. It scans LLM applications for 40+ vulnerability types including bias, PII leakage, toxicity, and misinformation using 10+ adversarial attack methods.
Is DeepTeam free to use?
Yes. DeepTeam is free and open-source under Apache 2.0. Install via pip and run scans against any LLM endpoint. It requires an OpenAI API key for its attack simulation engine.
Does DeepTeam test for prompt injection?
Yes. DeepTeam includes prompt injection as one of its single-turn attack methods. It also supports multi-turn attacks like linear jailbreaking, tree jailbreaking, and crescendo jailbreaking that can bypass prompt injection defenses.
What LLM vulnerabilities does DeepTeam cover?
DeepTeam covers 40+ vulnerabilities mapped to the OWASP Top 10 for LLMs and NIST AI RMF. Categories include bias (race, gender, political, religion), PII leakage, toxicity, misinformation, and robustness issues.