LLM Guard

LLM Guard

Category: AI Security
License: Free (Open-Source)

LLM Guard is an open-source security toolkit developed by Protect AI that provides guardrails for LLM-based applications, with 2.5k GitHub stars and 342 forks.

The library offers input and output scanners that detect prompt injections, anonymize PII, filter toxic content, and enforce compliance policies.

GitHub: protectai/llm-guard

LLM Guard works with any language model and integrates with popular frameworks like LangChain and LlamaIndex.

What is LLM Guard?

LLM Guard addresses the security and compliance challenges of deploying large language models in production.

When users interact with LLM-powered applications, organizations face risks including prompt injection attacks, sensitive data exposure, toxic content generation, and non-compliant outputs.

The toolkit provides modular scanners that pre-process user inputs and post-process model outputs.

Input scanners filter malicious prompts, anonymize personal information, and detect jailbreak attempts before they reach your model.

Output scanners validate responses, ensure compliance with content policies, and prevent data leakage.

Built by Protect AI, the company behind Guardian and ModelScan, LLM Guard reflects deep expertise in ML security.

The toolkit is designed for production use with minimal latency impact.

Key Features

Input Scanners

LLM Guard provides scanners for processing user prompts:

  • Prompt Injection: Detects direct and indirect prompt injection attacks
  • Anonymize: Replaces PII including names, emails, phone numbers, and credit cards
  • Ban Topics: Blocks prompts related to forbidden subjects
  • Language: Enforces input language restrictions
  • Secrets: Detects API keys, passwords, and credentials
  • Toxicity: Filters harmful, offensive, or abusive content

Output Scanners

Post-process model responses with output scanners:

  • Bias: Detects biased or discriminatory content
  • Deanonymize: Restores anonymized PII when appropriate
  • JSON: Validates JSON structure and schema compliance
  • Malicious URLs: Blocks links to known malicious sites
  • No Refusal: Ensures the model provides substantive responses
  • Relevance: Verifies responses match the user query

Performance Optimization

LLM Guard is optimized for production deployments with features like model caching, batch processing, and GPU acceleration.

Scanners can run in parallel to minimize latency impact.

Installation

Install LLM Guard via pip:

pip install llm-guard

For GPU acceleration:

pip install llm-guard[onnxruntime-gpu]

How to Use LLM Guard

Basic Input/Output Scanning

from llm_guard import scan_prompt, scan_output
from llm_guard.input_scanners import Anonymize, PromptInjection, Toxicity
from llm_guard.output_scanners import Deanonymize, NoRefusal, Relevance

# Define input scanners
input_scanners = [
    Anonymize(),
    PromptInjection(),
    Toxicity()
]

# Define output scanners
output_scanners = [
    Deanonymize(),
    NoRefusal(),
    Relevance()
]

# Scan user prompt
prompt = "My email is john@example.com and I need help with my account"
sanitized_prompt, results, valid = scan_prompt(input_scanners, prompt)

if valid:
    # Send to LLM
    response = your_llm(sanitized_prompt)

    # Scan model output
    sanitized_output, results, valid = scan_output(
        output_scanners, sanitized_prompt, response
    )

LangChain Integration

from langchain.llms import OpenAI
from llm_guard.langchain import LLMGuardPrompt, LLMGuardOutput

llm = OpenAI()

# Wrap with LLM Guard
guarded_llm = LLMGuardOutput(
    llm=LLMGuardPrompt(llm=llm, input_scanners=input_scanners),
    output_scanners=output_scanners
)

response = guarded_llm("Process this request safely")

API Server Mode

Run LLM Guard as a standalone API:

# Start the API server
llm-guard-api --host 0.0.0.0 --port 8080

# Scan via API
curl -X POST http://localhost:8080/v1/scan \
  -H "Content-Type: application/json" \
  -d '{"prompt": "User input here"}'

Integration

CI/CD Pipeline Testing

Test your guardrails configuration in CI:

name: LLM Guard Tests
on: [push, pull_request]

jobs:
  test-guardrails:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install LLM Guard
        run: pip install llm-guard pytest
      - name: Run Guardrail Tests
        run: pytest tests/guardrails/

Docker Deployment

FROM python:3.11-slim

WORKDIR /app
RUN pip install llm-guard[api]

EXPOSE 8080
CMD ["llm-guard-api", "--host", "0.0.0.0", "--port", "8080"]

Supported LLM Providers

LLM Guard works with any language model:

  • OpenAI (GPT-4, GPT-3.5)
  • Anthropic (Claude)
  • Meta (Llama 2, Llama 3)
  • Mistral AI
  • Google (PaLM, Gemini)
  • Cohere
  • Local models via Ollama

When to Use LLM Guard

LLM Guard is the right choice when you need open-source guardrails for LLM applications with flexibility and transparency.

It suits organizations that want to self-host their security controls, need customizable scanning logic, or prefer to avoid vendor lock-in.

Consider LLM Guard for chatbots handling sensitive user data, internal AI assistants with access to proprietary information, or customer-facing applications where content moderation is required.

The library is particularly valuable when you need fine-grained control over what content reaches your models and what responses users receive.

For teams building LLM applications on frameworks like LangChain or LlamaIndex, LLM Guard provides native integrations that add security with minimal code changes.