Betterleaks is an open-source secrets scanner built by Zachary Rice, the original creator of Gitleaks (25,000+ GitHub stars). It detects and validates hardcoded credentials in git repositories, directories, and archives using BPE tokenization instead of entropy-based filtering, achieving 98.6% recall on the CredData benchmark compared to 70.4% with traditional entropy detection. Rice is currently Head of Secrets Scanning at Aikido Security.

The project was created on February 3, 2026 and reached v1.1.1 by March 17, 2026. It is written in Go, licensed under MIT, and designed as a drop-in replacement for Gitleaks with backwards-compatible configuration files and CLI flags.
What is Betterleaks?
Betterleaks is a free, open-source secrets detection tool that scans git repositories, directories, stdin, and compressed archives for hardcoded credentials such as API keys, tokens, and passwords. It replaces Shannon entropy with BPE (Byte Pair Encoding) tokenization using the cl100k_base model to determine whether a string is likely a real secret. On the CredData benchmark, this approach achieves 98.6% recall versus 70.4% for entropy-based scanning. Betterleaks is licensed under MIT and can be installed via Homebrew, Docker, DNF, or built from source.
How does Betterleaks improve on Gitleaks?
Betterleaks targets the same problem as Gitleaks – finding hardcoded secrets in git repositories – but improves detection accuracy, validation capability, and scanning speed.
The core difference is the detection engine. Gitleaks relies on Shannon entropy to distinguish random strings from real secrets. Betterleaks uses BPE tokenization with the cl100k_base model (the same tokenizer GPT-4 uses). On the CredData benchmark, Betterleaks hits 98.6% recall compared to Gitleaks’ 70.4% with entropy-based filtering. On large codebases, that gap translates into a significant number of secrets that entropy-based filtering misses and BPE tokenization catches.
Betterleaks also adds CEL-based secrets validation. When it finds a potential credential, it can fire an HTTP request to the target service and check whether the credential is still live. A finding goes from “possible leak” to “confirmed active secret,” which changes how you prioritize remediation.
Since it is backwards-compatible with Gitleaks configuration files and CLI flags, migrating takes minimal effort. Existing .gitleaks.toml files work without modification.

A benchmark published in the Betterleaks repository compares scan times on three real-world repositories. With RE2 and 8 git workers enabled, Betterleaks scans the Rails repo in 5.8s vs Gitleaks’ 24.5s (4.2x faster), the Ruby repo in 10.3s vs 55.2s (5.4x faster), and the GitLab repo in 2m13s vs 11m28s (5.2x faster).
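The speedup factors follow directly from the quoted wall-clock times. A minimal sketch of the arithmetic, using only the timings stated here (the helper names are illustrative, not part of any tool):

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair into seconds."""
    return minutes * 60 + seconds


# (repository, Betterleaks time, Gitleaks time), in seconds, as quoted in the text.
timings = [
    ("Rails", to_seconds(0, 5.8), to_seconds(0, 24.5)),
    ("Ruby", to_seconds(0, 10.3), to_seconds(0, 55.2)),
    ("GitLab", to_seconds(2, 13), to_seconds(11, 28)),
]


def speedups(rows):
    """Return the Gitleaks/Betterleaks time ratio per repository, rounded to one decimal."""
    return {name: round(gitleaks / betterleaks, 1) for name, betterleaks, gitleaks in rows}


if __name__ == "__main__":
    for name, factor in speedups(timings).items():
        print(f"{name}: {factor}x faster")
```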
Key Features
| Feature | Details |
|---|---|
| CLI commands | git (scan repos), dir (scan directories), stdin (pipe input) |
| Configuration | TOML format (.betterleaks.toml or .gitleaks.toml), backwards-compatible with Gitleaks |
| Detection engine | BPE tokenization (cl100k_base) + regex rules; 98.6% recall on CredData |
| Secrets validation | CEL expressions fire HTTP requests to verify if leaked credentials are still active |
| Output formats | JSON, CSV, JUnit, SARIF, custom Go templates |
| Installation | Homebrew, Docker, DNF (Fedora), from source |
| Regex engines | Go stdlib or RE2 (switchable); RE2 guarantees linear-time matching |
| Recursive decoding | base64, hex, percent-encoding, unicode escapes; configurable depth (default 5) |
| Archive support | zip, tar, and nested archives via --max-archive-depth |
| Git scanning | Parallelized via --git-workers; scans GitLab repo 5.2x faster than Gitleaks |
| Composite rules | Multi-part patterns with proximity matching to reduce false positives |
| Redaction | --redact flag with configurable percentage (0-100%) for logs and stdout |
| Baseline support | --baseline-path to ignore known findings and track only new secrets |
| Language | Pure Go (no CGO) — deploys anywhere without native library dependencies |
| License | MIT (no commercial restrictions) |
What is the Token Efficiency Filter?
The Token Efficiency Filter is Betterleaks’ core detection innovation that replaces Shannon entropy with BPE (Byte Pair Encoding) tokenization for identifying secrets. Entropy-based detection measures the randomness of a string to decide whether it might be a secret, but many real secrets don’t have high enough entropy to pass the threshold, and many non-secrets (like UUIDs or hashes) score high entropy but aren’t credentials.
Betterleaks uses the cl100k_base tokenizer (the same tokenizer GPT-4 uses) to evaluate how efficiently a string compresses into tokens. Real secrets tokenize inefficiently because they are random, while structured strings (variable names, UUIDs, file paths) compress well.
On the CredData benchmark, the Token Efficiency Filter produces 98.6% recall versus 70.4% with Shannon entropy. In my testing, this translated to fewer missed secrets without a noticeable jump in false positives.
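To make the contrast concrete, here is a rough sketch of the two measurements. The Shannon entropy function is standard. The tokenizer is a toy: a greedy longest-match over a tiny hand-picked vocabulary that stands in for cl100k_base, so the vocabulary, the function names, and the characters-per-token metric are all assumptions for illustration, not Betterleaks' actual implementation or thresholds.

```python
import math
from collections import Counter

# Hypothetical mini-vocabulary standing in for a real BPE vocabulary such as cl100k_base.
TOY_VOCAB = {"example", "config", "secret", "token", "pass", "word", "user", "name", "api", "key"}


def shannon_entropy(s):
    """Bits of entropy per character, based on character frequencies."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def toy_tokenize(s, vocab=TOY_VOCAB):
    """Greedy longest-match tokenization; unknown text falls back to single characters."""
    s = s.lower()
    max_len = max(len(v) for v in vocab)
    tokens, i = [], 0
    while i < len(s):
        for size in range(min(max_len, len(s) - i), 0, -1):
            piece = s[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def chars_per_token(s):
    """Higher means the string compresses well into tokens, i.e. it looks like ordinary text."""
    return len(s) / len(toy_tokenize(s))


if __name__ == "__main__":
    for candidate in ("example_api_key", "x9Qf2LpZ7vKd3TbR"):
        print(candidate, round(shannon_entropy(candidate), 2), chars_per_token(candidate))
```

In this toy setup the identifier-like string splits into a handful of multi-character tokens, while the random-looking string degrades to one token per character. A real filter would apply the same idea with a full BPE vocabulary and a tuned cutoff.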
How does CEL-based secrets validation work?
CEL-based secrets validation is Betterleaks’ mechanism for determining whether a detected credential is still active and exploitable. Finding a secret is useful, but knowing whether it still works is what decides how fast you need to act.
Betterleaks uses CEL (Common Expression Language) expressions to define validation logic per rule. When a rule matches, the CEL expression can fire an HTTP request to the target API and check the response. If the credential returns a valid response, the finding is marked as confirmed-active rather than just a potential leak.
This is similar to what TruffleHog does with its built-in verifiers. The key difference: Betterleaks makes the validation logic user-configurable via CEL expressions, so security teams can write custom verification for internal APIs and services. TruffleHog’s verifiers are hardcoded per detector.
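The underlying check is simple enough to sketch in a few lines. The following is plain Python, not CEL, and it does not reproduce Betterleaks' rule syntax. The endpoint URL, the bearer-token scheme, and the "inactive" and "unknown" labels are assumptions made for illustration; only the idea of probing the service and reading the response comes from the description above.

```python
import urllib.error
import urllib.request


def http_status(url, token, timeout=5.0):
    """Send a GET request with a bearer token and return the HTTP status code."""
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses raise HTTPError; the status code is still what we want.
        return err.code


def classify(token, url, fetch_status=http_status):
    """Map the service's response to a finding status (labels are illustrative)."""
    status = fetch_status(url, token)
    if 200 <= status < 300:
        return "confirmed-active"
    if status == 401:
        return "inactive"
    return "unknown"  # rate limits, outages, permission errors: cannot tell


if __name__ == "__main__":
    # Hypothetical endpoint; a stub replaces the network call so the example runs offline.
    print(classify("example-token", "https://api.example.com/v1/me", lambda url, token: 401))
```

Passing the request function as a parameter keeps the classification logic testable without network access, which matters when validation runs inside CI.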
What are composite and multi-part rules?
Composite rules in Betterleaks combine a primary regex pattern with auxiliary patterns that must appear within a specified proximity in the source code. This approach reduces false positives for patterns that only matter near related identifiers – for example, a random-looking string is only flagged as an API key if a service name like STRIPE_KEY or aws_secret appears nearby. Betterleaks inherited this capability from Gitleaks and extended it with proximity matching configuration.
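A minimal sketch of proximity matching, assuming proximity is measured in lines. Both regexes, the window size, and the function name are hypothetical, and the real rule configuration format is not shown here.

```python
import re

# Primary pattern: a long random-looking alphanumeric string (illustrative only).
PRIMARY = re.compile(r"\b[A-Za-z0-9]{24,}\b")
# Auxiliary pattern: identifiers that make a nearby candidate worth flagging.
AUXILIARY = re.compile(r"stripe_key|aws_secret", re.IGNORECASE)


def composite_matches(text, within_lines=2):
    """Return (line_number, candidate) pairs where an auxiliary match lies within the window."""
    lines = text.splitlines()
    findings = []
    for index, line in enumerate(lines):
        for match in PRIMARY.finditer(line):
            start = max(0, index - within_lines)
            end = min(len(lines), index + within_lines + 1)
            if any(AUXILIARY.search(nearby) for nearby in lines[start:end]):
                findings.append((index + 1, match.group()))
    return findings


if __name__ == "__main__":
    sample = (
        "STRIPE_KEY =\n"
        "  'abcdEFGH1234ijklMNOP5678qrst'\n"
        "\n\n\n"
        "checksum = 'abcdEFGH1234ijklMNOP5678qrst'\n"
    )
    print(composite_matches(sample))
```

The same string appears twice in the sample, but only the occurrence next to STRIPE_KEY is reported. The one beside checksum falls outside the window and is ignored.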
Does Betterleaks support recursive decoding?
Yes. Betterleaks recursively decodes base64, hex, percent-encoding, and unicode escape sequences before applying detection rules. The decoding depth is configurable (default 5 levels). This catches secrets that developers have obfuscated or that build tools have encoded during packaging – a common pattern in older codebases where credentials end up base64-encoded in configuration files or environment variable exports.
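The idea can be sketched as a loop that keeps decoding until nothing changes or the depth limit is reached. This is an illustration under simplifying assumptions: the segment regex, the helper names, and the "printable UTF-8" acceptance test are made up for the example, and unicode escapes are left out for brevity.

```python
import base64
import re
from urllib.parse import unquote

# Runs of characters that could plausibly be hex or base64 (illustrative heuristic).
SEGMENT = re.compile(r"[A-Za-z0-9+/=]{16,}")


def decode_segment(segment):
    """Try hex, then base64; return the decoded text, or None if neither yields printable UTF-8."""
    for decoder in (bytes.fromhex, lambda s: base64.b64decode(s, validate=True)):
        try:
            text = decoder(segment).decode("utf-8")
        except ValueError:  # also covers binascii.Error and UnicodeDecodeError
            continue
        if text.isprintable():
            return text
    return None


def decoded_layers(text, max_depth=5):
    """Return the input followed by each successive layer uncovered by decoding."""
    layers = [text]
    current = text
    for _ in range(max_depth):
        step = unquote(current)  # percent-encoding
        step = SEGMENT.sub(lambda m: decode_segment(m.group()) or m.group(), step)
        if step == current:
            break
        layers.append(step)
        current = step
    return layers


if __name__ == "__main__":
    inner = "token=abcd1234efgh5678"
    wrapped = base64.b64encode(inner.encode()).decode().encode().hex()
    for layer in decoded_layers("value: " + wrapped):
        print(layer)
```

Detection rules would then run against every layer, so a credential wrapped in hex around base64 is still matched in its plain form.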
Does Betterleaks scan inside archives?
Betterleaks scans inside compressed archives including zip, tar, and nested archive formats via the --max-archive-depth flag. This ensures secrets hiding in vendored dependencies, bundled artifacts, or release packages don’t get missed during audits.
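A depth-limited walk over nested archives might look like the sketch below. It handles zip only and assumes the depth limit counts how many archive levels are opened. The exact semantics of --max-archive-depth are not documented here, and the function name and the "!" path separator are invented for the example.

```python
import io
import zipfile


def iter_archive_files(data, max_depth, prefix="", depth=0):
    """Yield (path, content) for files in a zip, opening nested zips up to max_depth levels."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            content = archive.read(info)
            path = f"{prefix}{info.filename}"
            nested = zipfile.is_zipfile(io.BytesIO(content))
            if nested and depth + 1 < max_depth:
                # Descend: files inside are reported as outer.zip!inner-file.
                yield from iter_archive_files(content, max_depth, path + "!", depth + 1)
            else:
                # Regular file, or a nested archive beyond the depth limit (left unopened).
                yield path, content
```

Each yielded file body can then be handed to the same detection pass used for ordinary files. The depth cap keeps deeply nested or maliciously crafted archives from consuming unbounded time and memory.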
Can you switch regex engines in Betterleaks?
Betterleaks supports two regex engines: Go’s standard library regex engine and RE2. RE2 provides guaranteed linear-time matching, which matters when scanning large files with complex patterns. You can switch between them based on your performance and compatibility needs.
Who created Betterleaks?
Betterleaks was created by Zachary Rice (GitHub handle: zricethezav), the original author of Gitleaks, one of the most popular open-source secrets scanners with over 25,000 GitHub stars. Rice is currently Head of Secrets Scanning at Aikido Security. He started the Betterleaks project on February 3, 2026, building on the lessons learned from years of maintaining Gitleaks. The project is hosted on GitHub under the MIT license and accepts community contributions.
Use Cases
CI/CD pipeline scanning. Run Betterleaks in your CI pipeline to block pull requests that introduce secrets. The --git-workers flag keeps scan times reasonable even on large repositories. SARIF output feeds directly into GitHub Advanced Security.
Pre-commit hook. Install Betterleaks as a pre-commit hook to catch secrets before they reach version control. Same workflow as Gitleaks; existing pre-commit configurations work with minimal changes.
Incident response. When you discover a leaked credential, use CEL-based validation to check whether the secret is still active. That tells you whether rotation is urgent or can wait.
Legacy codebase audits. Recursive decoding and archive scanning help find secrets that are base64-encoded, hex-encoded, or tucked inside zip files, which is common in older codebases.
Getting Started

- Install with brew install betterleaks on macOS, or pull the Docker image with docker pull ghcr.io/betterleaks/betterleaks:latest. On Fedora, use dnf install betterleaks. You can also build from source with Go.
- Run betterleaks git /path/to/repo to scan git history for secrets. Use betterleaks dir /path/to/dir for non-git directories. Add --git-workers 4 for parallelized scanning and -v for verbose output.
- To migrate from Gitleaks, place your existing .gitleaks.toml in the repository root. Betterleaks reads it natively. CLI flags are backwards-compatible, so just swap gitleaks for betterleaks in your scripts.
- Pass --report-path results.json --report-format json to save findings. Validated secrets are marked as confirmed-active. Upload SARIF output to GitHub Advanced Security with --report-format sarif.
Strengths & Limitations
Strengths:
- BPE tokenization measurably outperforms Shannon entropy for secret detection (98.6% vs 70.4% recall on CredData).
- CEL-based validation is user-configurable, unlike hardcoded verification in other tools.
- Drop-in Gitleaks replacement. No migration pain.
- Parallelized git scanning cuts wall-clock time on large repos.
- Recursive decoding catches encoded and obfuscated secrets.
- MIT license, no commercial restrictions.
Limitations:
- Very new project (created February 2026). The rule library is smaller than mature tools like Gitleaks or TruffleHog.
- 473 GitHub stars. Small community compared to Gitleaks (25k+) or TruffleHog (25k+). Ecosystem integrations (GitHub Actions, pre-commit hooks) are still catching up.
- No managed cloud platform. This is a CLI tool. Teams that want dashboards, team management, or hosted scanning should look at GitGuardian or TruffleHog’s commercial offering.
- CEL validation requires writing expressions per rule. Out-of-the-box coverage for common services is still limited.
How does Betterleaks compare to other secrets scanners?

| Feature | Betterleaks | Gitleaks | TruffleHog | GitGuardian |
|---|---|---|---|---|
| Detection method | BPE tokenization + regex | Entropy + regex | 800+ detectors | Pattern matching + ML |
| Secrets validation | CEL expressions (configurable) | No | Built-in verifiers (hardcoded) | Yes (commercial) |
| License | MIT | MIT | AGPL-3.0 | Freemium |
| Scan targets | Git, directories, stdin, archives | Git, directories, stdin | Git, Slack, S3, Docker, etc. | Git, CI/CD (commercial) |
| Parallelized scanning | Yes (--git-workers) | No | Yes | Yes |
| Recursive decoding | Yes (base64, hex, etc.) | Yes (v8.26+) | Limited | Yes |
| GitHub Stars | 473 | 25,500 | 25,100 | N/A |
Betterleaks fits best if you care about detection accuracy and configurable validation, especially if you’re already on Gitleaks and want a painless upgrade. TruffleHog is a better pick for teams that need scanning beyond git repos (Slack, S3, Docker images). GitGuardian is the way to go for enterprises that need dashboards, team management, and hosted scanning.