Agentic LLM Vulnerability Scanner / AI red teaming kit 🧪
Updated Feb 3, 2026 - Python
[NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts.
Simple Prompt Injection Kit for Evaluation and Exploitation
First-of-its-kind AI benchmark for evaluating the protection capabilities of large language model (LLM) guard systems (guardrails and safeguards)
A working POC of a GPT-5 jailbreak via PROMISQROUTE (Prompt-based Router Open-Mode Manipulation) with a barebones C2 server & agent generation demo.
LMAP (large language model mapper) is like Nmap for LLMs: an LLM vulnerability scanner and zero-day vulnerability fuzzer.
Implementation of paper 'Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing'
[ICML 2025] Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions
Jailbreak Evaluation Framework -- 2025 Graduate Design for HFUT
🔍 Benchmark jailbreak resilience in LLMs with JailBench for clear insights and improved model defenses against jailbreak attempts.
Chain-of-thought hijacking via template token injection for LLM censorship bypass (GPT-OSS)
Benchmark LLM jailbreak resilience across providers with standardized tests, adversarial mode, rich analytics, and a clean Web UI.
RetardBench is an open, no-censorship benchmark that ranks large language models purely on how retarded they are.
Mechanism-grounded taxonomy of 40 LLM jailbreak patterns across 10 categories. Full evaluation harness for 4 frontier models. AI safety research with responsible disclosure.
LLM Jailbreaking via Prompt Rewriting
Debugged version of the code for the "Tree of Attacks: Jailbreaking Black-Box LLMs Automatically" paper, with added GPU optimization.
PESU I/O "The Hacker's Gauntlet" 24-hour CTF
Build a runtime for Claude Code with persistence, observability, and multi-process coordination for long-running AI agents
Detect and prevent large language model jailbreaks using hidden state causal monitoring to enhance security in AI applications.
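A minimal sketch of the hidden-state monitoring idea: fit a small logistic-regression probe on a model's internal activation vectors, labeled jailbreak vs. benign, then flag prompts whose activations the probe scores highly. This is a toy pure-Python illustration under assumed inputs (pre-extracted hidden-state vectors), not the repository's implementation.

```python
import math

def train_probe(states, labels, dim, epochs=200, lr=0.5):
    """Fit a logistic-regression probe (w, b) on hidden-state vectors
    via plain gradient descent. `states` is a list of dim-length vectors,
    `labels` is 1 for jailbreak, 0 for benign."""
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(states, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid
            g = p - y                        # gradient of log-loss w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def probe_score(w, b, x):
    """Probability the probe assigns to 'jailbreak' for hidden state x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))
```

In a real monitor the vectors would come from a chosen transformer layer at inference time, and a score above a tuned threshold would trigger refusal or logging.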
The self-hosted AI firewall and gateway: drop-in guardrails for LLMs running entirely on CPU. Blocks jailbreaks, enforces policies, and ensures compliance in real time.