agent-testing

AI Agent Security Testing — 112 attacks across 14 categories. Prompt injection, jailbreaks, MCP poisoning, agency hijacking & more. Test any AI agent in 5 minutes.

Updated Feb 13, 2026
TypeScript

kimtth / agent-auto-eval-azure-aoai-sk

Agent testing automation 🤖 by simulating users 👥 and agents 🤝 with a judge ⚖️ (langwatch-scenario)

scenario azure-openai semantic-kernel agent-testing

Updated Jul 4, 2025
Python

vksundararajan / cross-check

A Multi-Agent System for Cross-Checking Phishing URLs.

dockerfile pytest cybersecurity adk mesop agent-development agent-evals adk-python agent-testing

Updated Dec 17, 2025
Python

NathanMaine / agentic-evaluation-sandbox

Holdout scenario evaluation harness for AI agents. Doer/Judge/Adversary/Observer roles, probabilistic satisfaction scoring, append-only JSONL audit trails with integrity hashes. Created Dec 2025.

python compliance software-factory audit-trail ai-evaluation agentic-ai agent-testing holdout-scenarios deterministic-agents

Updated Feb 23, 2026
Python

vysotin / agentic_evals_docs

AI Agent Evaluation and Monitoring Guide

documentation monitoring best-practices guidelines agents evaluation-metrics evals agentic-ai agent-testing

Updated Jan 27, 2026

yurekami / aegis

Open-source agent simulation and runtime control platform for Claude Code

python open-source simulation mcp evaluation ai-safety llm claude-code agent-testing runtime-control

Updated Feb 18, 2026
Python

Avead556 / probellm

PHP testing framework for LLM agents — multi-turn dialogs, cassette replay, tool calling, LLM-as-judge assertions

testing php ai phpunit openai llm elevenlabs anthropic llm-testing agent-testing

Updated Feb 20, 2026
PHP

Personaz1 / prompt-qa-lab

Regression and evaluation toolkit for prompt and agent output quality

python open-source regression-testing ai-evaluation llm prompt-engineering agent-testing

Updated Feb 8, 2026
Python

corradocavalli / agentic_evaluation

Demonstration of testing and evaluation patterns for AI agents using Azure AI evaluation tools with custom evaluators

evaluation ai-agents azure-ai agent-framework azure-ai-evaluations agent-testing

Updated Feb 9, 2026
Python

gur3245singh / nomos

🧮 Solve mathematical problems and write proofs in natural language using this easy-to-use reasoning harness. Enhance your problem-solving skills effortlessly.

java spring-boot frontend rule-engine forms declarative joi codechef ros gazebo plagiarism plagiarism-detection plagiarism-detector react-hook-form agent-testing

Updated Feb 23, 2026
Python

agent-testing

Here are 15 public repositories matching this topic...

langwatch / better-agents

langwatch / scenario

dowhiledev / nomos

inkog-io / inkog

pyros-projects / agent-comparison

ClawdeRaccoon / pwnclaw

kimtth / agent-auto-eval-azure-aoai-sk

vksundararajan / cross-check

NathanMaine / agentic-evaluation-sandbox

vysotin / agentic_evals_docs

yurekami / aegis

Avead556 / probellm

Personaz1 / prompt-qa-lab

corradocavalli / agentic_evaluation

gur3245singh / nomos
