content-safety

Star

Here are 44 public repositories matching this topic...

open-bias / open-bias

Star

Open Source Agent Alignment: Make your agents follow rules. One line of code to‎ ‎enforce, trace, and improve. ‎ ‎

Updated Apr 26, 2026
Python

roostorg / coop

Star

Review and moderation, your way. Online safety dashboard, queues, routing and automatic enforcement rules, and integrations.

trust-and-safety roost child-safety content-safety roost-tools roost-coop

Updated Apr 25, 2026
TypeScript

🛡️ Programmable Guardrails for LLM Applications in Java. A framework-agnostic toolkit for input/output validation, PII masking, and jailbreak detection. The Java alternative to NVIDIA NeMo Guardrails.

java library spring yaml-configuration input-validation jailbreak-detection ai-security guardrails toxicity-detection llm prompt-injection content-safety output-validation pii-masking

Updated Apr 15, 2026
Java

withinJoel / Content-Moderation-Engine

Sponsor

Star

A JavaScript-based content safety system designed to detect and filter sensitive media in real-time, ensuring platform compliance and user protection.

javascript compliance filtering content-safety content-safety-checks

Updated Jan 27, 2026
JavaScript

cristofima / TaskAgent-AgenticAI

Star

An intelligent task management assistant built with .NET, Next.js, Microsoft Agent Framework, AG-UI protocol, and Azure OpenAI, demonstrating Clean Architecture and autonomous AI agent capabilities

sql-server dotnet nextjs postgresql e2e-tests clean-architecture unit-tests agent-framework ai-agent vitest playright azure-open-ai content-safety ag-ui-protocol

Updated Feb 16, 2026
C#

Azure-Samples / rai-content-safety-workshop

Star

Step-by-Step tutorial that teaches you how to use Azure Safety Content - the prebuilt AI service that helps ensure that content sent to user is filtered to safeguard them from risky or undesirable outcomes

azure aiml workshop-materials responsible-ai content-safety

Updated Jul 1, 2024
Jupyter Notebook

RAGulatorAPP / RAGulator

Star

Transform uncertainty into absolute confidence.

javascript microsoft app azure teams net foundry metricas ai-agents rag entra-id content-safety document-intelligence-rag

Updated Mar 30, 2026
C#

RafaelParonis / jailbench

Star

🔍 Benchmark jailbreak resilience in LLMs with JailBench for clear insights and improved model defenses against jailbreak attempts.

python flask analytics openai alignment model-evaluation ai-safety security-testing red-teaming model-robustness anthropic litellm content-safety llm-jailbreaks tool-calling llm-benchmark ai-evals textual-tui

Updated Apr 26, 2026
Python

Dandona100 / SafeEyes

Star

│ Real-time NSFW & harmful content detection as a service

python docker ai safety image-classification nsfw content-moderation fastapi nsfw-detection content-safety

Updated Apr 7, 2026
Python

vibheksoni / jailbench

Star

Benchmark LLM jailbreak resilience across providers with standardized tests, adversarial mode, rich analytics, and a clean Web UI.

Updated Aug 12, 2025
Python

inferwall / inferwall

Star

AI application firewall for LLM-powered apps — multi-layered detection (heuristic, ML classifier, semantic, LLM-judge) against prompt injection, jailbreaks, and data leakage - inferwall.com

python rust security machine-learning ai firewall jailbreak waf ai-safety onnx data-leakage llm prompt-injection llm-security content-safety

Updated Apr 13, 2026
Python

sammydeprez / presentations

Star

Technical presentations with hands-on demos

python machine-learning ai jupyter-notebook presentations educational ai-agents azure-ai responsible-ai azure-openai llm prompt-engineering langchain content-safety langgraph

Updated Apr 17, 2026
Jupyter Notebook

Napiersnotes / AlignmentVirusV6

Star

Production-Grade LLM Alignment Engine (TruthProbe + ADT)

alignment ai-safety llm safe-ai content-safety

Updated Jan 10, 2026
Python

Moshe-ship / sarih

Star

Arabic Content Moderator — scan text for toxicity, hate speech, spam. Dialect-aware. Fully offline.

nlp spam moderation arabic arabic-nlp hate-speech toxicity content-safety

Updated Apr 4, 2026
Python

OrenGrinker / contentSafetyFilter

Star

A Chrome extension that uses Claude AI to protect users under 18 from inappropriate content by analyzing webpage content in real-time.

nodejs chrome-extension typescript chrome-extensions claude-ai claude-api content-safety claude-3-sonnet

Updated Nov 30, 2024
TypeScript

cristofima / Demo-AzureAIContentSafety

Star

Content moderation (text and image) in a social network demo

angular dot-net azure-storage content-moderation content-safety

Updated Jan 5, 2025
C#

melroyanthony / llm-guardrails

Star

Responsible AI toolkit for LLM applications: PII/PHI redaction, prompt injection detection, bias scoring, content safety filters, and output validation. Framework-agnostic Python library with FastAPI demo.

python compliance hipaa gdpr bias-detection fastapi guardrails responsible-ai pii-redaction llm prompt-injection content-safety

Updated Feb 17, 2026
Python

ijerrywong / safe-share

Star

Pre-Publish Security Gate - Scan and redact sensitive information before sharing

python redaction security automation privacy ocr skill pii content-safety

Updated Mar 19, 2026
Python

AUTHENSOR / AUTHENSOR

Star

The open-source safety stack for AI agents. Policy engine, content scanner, approval workflows, audit trails. 924+ tests. MIT licensed.

Updated Mar 28, 2026
TypeScript

DeekeScript / ad-douyin-report

Star

抖音视频审核检测|同行举报分析工具|抖音视频风控|抖音风控||优化视频|举报同行|视频监测|视频检测

compliance douyin risk-control content-moderation competitor-analysis video-audit content-safety

Updated Apr 21, 2026
Python

Improve this page

Add a description, image, and links to the content-safety topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the content-safety topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

content-safety

Here are 44 public repositories matching this topic...

open-bias / open-bias

roostorg / coop

Ratila1 / JGuardrails

withinJoel / Content-Moderation-Engine

cristofima / TaskAgent-AgenticAI

Azure-Samples / rai-content-safety-workshop

RAGulatorAPP / RAGulator

RafaelParonis / jailbench

Dandona100 / SafeEyes

vibheksoni / jailbench

inferwall / inferwall

sammydeprez / presentations

Napiersnotes / AlignmentVirusV6

Moshe-ship / sarih

OrenGrinker / contentSafetyFilter

cristofima / Demo-AzureAIContentSafety

melroyanthony / llm-guardrails

ijerrywong / safe-share

AUTHENSOR / AUTHENSOR

DeekeScript / ad-douyin-report

Improve this page

Add this topic to your repo