Probabilistic Guardrails for AI-Generated Code
Bound unsafe code acceptance rates with statistical guarantees using Conformal Prediction.
For reviewers evaluating architectural decision-making and research depth:
| Document | Description |
|---|---|
| Decision Log | Comprehensive record of technical and product decisions with alternatives considered, rationale, and trade-offs |
| Deep Research Analysis | Foundational research on Conformal Prediction, SAST tools, and statistical guarantees for AI safety |
| Architecture | System design, component interactions, scalability constraints, and extensibility patterns |
| Risk Register | Security, operational, and theoretical risks with mitigations |
| Product Roadmap | Phased development plan from MVP through enterprise scale |
| CHANGELOG | Version history and migration guides |
| Aspect | Summary |
|---|---|
| Who | Security teams & platform engineers shipping AI coding assistants |
| Problem | LLM-generated code can be unsafe; static rules alone are brittle; teams need calibrated risk control |
| Solution | Deterministic SAST risk score (Bandit) + conformal calibration threshold + accept/reject gate |
| Guarantee | Bounded unsafe code acceptance rate ≤ α with statistical validity |
| Metric | Value |
|---|---|
| Acceptance Rate | 80% |
| Scanner-Flagged Accept Bound | ≤10% |
| False Reject Rate | ~20% |
| Median Latency | <500ms |
| Cost per Eval | $0.00 (local Bandit) |
```bash
# Clone & install
git clone https://github.com/LEDazzio01/Assured-Sentinel.git && cd Assured-Sentinel
pip install -e ".[dev]"

# Run the demo (works offline - no API key needed)
sentinel demo
```

```python
from assured_sentinel import Commander, BanditScorer

# Initialize with calibrated threshold
commander = Commander()

# Verify a safe code snippet
result = commander.verify("print('Hello, World!')")
print(f"Status: {result.status}, Score: {result.score}")
# Status: PASS, Score: 0.0

# Verify dangerous code
result = commander.verify("exec(user_input)")
print(f"Status: {result.status}, Score: {result.score}")
# Status: REJECT, Score: 0.5
```

Demo output:

```text
=== ASSURED SENTINEL - OFFLINE DEMO ===

📝 Testing: exec(user_input)
🔍 Bandit Score: 0.5 (MEDIUM severity)
🚫 Decision: REJECT (Score 0.5 > Threshold 0.15)

📝 Testing: def factorial(n): return 1 if n <= 1 else n * factorial(n-1)
🔍 Bandit Score: 0.0 (Clean)
✅ Decision: PASS (Score 0.0 <= Threshold 0.15)
```
```text
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│    User     │────▶│   Analyst   │────▶│  Commander  │────▶ Accept/Reject
│    Query    │     │  (LLM Gen)  │     │ (Guardrail) │
└─────────────┘     └──────┬──────┘     └──────┬──────┘
                           │                   │
                           ▼                   ▼
                     Azure OpenAI        ┌───────────┐
                  (gpt-4o, temp=0.8)     │  Scorer   │
                                         │ (Bandit)  │
                                         └─────┬─────┘
                                               │
                                               ▼
                                          Calibrated
                                         Threshold (q̂)
```
| Agent | Role | Behavior |
|---|---|---|
| Analyst | Generator | High-temperature (0.8) LLM for creative code proposals |
| Commander | Guardrail | Deterministic verification against calibrated threshold |
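The division of labor above can be sketched as a generate-verify loop. This is a simplified illustration with hypothetical stand-in callables, not the real `Analyst`/`Commander` classes:

```python
from dataclasses import dataclass

@dataclass
class Result:
    status: str   # "PASS" or "REJECT"
    score: float

def correction_loop(generate, verify, task: str, max_retries: int = 3) -> Result:
    """Generate code with the Analyst, gate it with the Commander.

    `generate` and `verify` stand in for Analyst generation and
    Commander verification; retry on rejection, fail closed after
    the retry budget is exhausted.
    """
    result = Result("REJECT", 1.0)
    for _ in range(max_retries):
        code = generate(task)      # high-temperature, creative
        result = verify(code)      # deterministic, calibrated
        if result.status == "PASS":
            return result
    return result                  # still REJECT: fail closed

# Toy stand-ins to show the control flow (not the real agents):
attempts = iter(["exec(user_input)", "print('hello')"])
gen = lambda task: next(attempts)
ver = lambda code: Result("REJECT", 0.5) if "exec" in code else Result("PASS", 0.0)
print(correction_loop(gen, ver, "say hello").status)  # PASS (on second attempt)
```

Because generation and verification are separate objects, the Commander never "grades its own homework" — it only sees candidate code and a deterministic score.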
- Decoupled generation from verification — prevents self-delusion in single-agent loops
- Deterministic scoring — reproducible, auditable security decisions
- Fail-closed by default — unparseable code is rejected, not passed
- SOLID principles — extensible, testable, maintainable
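The fail-closed rule in particular fits in a few lines. A minimal sketch, assuming a `scan` callable standing in for the Bandit-backed scorer:

```python
import ast

def fail_closed_score(code: str, scan) -> float:
    """If the snippet cannot even be parsed, return the maximum
    risk score (1.0) instead of letting it through unexamined.
    `scan` is a stand-in for the real scorer."""
    try:
        ast.parse(code)
    except SyntaxError:
        return 1.0          # unparseable -> treated as maximally risky
    return scan(code)

print(fail_closed_score("def f(:", lambda c: 0.0))      # 1.0 despite a "clean" scanner
print(fail_closed_score("print('ok')", lambda c: 0.0))  # 0.0
```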
```text
Assured-Sentinel/
├── assured_sentinel/           # Main package
│   ├── __init__.py             # Package exports
│   ├── config.py               # Pydantic Settings (centralized config)
│   ├── models.py               # Pydantic models (DTOs)
│   ├── protocols.py            # Protocol interfaces (ISP/DIP)
│   ├── exceptions.py           # Custom exception hierarchy
│   ├── core/                   # Core business logic
│   │   ├── scorer.py           # BanditScorer (IScoringService)
│   │   ├── commander.py        # Commander (IVerifier)
│   │   └── calibrator.py       # ConformalCalibrator
│   ├── agents/                 # LLM integration
│   │   └── analyst.py          # AzureAnalyst (ICodeGenerator)
│   ├── cli/                    # Command-line interface
│   │   └── main.py             # CLI entry points
│   └── dashboard/              # Streamlit UI
│       └── app.py              # Dashboard application
├── tests/                      # Comprehensive test suite
│   ├── conftest.py             # Shared fixtures
│   ├── unit/                   # Unit tests
│   │   ├── test_scorer.py
│   │   ├── test_commander.py
│   │   ├── test_calibrator.py
│   │   ├── test_analyst.py
│   │   ├── test_models.py
│   │   └── test_exceptions.py
│   └── integration/            # Integration tests
│       └── test_full_flow.py
├── docs/                       # Documentation
│   ├── Architecture.md
│   ├── Decision-log.md
│   ├── PRD.md
│   ├── Risks.md
│   └── Roadmap.md
├── pyproject.toml              # Project configuration
├── requirements.txt            # Dependencies
├── Makefile                    # Development shortcuts
├── CHANGELOG.md                # Version history
└── README.md                   # This file
```
```bash
# Verify a code snippet
sentinel verify "print('hello')"

# Verify from file
sentinel verify --file script.py

# Override threshold
sentinel verify --threshold 0.05 "eval(input())"

# Output as JSON (for CI/CD integration)
sentinel verify --json "exec(x)"

# Scan a directory
sentinel scan ./src --recursive

# Run calibration
sentinel calibrate --alpha 0.1 --samples 100

# Run demo
sentinel demo

# Run LLM correction loop (requires Azure OpenAI)
sentinel run "Write a function to calculate factorial"
```

All settings can be configured via environment variables with the `SENTINEL_` prefix:
```bash
# Core settings
SENTINEL_ALPHA=0.1                # Risk tolerance
SENTINEL_DEFAULT_THRESHOLD=0.15   # Fallback threshold
SENTINEL_LOG_LEVEL=INFO           # Logging level

# Azure OpenAI (optional - for the Analyst)
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your-key-here
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o
```

Programmatic configuration:

```python
from assured_sentinel import Settings, Commander
from assured_sentinel.core.scorer import BanditScorer
from assured_sentinel.models import ScoringConfig

# Custom scoring configuration
scoring_config = ScoringConfig(
    timeout_seconds=60,
    fail_closed=True,
    use_ramdisk=True,  # Performance optimization
)
scorer = BanditScorer(config=scoring_config)

# Custom commander
commander = Commander(scorer=scorer)
commander.threshold = 0.2  # Override threshold
```

Development commands:

```bash
# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=assured_sentinel --cov-report=term-missing --cov-report=html

# Run only unit tests
pytest tests/unit/ -v

# Run only integration tests
pytest tests/integration/ -v

# Type checking
mypy assured_sentinel/

# Linting
ruff check assured_sentinel/ tests/
```

| α (Risk Tolerance) | Acceptance Rate | Unsafe Accept Bound | False Reject Rate |
|---|---|---|---|
| 0.05 | 70% | ≤5% | 30% |
| 0.10 | 80% | ≤10% | 20% |
| 0.20 | 90% | ≤20% | 10% |
| Severity | Score | Action |
|---|---|---|
| Clean | 0.0 | ✅ Accept |
| Low | 0.1 | ✅ Accept (below q̂) |
| Medium | 0.5 | 🚫 Reject |
| High | 1.0 | 🚫 Reject |
| Parse Error | 1.0 | 🚫 Reject (fail-closed) |
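The table above amounts to a simple decision rule. A sketch of it in code — the `SEVERITY_SCORE` mapping and `decide` helper are illustrative, not the real `BanditScorer` API:

```python
# Hypothetical mapping from Bandit severity labels to the scores in the
# table above; unknown labels fall through to 1.0 (fail closed).
SEVERITY_SCORE = {"CLEAN": 0.0, "LOW": 0.1, "MEDIUM": 0.5, "HIGH": 1.0}

def decide(severity: str, threshold: float = 0.15) -> str:
    """Accept only when the mapped score is at or below the threshold."""
    score = SEVERITY_SCORE.get(severity.upper(), 1.0)
    return "PASS" if score <= threshold else "REJECT"

print(decide("LOW"))     # PASS   (0.1 <= 0.15)
print(decide("MEDIUM"))  # REJECT (0.5 > 0.15)
```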
We implement Split Conformal Prediction (SCP) for distribution-free uncertainty quantification. The calibrated threshold $\hat{q}$ is chosen so that:

$$\mathbb{P}\left(Y_{n+1} \in C(X_{n+1})\right) \geq 1 - \alpha$$

Where:

- $\alpha$ = risk tolerance (default: 0.10)
- $C(X)$ = conformity set (accepted code with score ≤ $\hat{q}$)
- $Y_{n+1}$ = new code sample's scanner score
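A minimal sketch of how a threshold like $\hat{q}$ is obtained in split conformal prediction, using the standard $\lceil (n+1)(1-\alpha) \rceil$ order statistic of the calibration scores (the real `ConformalCalibrator` may differ in detail):

```python
import math

def conformal_threshold(cal_scores, alpha=0.10):
    """Split conformal threshold: the ceil((n+1)(1-alpha))-th order
    statistic of the calibration scores. Accepting new samples whose
    score is <= this value yields coverage >= 1 - alpha under
    exchangeability."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # which order statistic to take
    rank = min(rank, n)                      # guard against tiny calibration sets
    return sorted(cal_scores)[rank - 1]

# 100 calibration scores: 85 clean, 10 low-severity, 5 medium-severity
scores = [0.0] * 85 + [0.1] * 10 + [0.5] * 5
print(conformal_threshold(scores, alpha=0.10))  # 0.1 -> low-severity still accepted
```

Note how tightening α raises the threshold's rank: at α = 0.05 the same calibration set yields 0.5, rejecting more and accepting less, which is exactly the trade-off in the α table above.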
| ✅ Guaranteed | ❌ Not Guaranteed |
|---|---|
| Statistical coverage on scanner-defined scores | Absence of all vulnerabilities |
| Bounded false acceptance rate w.r.t. Bandit findings | Robustness to distribution shift |
| Reproducible, deterministic decisions | Semantic security or logical correctness |
| Finite-sample validity | Protection against adversarial evasion |
This project follows SOLID principles:
| Principle | Implementation |
|---|---|
| Single Responsibility | Separate classes for scoring, verification, calibration |
| Open/Closed | Protocols allow new scorers without modifying Commander |
| Liskov Substitution | All scorers implement IScoringService protocol |
| Interface Segregation | Small, focused protocols (IScoringService, IVerifier, etc.) |
| Dependency Inversion | Commander depends on abstractions, not concrete scorers |
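A condensed illustration of how the protocol-based design lets scorers be swapped without touching the gate. `IScoringService` is a name from the repo, but `ConstantScorer` and `MiniCommander` here are hypothetical stand-ins, not the real classes:

```python
from typing import Protocol

class IScoringService(Protocol):
    """Sketch of the scoring protocol; the actual signature may differ."""
    def score(self, code: str) -> float: ...

class ConstantScorer:
    """Any object with a matching `score` method satisfies the protocol --
    structural typing, no inheritance -- so new scorers plug in without
    modifying the gate (Open/Closed, Dependency Inversion)."""
    def __init__(self, value: float) -> None:
        self.value = value
    def score(self, code: str) -> float:
        return self.value

class MiniCommander:
    def __init__(self, scorer: IScoringService, threshold: float = 0.15) -> None:
        self.scorer = scorer        # depends on the abstraction only
        self.threshold = threshold
    def verify(self, code: str) -> str:
        return "PASS" if self.scorer.score(code) <= self.threshold else "REJECT"

print(MiniCommander(ConstantScorer(0.0)).verify("print('hi')"))  # PASS
print(MiniCommander(ConstantScorer(0.5)).verify("exec(x)"))      # REJECT
```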
- Analyst/Commander two-agent pattern
- Bandit-based deterministic scoring
- MBPP calibration with synthetic injection
- Streamlit dashboard
- Correction loop (retry on rejection)
- SOLID principles implementation
- Protocol-based interfaces
- Pydantic models and settings
- Custom exception hierarchy
- Comprehensive test suite (90%+ coverage target)
- Dependency injection
- Multi-signal scoring (Semgrep, secret scanning)
- CI/CD integration (GitHub Actions, Azure DevOps)
- Drift monitoring & auto-recalibration
- API endpoint for enterprise integration
See Roadmap for detailed milestones.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Run tests (`pytest tests/ -v`)
- Run type checks (`mypy assured_sentinel/`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
```bash
# Install with dev dependencies
pip install -e ".[dev]"

# Install pre-commit hooks (recommended)
pre-commit install
```

- Vovk, V., Gammerman, A., & Shafer, G. (2005). Algorithmic Learning in a Random World
- Manokhin, V. (2022). Practical Applied Conformal Prediction
- Bandit Documentation
- MBPP Dataset
MIT License — see LICENSE for details.
Assured Sentinel v2.0
Deterministic safety for stochastic systems.