Conative Gating Roadmap

Development roadmap for the SLM-as-Cerebellum policy enforcement system.

Current Status

Phase 1 Complete - Policy Oracle implemented with full CLI

| Component | Status | Notes |
|-----------|--------|-------|
| Policy Oracle (Rust) | COMPLETE | Deterministic rule checking working |
| CLI Tool | COMPLETE | scan, check, validate, init, completions |
| Nickel Configuration | COMPLETE | Type-safe policy schema |
| Training Data Structure | COMPLETE | compliant/violations/edge_cases |
| SLM Evaluator | PLACEHOLDER | Interface defined, needs llama.cpp |
| Consensus Arbiter | STARTED | GenServer skeleton, Application module, decide/3 logic |
| LLM Integration | NOT STARTED | Depends on deployment context |

Phase 1: Policy Oracle COMPLETE

Deliverables

  • Core data types (Proposal, PolicyVerdict, ViolationType)

  • Language tier system (Tier 1/Tier 2/Forbidden)

  • Exception rules (Python in salt/, training/)

  • Toolchain rules (npm requires deno.json)

  • Forbidden pattern detection (hardcoded secrets)

  • Directory scanning with intelligent filtering

  • CLI with multiple output formats (text, json, compact)

  • Shell completion generation (bash, zsh, fish)

  • Man page generation

  • Nickel policy configuration

  • Unit tests for core functionality
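The core data types from the deliverables above can be sketched roughly as follows. This is an illustrative reconstruction, not the crate's actual API: the field and variant names (`path`, `content`, `HardcodedSecret`, etc.) are guesses at the shape described, and the secret check is a toy stand-in for the real forbidden-pattern detection.

```rust
// Hypothetical shapes for Proposal, PolicyVerdict, and ViolationType;
// names are illustrative, not the published crate API.

#[derive(Debug, Clone, PartialEq)]
enum ViolationType {
    ForbiddenLanguage,     // language outside Tier 1/Tier 2
    MissingToolchainFile,  // e.g. npm without deno.json
    HardcodedSecret,       // forbidden pattern match
}

#[derive(Debug, Clone)]
struct Proposal {
    path: String,    // file the proposal would create or modify
    content: String, // proposed file content
}

#[derive(Debug)]
enum PolicyVerdict {
    Compliant,
    Violation(Vec<ViolationType>),
}

// Toy deterministic check standing in for the "hardcoded secrets" rule.
fn check_secrets(proposal: &Proposal) -> PolicyVerdict {
    if proposal.content.contains("AWS_SECRET_ACCESS_KEY=") {
        PolicyVerdict::Violation(vec![ViolationType::HardcodedSecret])
    } else {
        PolicyVerdict::Compliant
    }
}

fn main() {
    let p = Proposal {
        path: "src/config.rs".into(),
        content: "AWS_SECRET_ACCESS_KEY=abc123".into(),
    };
    println!("{:?}", check_secrets(&p));
}
```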

CLI Commands

conative scan <path>           # Scan directory tree
conative check --file <path>   # Check single file
conative check --content <str> # Check inline content
conative policy                # Display policy
conative validate <proposal>   # Validate JSON proposal
conative init                  # Initialize .conative/
conative completions <shell>   # Generate completions
conative man                   # Generate man page

Phase 2: SLM Evaluator IN PROGRESS

Goals

Implement neural "spirit violation" detection using a fine-tuned Small Language Model.

Tasks

  • Define SlmEvaluation struct and interface

  • Integrate llama.cpp Rust bindings

  • Load GGUF model files

  • Implement prompt engineering for policy detection

  • Add confidence scoring

  • Benchmark latency on target hardware

  • Create test suite for spirit violations
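One plausible shape for the `SlmEvaluation` struct and evaluator interface from the task list, sketched with a stub in place of the model. Everything here is an assumption: the trait name, fields, and heuristic are placeholders until the llama.cpp bindings are wired in.

```rust
// Hypothetical Phase 2 evaluator interface; a GGUF-backed model
// (e.g. Phi-3-mini via llama.cpp) would replace the stub below.

#[derive(Debug)]
struct SlmEvaluation {
    spirit_violation: bool, // model's judgment against policy intent
    confidence: f32,        // 0.0..=1.0, from the confidence-scoring task
    rationale: String,      // short explanation retained for audit logs
}

trait SlmEvaluator {
    fn evaluate(&self, proposal: &str, policy: &str) -> SlmEvaluation;
}

struct StubEvaluator;

impl SlmEvaluator for StubEvaluator {
    fn evaluate(&self, proposal: &str, _policy: &str) -> SlmEvaluation {
        // Toy heuristic for the "verbosity smell" spirit violation;
        // real inference replaces this with a prompted model call.
        let verbose = proposal.lines().count() > 50;
        SlmEvaluation {
            spirit_violation: verbose,
            confidence: if verbose { 0.8 } else { 0.6 },
            rationale: (if verbose { "verbosity smell" } else { "ok" }).to_string(),
        }
    }
}

fn main() {
    let eval = StubEvaluator.evaluate("fn main() {}", "tier-1 only");
    println!("{:?}", eval);
}
```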

Model Selection

| Model | Parameters | Use Case |
|-------|------------|----------|
| Phi-3-mini | 3.8B | Primary candidate - good balance of speed/quality |
| Gemma-2B | 2B | Faster, lower quality - mobile/edge |
| Phi-3-small | 7B | Higher quality - desktop/server |

Spirit Violations to Detect

  • README bloat with meta-framework commentary

  • Verbosity smell (over-explanation)

  • Technically compliant but intent-violating code

  • "Helpful" additions that weren’t requested

  • Suspicious pattern deviations

Phase 3: Training Pipeline

Goals

Build dataset and fine-tune SLM for adversarial policy detection.

Tasks

  • Expand training data (target: 1000+ examples)

  • Balance dataset (violations vs compliant)

  • Implement QLoRA fine-tuning pipeline

  • Create validation holdout set

  • Implement weighted loss function:

    • violation_detected: 2.0

    • violation_missed: 3.0

    • false_positive: 0.5

  • Evaluate precision/recall tradeoffs

  • Export to GGUF format
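The asymmetric weighting from the loss-function task can be illustrated as follows. The per-outcome weights (2.0 / 3.0 / 0.5) come from the list above; the loss form (a weighted mean over per-example base losses) and the 1.0 baseline for correct passes are assumptions for illustration.

```rust
// Illustrative asymmetric loss weighting for the Phase 3 fine-tuning
// pipeline; the actual training loop would apply these per token/example.

fn outcome_weight(predicted_violation: bool, actual_violation: bool) -> f64 {
    match (predicted_violation, actual_violation) {
        (true, true) => 2.0,   // violation_detected: reinforce true positives
        (false, true) => 3.0,  // violation_missed: penalize misses hardest
        (true, false) => 0.5,  // false_positive: mild penalty, over-blocking is cheap
        (false, false) => 1.0, // correct pass: baseline weight (assumed)
    }
}

// Weighted mean loss over a batch of (prediction, label, base_loss) triples.
fn weighted_loss(examples: &[(bool, bool, f64)]) -> f64 {
    let total: f64 = examples
        .iter()
        .map(|&(pred, actual, base)| outcome_weight(pred, actual) * base)
        .sum();
    total / examples.len() as f64
}

fn main() {
    // A missed violation costs 3x a correct pass at equal base loss.
    println!("{}", weighted_loss(&[(false, true, 1.0), (false, false, 1.0)]));
}
```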

Training Data Structure

training/
  compliant/         # Should pass - Tier 1 languages, proper toolchain
  violations/        # Hard violations - forbidden languages, secrets
  edge_cases/        # Spirit violations - needs SLM judgment

Phase 4: Consensus Arbiter

Goals

Implement modified PBFT consensus with asymmetric weighting in Elixir/OTP.

Tasks

  • Set up Elixir/OTP project structure

  • Implement GenServer for arbiter

  • Define consensus protocol messages

  • Implement asymmetric weighting (SLM = 1.5x)

  • Add escalation logic

  • Implement audit logging

  • Create supervision tree

  • Add Rustler NIFs for Oracle/SLM calls

  • Write property-based tests

Decision Thresholds

| Threshold | Default | Purpose |
|-----------|---------|---------|
| `slm_weight` | 1.5 | SLM vote multiplier |
| `escalate_threshold` | 0.4 | When to ask human |
| `block_threshold` | 0.7 | When to reject outright |
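A minimal sketch of how the decide/3 aggregation might apply these defaults, shown in Rust for brevity rather than the target Elixir. The score normalization (weighted flagged votes over total weight) is an assumption; the real GenServer protocol may aggregate differently.

```rust
// Hypothetical asymmetric vote aggregation for the Consensus Arbiter.
// With these defaults, the SLM flagging alone (1.5/3.5 ~ 0.43) crosses
// the escalate threshold, while a single 1.0-weight vote does not.

#[derive(Debug, PartialEq)]
enum Decision {
    Approve,
    Escalate, // ask a human
    Block,
}

fn decide(oracle_flags: bool, slm_flags: bool, llm_flags: bool) -> Decision {
    const SLM_WEIGHT: f64 = 1.5;
    const ESCALATE_THRESHOLD: f64 = 0.4;
    const BLOCK_THRESHOLD: f64 = 0.7;

    let votes = [(oracle_flags, 1.0), (slm_flags, SLM_WEIGHT), (llm_flags, 1.0)];
    let total: f64 = votes.iter().map(|&(_, w)| w).sum();
    let flagged: f64 = votes.iter().filter(|&&(v, _)| v).map(|&(_, w)| w).sum();
    let score = flagged / total;

    if score >= BLOCK_THRESHOLD {
        Decision::Block
    } else if score >= ESCALATE_THRESHOLD {
        Decision::Escalate
    } else {
        Decision::Approve
    }
}

fn main() {
    println!("{:?}", decide(false, true, false)); // SLM alone -> Escalate
}
```

The asymmetry is visible in the boundary cases: the SLM dissenting alone escalates to a human, while any single unweighted checker dissenting alone still approves.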

Phase 5: Integration

Claude Code Integration

  • Define hook interface

  • Implement pre-commit hook

  • Create real-time evaluation mode

  • Add Claude Code config schema

  • Write integration tests

NeuroPhone Integration

  • Port to mobile-compatible format

  • Optimize for Oppo Reno 13 / edge devices

  • Integrate with reservoir computing layer

  • Add feedback loop for continuous learning

ECHIDNA Integration

  • Define SLM as "prover" interface

  • Implement proof certificate generation

  • Add to multi-prover orchestration

Phase 6: Experimentation

Research Questions

| Question | Approach |
|----------|----------|
| Optimal SLM size | Benchmark Phi-3-mini vs Gemma-2B vs Phi-3-small |
| Training data volume | Learning curves with 100/500/1000 examples |
| Asymmetry calibration | Vary weight 1.0-2.0, measure precision/recall |
| Spirit detection accuracy | Human evaluation of SLM judgments |
| Latency budget | Profile end-to-end decision time |
| Adversarial robustness | Red team: can the LLM fool the SLM? |
| Cross-project generalization | Test SLM trained on one policy against others |

Experiments

  1. Violation Detection Accuracy

    • 500 proposals (250 violations, 250 compliant)

    • Baseline: GPT-4 with policy in context

    • Test: Fine-tuned Phi-3-mini

  2. Latency Impact

    • Measure with/without SLM gating

    • Target: <500ms for interactive use

  3. Weight Optimization

    • Pareto frontier: false positives vs missed violations

Future Directions

Chapel Orchestrator (Optional)

Alternative to Elixir for parallel model evaluation:

cobegin {
    var llmResult = evaluateLLM(proposal);
    var slmResult = evaluateSLM(proposal, policy);
    var oracleResult = evaluateOracle(proposal, policy);
}

Axiom.jl Integration

Provable policy checking when Axiom.jl is ready:

@axiom PolicyCompliance begin
    @ensure !contains_forbidden_language(proposal, policy)
    @prove compliant(proposal) ∨ violation_reported(proposal)
end

Continuous Learning

Feedback loop for ongoing improvement:

  • Record decision outcomes

  • Retrain SLM on misclassifications

  • Adapt thresholds based on project-specific data
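The threshold-adaptation step above could be sketched as below. The update rule (nudging `block_threshold` toward the mistake class observed more often) is purely an assumption for illustration; no specific algorithm is pinned down yet.

```rust
// Hypothetical feedback-loop step: adjust the block threshold from
// reviewed decision outcomes. The update rule is an assumed sketch.

struct Outcome {
    blocked: bool,       // what the arbiter decided
    was_violation: bool, // ground truth from later human review
}

// Shift the threshold down when violations slipped through, up when
// compliant proposals were over-blocked; `rate` bounds each step.
fn adapt_block_threshold(current: f64, outcomes: &[Outcome], rate: f64) -> f64 {
    let missed = outcomes.iter().filter(|o| !o.blocked && o.was_violation).count() as f64;
    let false_pos = outcomes.iter().filter(|o| o.blocked && !o.was_violation).count() as f64;
    let n = outcomes.len().max(1) as f64;
    (current - rate * (missed - false_pos) / n).clamp(0.0, 1.0)
}

fn main() {
    let outcomes = [
        Outcome { blocked: false, was_violation: true },
        Outcome { blocked: false, was_violation: true },
    ];
    // Two missed violations lower the threshold: 0.7 - 0.1 * 1.0 = 0.6
    println!("{}", adapt_block_threshold(0.7, &outcomes, 0.1));
}
```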

Version History

| Version | Date | Changes |
|---------|------|---------|
| 0.1.0 | 2025-12 | Initial release - Policy Oracle + CLI |

Contributing

See CONTRIBUTING.md for development setup and guidelines.

Priority areas for contribution:

  1. Training data examples (especially edge cases)

  2. SLM fine-tuning experiments

  3. Elixir/OTP Consensus Arbiter

  4. Integration hooks for other editors/tools