Development roadmap for the SLM-as-Cerebellum policy enforcement system.
Phase 1 Complete - Policy Oracle implemented with full CLI
| Component | Status | Notes |
|---|---|---|
| Policy Oracle (Rust) | COMPLETE | Deterministic rule checking working |
| CLI Tool | COMPLETE | scan, check, validate, init, completions |
| Nickel Configuration | COMPLETE | Type-safe policy schema |
| Training Data Structure | COMPLETE | compliant/violations/edge_cases |
| SLM Evaluator | PLACEHOLDER | Interface defined, needs llama.cpp |
| Consensus Arbiter | STARTED | GenServer skeleton, Application module, decide/3 logic |
| LLM Integration | NOT STARTED | Depends on deployment context |

- Core data types (Proposal, PolicyVerdict, ViolationType)
- Language tier system (Tier 1/Tier 2/Forbidden)
- Exception rules (Python in salt/, training/)
- Toolchain rules (npm requires deno.json)
- Forbidden pattern detection (hardcoded secrets)
- Directory scanning with intelligent filtering
- CLI with multiple output formats (text, json, compact)
- Shell completion generation (bash, zsh, fish)
- Man page generation
- Nickel policy configuration
- Unit tests for core functionality
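
To make the rule categories above concrete, here is a minimal Rust sketch of a deterministic check. The names `Proposal`, `PolicyVerdict`, and `ViolationType` come from the list; their fields, the `Tier` and `Policy` types, and the `check` function are assumptions made for illustration and are not taken from the Oracle's actual source or its Nickel schema.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Language tiers named in the roadmap.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Tier {
    Tier1,
    Tier2,
    Forbidden,
}

/// Hypothetical violation categories; the Oracle's real enum may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum ViolationType {
    ForbiddenLanguage { extension: String },
    ForbiddenPattern { pattern: String },
}

/// Assumed proposal shape: one file path plus its proposed content.
#[derive(Debug, Clone)]
pub struct Proposal {
    pub path: String,
    pub content: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum PolicyVerdict {
    Compliant,
    Violation(Vec<ViolationType>),
}

/// Illustrative in-memory policy (an assumption, not the Nickel schema).
pub struct Policy {
    /// File extension -> tier.
    pub tiers: HashMap<String, Tier>,
    /// (extension, path prefix) pairs where a forbidden language is tolerated.
    pub exceptions: Vec<(String, String)>,
    /// Literal substrings that must not appear in proposed content.
    pub forbidden_patterns: Vec<String>,
}

/// Deterministic check: the same proposal and policy always yield the same verdict.
pub fn check(proposal: &Proposal, policy: &Policy) -> PolicyVerdict {
    let ext = Path::new(&proposal.path)
        .extension()
        .and_then(|e| e.to_str())
        .unwrap_or("");
    let mut violations = Vec::new();

    // Language tier rule, with directory-scoped exceptions.
    let forbidden = policy.tiers.get(ext) == Some(&Tier::Forbidden);
    let excepted = policy.exceptions.iter().any(|(e, prefix)| {
        e.as_str() == ext && proposal.path.starts_with(prefix.as_str())
    });
    if forbidden && !excepted {
        violations.push(ViolationType::ForbiddenLanguage { extension: ext.to_string() });
    }

    // Forbidden pattern rule: naive substring match, for illustration only.
    for pattern in &policy.forbidden_patterns {
        if proposal.content.contains(pattern.as_str()) {
            violations.push(ViolationType::ForbiddenPattern { pattern: pattern.clone() });
        }
    }

    if violations.is_empty() {
        PolicyVerdict::Compliant
    } else {
        PolicyVerdict::Violation(violations)
    }
}
```

Under these assumptions, a policy that marks `py` as `Forbidden` with an exception for the `training/` prefix would flag `src/tool.py` and accept `training/prep.py`.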

    conative scan <path>            # Scan directory tree
    conative check --file <path>    # Check single file
    conative check --content <str>  # Check inline content
    conative policy                 # Display policy
    conative validate <proposal>    # Validate JSON proposal
    conative init                   # Initialize .conative/
    conative completions <shell>    # Generate completions
    conative man                    # Generate man page

- Define SlmEvaluation struct and interface
- Integrate llama.cpp Rust bindings
- Load GGUF model files
- Implement prompt engineering for policy detection
- Add confidence scoring
- Benchmark latency on target hardware
- Create test suite for spirit violations
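
One possible shape for the evaluator interface and its confidence score is sketched below. Only the name `SlmEvaluation` comes from the list above; the fields, the `SlmEvaluator` trait, the `PlaceholderEvaluator` stub, and the `is_actionable` helper are hypothetical. No llama.cpp calls are shown, because the binding API has not been chosen here.

```rust
/// Hypothetical result of one SLM pass over a proposal.
#[derive(Debug, Clone, PartialEq)]
pub struct SlmEvaluation {
    /// Whether the model suspects a violation of the policy's spirit.
    pub violation_suspected: bool,
    /// Assumed to lie in 0.0..=1.0.
    pub confidence: f32,
    /// Free-text explanation, kept for audit logs.
    pub reasoning: String,
}

impl SlmEvaluation {
    /// True when a suspected violation meets the caller's confidence floor.
    pub fn is_actionable(&self, min_confidence: f32) -> bool {
        self.violation_suspected && self.confidence.clamp(0.0, 1.0) >= min_confidence
    }
}

/// Assumed interface that a llama.cpp-backed evaluator would implement later.
pub trait SlmEvaluator {
    fn evaluate(&self, proposal_text: &str) -> SlmEvaluation;
}

/// Stand-in for use until a model can be loaded; it never flags anything.
pub struct PlaceholderEvaluator;

impl SlmEvaluator for PlaceholderEvaluator {
    fn evaluate(&self, _proposal_text: &str) -> SlmEvaluation {
        SlmEvaluation {
            violation_suspected: false,
            confidence: 0.0,
            reasoning: "placeholder: no model loaded".to_string(),
        }
    }
}
```

Keeping the trait independent of any model runtime would let the arbiter and the test suite be developed against the placeholder before the GGUF loader exists.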

| Model | Parameters | Use Case |
|---|---|---|
| Phi-3-mini | 3.8B | Primary candidate - good balance of speed/quality |
| Gemma-2B | 2B | Faster, lower quality - mobile/edge |
| Phi-3-small | 7B | Higher quality - desktop/server |

- Expand training data (target: 1000+ examples)
- Balance dataset (violations vs compliant)
- Implement QLoRA fine-tuning pipeline
- Create validation holdout set
- Implement weighted loss function:
  - violation_detected: 2.0
  - violation_missed: 3.0
  - false_positive: 0.5
- Evaluate precision/recall tradeoffs
- Export to GGUF format
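
The weighted loss in the list above could be read as a per-outcome multiplier on binary cross-entropy; the sketch below shows the arithmetic under that reading. The three weights come from the list. The true-negative weight of 1.0, the 0.5 decision threshold, and all type and function names are assumptions. A real fine-tuning pipeline would express this in the training framework rather than in plain Rust.

```rust
/// Outcome weights; the first three values come from the roadmap.
pub struct LossWeights {
    pub violation_detected: f64,
    pub violation_missed: f64,
    pub false_positive: f64,
    /// Not specified in the roadmap; 1.0 is an assumption.
    pub true_negative: f64,
}

pub const ROADMAP_WEIGHTS: LossWeights = LossWeights {
    violation_detected: 2.0,
    violation_missed: 3.0,
    false_positive: 0.5,
    true_negative: 1.0,
};

/// Binary cross-entropy for one example; `p` is the predicted violation probability.
fn bce(p: f64, is_violation: bool) -> f64 {
    let p = p.clamp(1e-7, 1.0 - 1e-7); // avoid ln(0)
    if is_violation { -p.ln() } else { -(1.0 - p).ln() }
}

/// Pick the weight from the (label, thresholded prediction) outcome.
fn outcome_weight(p: f64, is_violation: bool, w: &LossWeights) -> f64 {
    let predicted_violation = p >= 0.5; // assumed threshold
    match (is_violation, predicted_violation) {
        (true, true) => w.violation_detected,
        (true, false) => w.violation_missed,
        (false, true) => w.false_positive,
        (false, false) => w.true_negative,
    }
}

/// Mean weighted loss over `(predicted_probability, is_violation)` pairs.
pub fn weighted_loss(examples: &[(f64, bool)], w: &LossWeights) -> f64 {
    if examples.is_empty() {
        return 0.0;
    }
    let total: f64 = examples
        .iter()
        .map(|&(p, y)| outcome_weight(p, y, w) * bce(p, y))
        .sum();
    total / examples.len() as f64
}
```

With these weights, a missed violation predicted at probability 0.1 costs six times as much as a false positive predicted at 0.9, because both have the same cross-entropy but weights of 3.0 and 0.5 respectively.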

- Set up Elixir/OTP project structure
- Implement GenServer for arbiter
- Define consensus protocol messages
- Implement asymmetric weighting (SLM = 1.5x)
- Add escalation logic
- Implement audit logging
- Create supervision tree
- Add Rustler NIFs for Oracle/SLM calls
- Write property-based tests
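
The asymmetric weighting and escalation items above can be illustrated with a small decision function. It is written in Rust only to show the arithmetic; the arbiter itself is planned in Elixir. The 1.5x SLM weight comes from the list. The three-argument signature, the escalation margin of 0.1, and the rule that an Oracle violation always rejects are assumptions, and this is not the existing `decide/3` logic.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Decision {
    Allow,
    Reject,
    Escalate,
}

/// The SLM's objection counts 1.5x relative to the LLM's confidence (from the roadmap).
pub const SLM_WEIGHT: f64 = 1.5;
/// Assumed band within which the two scores are treated as too close to call.
pub const ESCALATION_MARGIN: f64 = 0.1;

/// Hypothetical arbiter rule.
/// - `oracle_violation`: the deterministic Oracle found a hard violation.
/// - `llm_confidence`: LLM's confidence that the proposal should proceed (0.0..=1.0).
/// - `slm_violation_confidence`: SLM's confidence that the proposal violates policy.
pub fn decide(
    oracle_violation: bool,
    llm_confidence: f64,
    slm_violation_confidence: f64,
) -> Decision {
    // Assumption: deterministic rule violations are never outvoted.
    if oracle_violation {
        return Decision::Reject;
    }
    let go = llm_confidence;
    let no_go = SLM_WEIGHT * slm_violation_confidence;
    if (go - no_go).abs() < ESCALATION_MARGIN {
        Decision::Escalate
    } else if go > no_go {
        Decision::Allow
    } else {
        Decision::Reject
    }
}
```

Under these assumptions, an LLM confidence of 0.9 is outweighed by an SLM violation confidence of 0.7, since 1.5 × 0.7 = 1.05, while a near tie is handed to the escalation path instead of being decided automatically.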

- Define hook interface
- Implement pre-commit hook
- Create real-time evaluation mode
- Add Claude Code config schema
- Write integration tests

- Port to mobile-compatible format
- Optimize for Oppo Reno 13 / edge devices
- Integrate with reservoir computing layer
- Add feedback loop for continuous learning

| Question | Approach |
|---|---|
| Optimal SLM size | Benchmark Phi-3-mini vs Gemma-2B vs Phi-3-small |
| Training data volume | Learning curves with 100/500/1000 examples |
| Asymmetry calibration | Vary weight 1.0-2.0, measure precision/recall |
| Spirit detection accuracy | Human evaluation of SLM judgments |
| Latency budget | Profile end-to-end decision time |
| Adversarial robustness | Red team: can LLM fool SLM? |
| Cross-project generalization | Test SLM trained on one policy against others |

- Violation Detection Accuracy
  - 500 proposals (250 violations, 250 compliant)
  - Baseline: GPT-4 with policy in context
  - Test: Fine-tuned Phi-3-mini
- Latency Impact
  - Measure with/without SLM gating
  - Target: <500ms for interactive use
- Weight Optimization
  - Pareto frontier: false positives vs missed violations
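
For the weight optimization experiment, the Pareto frontier can be computed directly from the measured error rates. The sketch below assumes each candidate weight setting has been reduced to a `(false_positive_rate, missed_violation_rate)` pair, lower being better on both axes. The function name and the pair representation are hypothetical.

```rust
/// Return the points not dominated by any other point.
/// A point is dominated when another point is no worse on both rates
/// and strictly better on at least one. Input order is preserved.
pub fn pareto_frontier(points: &[(f64, f64)]) -> Vec<(f64, f64)> {
    points
        .iter()
        .copied()
        .filter(|&(fp, miss)| {
            !points.iter().any(|&(other_fp, other_miss)| {
                other_fp <= fp
                    && other_miss <= miss
                    && (other_fp < fp || other_miss < miss)
            })
        })
        .collect()
}
```

Settings on the frontier are the only ones worth comparing when choosing the asymmetry weight; any setting off the frontier is beaten on one error rate without being better on the other.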

Alternative to Elixir for parallel model evaluation:

    cobegin {
        var llmResult = evaluateLLM(proposal);
        var slmResult = evaluateSLM(proposal, policy);
        var oracleResult = evaluateOracle(proposal, policy);
    }

Provable policy checking when Axiom.jl is ready:

    @axiom PolicyCompliance begin
        @ensure !contains_forbidden_language(proposal, policy)
        @prove compliant(proposal) ∨ violation_reported(proposal)
    end