sagm07/bloop

# Bloop 🔍

### The Ruthless ML Auditor Agent

Bloop is a git-native AI agent built on the gitagent standard.

You give it a broken ML model. It tells you exactly where it fails, why it fails, and what to do about it, in that order.

No sugarcoating. No generic advice. Just answers.

---

## The Problem Bloop Solves

Every ML engineer has been here:

- Model is stuck at 87% accuracy

- You don't know if it's your data, your features, or your model

- You waste days trying random fixes

Bloop runs a structured 3-step audit and gives you a ranked action plan in seconds.

---

## How It Works

Bloop uses 3 skills in sequence:

### 1. Segment Analysis

Finds exactly where the model fails: which classes, slices, or cohorts have the worst performance. Flags any segment with F1 < 0.5 as a critical failure.
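As a rough illustration of the idea (not Bloop's actual implementation, which lives in the `segment-analysis` skill), per-segment F1 can be computed from labeled predictions and compared against the 0.5 threshold. The function name `f1BySegment` and the row shape `{ segment, label, prediction }` are assumptions made for this sketch.

```javascript
// Hypothetical helper: F1 of the positive class (label 1) within each segment.
// Assumed input shape: [{ segment: string, label: 0|1, prediction: 0|1 }, ...]
function f1BySegment(rows, criticalThreshold = 0.5) {
  const counts = new Map();
  for (const { segment, label, prediction } of rows) {
    const c = counts.get(segment) ?? { tp: 0, fp: 0, fn: 0 };
    if (prediction === 1 && label === 1) c.tp += 1;
    else if (prediction === 1 && label === 0) c.fp += 1;
    else if (prediction === 0 && label === 1) c.fn += 1;
    counts.set(segment, c);
  }

  const results = [];
  for (const [segment, { tp, fp, fn }] of counts) {
    const denom = 2 * tp + fp + fn;
    // F1 = 2TP / (2TP + FP + FN). With no positives and no positive
    // predictions it is undefined, so report null instead of a score.
    const f1 = denom === 0 ? null : (2 * tp) / denom;
    results.push({ segment, f1, critical: f1 !== null && f1 < criticalThreshold });
  }

  // Worst segments first; segments without a defined F1 go last.
  return results.sort((a, b) => (a.f1 ?? 1) - (b.f1 ?? 1));
}

// Toy data, invented for the example.
const rows = [
  { segment: "low_light", label: 1, prediction: 0 },
  { segment: "low_light", label: 1, prediction: 1 },
  { segment: "low_light", label: 0, prediction: 1 },
  { segment: "low_light", label: 1, prediction: 0 },
  { segment: "normal", label: 1, prediction: 1 },
  { segment: "normal", label: 1, prediction: 1 },
  { segment: "normal", label: 0, prediction: 0 },
  { segment: "normal", label: 0, prediction: 1 },
];

const report = f1BySegment(rows);
console.log(report);
```

Sorting worst-first puts the segments that most need attention at the top of the report.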

### 2. Root Cause Analysis

Diagnoses why it fails: class imbalance, label noise, feature leakage, or distribution shift. Every diagnosis is backed by evidence.
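To make "backed by evidence" concrete, here is a minimal sketch of the kind of evidence a class-imbalance diagnosis could rest on: the share of each class and the majority-to-minority ratio. The helper name `classBalance` and the ratio cutoff of 3 are assumptions for illustration, not values taken from Bloop.

```javascript
// Hypothetical helper: summarize the label distribution and flag imbalance.
// The default cutoff (majority/minority ratio >= 3) is an assumed heuristic.
function classBalance(labels, ratioCutoff = 3) {
  if (labels.length === 0) throw new Error("labels must not be empty");

  const counts = {};
  for (const y of labels) counts[y] = (counts[y] ?? 0) + 1;

  const total = labels.length;
  const shares = Object.fromEntries(
    Object.entries(counts).map(([cls, n]) => [cls, n / total])
  );

  const values = Object.values(counts);
  const ratio = Math.max(...values) / Math.min(...values);
  return { shares, ratio, imbalanced: ratio >= ratioCutoff };
}

// Toy labels: 80 negatives and 20 positives.
const labels = [...Array(80).fill(0), ...Array(20).fill(1)];
const evidence = classBalance(labels);
console.log(evidence);
```

Label noise, feature leakage, and distribution shift would each need their own checks. This sketch covers only the imbalance case.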

### 3. Fix Generator

Produces a ranked action plan. Each fix is linked to a root cause and comes with an expected accuracy gain and an effort level.
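One way to picture such a plan is as a list of records, each naming its root cause, gain range, and effort, sorted by some score. The sketch below uses a simple assumed heuristic (midpoint of the expected gain divided by effort). Bloop's ranking comes from the agent itself, so this rule need not reproduce its order. The field names and the root-cause links in the sample data are also assumptions.

```javascript
// Hypothetical ranking rule: midpoint of expected gain divided by effort.
// Every fix must point at a known root cause, otherwise it is rejected.
function rankFixes(fixes, rootCauses) {
  return fixes
    .map((f) => {
      if (!rootCauses.includes(f.rootCause)) {
        throw new Error(`Fix "${f.fix}" is not linked to a known root cause`);
      }
      const midGain = (f.gainPct[0] + f.gainPct[1]) / 2;
      return { ...f, score: midGain / f.effort };
    })
    .sort((a, b) => b.score - a.score);
}

const rootCauses = ["class imbalance", "label noise", "distribution shift"];

// Illustrative entries; the root-cause links are assumed for this example.
const fixes = [
  { fix: "Oversample with SMOTE", rootCause: "class imbalance", gainPct: [2, 3], effort: 6 },
  { fix: "Label smoothing", rootCause: "label noise", gainPct: [1, 2], effort: 5 },
  { fix: "Collect balanced data", rootCause: "class imbalance", gainPct: [2, 4], effort: 8 },
];

const plan = rankFixes(fixes, rootCauses);
for (const p of plan) {
  console.log(`${p.fix} (score ${p.score.toFixed(2)}) -> ${p.rootCause}`);
}
```

Rejecting fixes that have no matching root cause keeps the plan tied to the diagnosis rather than drifting into generic advice.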

---

## Demo

Set your Groq API key (PowerShell):

```
$env:GROQ_API_KEY="your-key"
```

Run Bloop:

```
node run.mjs
```

### Sample Output


```
Segment Analysis
- Class 1 (positive): F1 = 0.60 — critical failure
- Low lighting images: F1 = 0.45
- Retinal hemorrhages: F1 = 0.48

Root Causes
1. Class imbalance — 80% negative, 20% positive
2. Label noise — low confidence scores in Class 1
3. Distribution shift — train vs validation gap

Fix Plan
1. Oversample with SMOTE → +2-3% accuracy, effort 6/10
2. Label smoothing → +1-2% accuracy, effort 5/10
3. Collect balanced data → +2-4% accuracy, effort 8/10
```

---

## Agent Structure


```
bloop/
├── agent.yaml              # Agent manifest
├── SOUL.md                 # Ruthless ML auditor personality
├── RULES.md                # Hard constraints: never hallucinate metrics
├── run.mjs                 # Entry point
└── skills/
    ├── segment-analysis/   # Where does it fail?
    ├── root-cause/         # Why does it fail?
    └── fix-generator/      # How do we fix it?
```

---

## Built With

- gitagent standard — git-native agent definition

- gitclaw — agent runtime

- Groq — llama-3.3-70b-versatile

- Node.js

---

## Real World Impact

Bloop was built to solve a real problem. An XGBoost model for diabetic retinopathy detection had plateaued at 87% accuracy, and Bloop diagnosed the causes in seconds: class imbalance in positive cases, label noise in low-lighting images, and distribution shift.

Everyone trains models. Almost no one does deep failure analysis.

Bloop does.
