experiment(gra-217): tighten graduation dedup threshold 0.85→0.78 #220
Gradata wants to merge 1 commit
Lower _GRADUATION_DEDUP_THRESHOLD to catch more near-duplicate corrections at graduation time, targeting a higher noise_correction_filter_rate (GRA-217).

Window: 7d.
Measurement: filter rate via graduation diagnostics.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CodeRabbit review: No actionable comments were generated in the recent review.
Run configuration: Organization UI; review profile: ASSERTIVE; plan: Pro. Files selected for processing: 1.
Context from checks was skipped due to a timeout of 90000 ms.
Walkthrough: The near-duplicate detection threshold for graduation gating is tightened.
Estimated code review effort: 1 (Trivial), ~2 minutes.
Pre-merge checks: 5 passed.
Warning: OpenGrep (1.21.0) reported a fatal error (exit code 2).
Summary
Lab experiment GRA-217: noise_correction_filter_rate

Hypothesis: Tightening _GRADUATION_DEDUP_THRESHOLD from 0.85 to 0.78 catches more near-duplicate corrections at graduation time without losing distinct signals, which raises the noise correction filter rate.

Change: src/gradata/enhancements/self_improvement/_confidence.py, line 139
Before: _GRADUATION_DEDUP_THRESHOLD = 0.85
After: _GRADUATION_DEDUP_THRESHOLD = 0.78

Mechanism: In _graduation.py, this threshold gates whether a candidate RULE lesson is blocked as a near-duplicate of an existing rule via semantic vector similarity. Lowering it from 0.85 to 0.78 means lessons with ≥0.78 cosine similarity (previously ≥0.85) are now filtered: a wider net that should catch more redundant near-duplicates while still allowing genuinely distinct signals through.

Metric direction: higher (more noise filtered = better)
Measurement window: 7 days
Lead: analyst (GRA-217)
Closes: GRA-217
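
The gating described under Mechanism can be illustrated with a minimal sketch. This is not the code in _graduation.py: the helper names (cosine_similarity, is_near_duplicate, filter_rate), the toy 2-D vectors, and the definition of the filter rate as "blocked candidates / all candidates" are assumptions made for illustration only. It shows how a candidate at 0.80 cosine similarity passes under the old threshold but is blocked under the new one.

```python
import math
from typing import Sequence

# Value proposed in this PR; the previous value was 0.85.
_GRADUATION_DEDUP_THRESHOLD = 0.78


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_near_duplicate(
    candidate: Sequence[float],
    existing_rules: Sequence[Sequence[float]],
    threshold: float = _GRADUATION_DEDUP_THRESHOLD,
) -> bool:
    """Hypothetical gate: block the candidate if any existing rule
    is at least `threshold` similar to it."""
    return any(
        cosine_similarity(candidate, rule) >= threshold for rule in existing_rules
    )


def filter_rate(
    candidates: Sequence[Sequence[float]],
    existing_rules: Sequence[Sequence[float]],
    threshold: float,
) -> float:
    """Fraction of candidates blocked as near-duplicates.
    An assumed stand-in for noise_correction_filter_rate, not its real definition."""
    if not candidates:
        return 0.0
    blocked = sum(is_near_duplicate(c, existing_rules, threshold) for c in candidates)
    return blocked / len(candidates)


# Toy embeddings (assumed): one existing rule and three candidate lessons.
existing = [[1.0, 0.0]]
candidates = [
    [0.9, 0.1],  # similarity ~0.99: blocked at both thresholds
    [0.8, 0.6],  # similarity ~0.80: blocked only at 0.78
    [0.6, 0.8],  # similarity ~0.60: passes at both thresholds
]

old_rate = filter_rate(candidates, existing, 0.85)
new_rate = filter_rate(candidates, existing, 0.78)
print(f"filter rate at 0.85: {old_rate:.2f}, at 0.78: {new_rate:.2f}")
```

In this toy data only the middle candidate changes outcome, which is the band (0.78 to 0.85) the experiment targets. Whether lessons in that band are truly redundant is what the 7-day measurement window is meant to show.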