[Record Submission] SP8192 + QK5 + Legal TTT — val_bpb 1.0842 | 15.99MB #1476
aryan-cs wants to merge 2 commits into openai:main
Conversation
…#1476/openai#1477 confirm SP8192+TTT as the new comp meta — our SP8192 build is ready, deploy next; LEGAL_TTT brittleness pattern now confirmed at n=2.
… FAIL pattern across 5 stacked tests with LEGAL_TTT — ALL FAIL:

- bare champion (gated + LEGAL_TTT only) = 1.3711 ★
- + 3-way (normuon + asym_skip) = 1.40695 (+0.036)
- + L02_anti-curriculum (BEC_REVERSE) = 1.4112 (+0.041)
- + L05_norm_pct_dropout = 1.41515 (+0.044)
- + L09_NGRAM_BACKOFF (cross-layer logit residual) = 1.4567 (+0.086) ★ WORST

CONFIRMED: the LEGAL_TTT champion is a UNIQUE brittle local minimum. Even cross-layer additions (an n-gram bias residual that touches LOGITS, not WEIGHTS) catastrophically destroy it. The hypothesis that "cross-layer = orthogonal" is FALSE. LEGAL_TTT's eval-time SGD appears to overfit to the EXACT forward-pass shape of the bare champion; ANY upstream logit perturbation breaks the per-batch context/target convergence.

Implication: the path to bear-fruit (1.3666) requires one of:

1. SP8192 + bare LEGAL_TTT (the comp meta from PR openai#1476) — needs a vocab swap
2. a new mechanism that REPLACES LEGAL_TTT entirely
3. hyperparameter tuning of LEGAL_TTT itself (longer steps, a different LR)

Pod I's STACK_BACKOFF_GATED (no LEGAL_TTT) is the only remaining test that can validate cross-layer compositionality WITHOUT the brittle ingredient.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
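To make the "touches LOGITS, not WEIGHTS" distinction concrete, here is a toy sketch of the kind of cross-layer n-gram (bigram) logit residual that L09_NGRAM_BACKOFF describes. The function name, add-one smoothing, and blending weight `alpha` are all illustrative assumptions, not the submission's actual code — the point is only that such a bias perturbs the logits downstream of every layer while leaving model weights untouched:

```python
import torch

def ngram_logit_bias(tokens, vocab_size, alpha=0.1):
    """Toy bigram logit residual (illustrative, not the submission's code).

    Counts bigrams observed so far in the context and returns a log-frequency
    bias over next-token logits, conditioned on the last token. Note that it
    modifies only the logits; no model weight is changed.
    """
    counts = torch.ones(vocab_size, vocab_size)  # add-one smoothing
    for a, b in zip(tokens[:-1], tokens[1:]):
        counts[a, b] += 1
    last = tokens[-1]
    # log of the smoothed bigram distribution P(next | last), scaled by alpha
    return alpha * torch.log(counts[last] / counts[last].sum())
```

Even a purely additive bias like this changes the forward-pass shape that the eval-time SGD adapts against, which is the failure mode the ablation above isolates.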
Community Review — Compliance: NEEDS AUTHOR ACTION

What I found: the CPU smoke test on CT2038 (proteus-engine, 128 GB RAM, Triton 3.6.0, flash_attn stub, cutlass_evt_fusion stub) failed at the import step with:

A few of the common patterns I've seen for this class of error in the 2026-04-11 sweep:
Recommendation: Could you run

Once the parse/import issue is fixed, I'll re-run the compliance audit through the normal pipeline. No other flags identified yet, because the audit halts at the import step.

Reviewed by @MatoTeziTanka — The Agora.

CPU smoke test (CT2038 proteus-engine, 2026-04-11): IMPORT_FAIL — SyntaxError: f-string: expecting '}' (line 260). Classification via
Thanks for the catch. Fixed in the latest push to the PR branch. What changed in
Verification:
GitHub now shows PR

Please re-run the compliance audit when convenient.
Re-audited at head SHA

Gauntlet result (CT2038, Python 3.10, torch 2.10.0+cpu): the original import error (PEP 701 f-string) is fixed — the inner payload decompresses and imports cleanly under Python 3.10. The flash_attn guard also works.

Compliance audit on the decompressed inner source (476 lines): the TTT implementation at lines 345-389 (
This is the canonical legal TTT shape from PR #1413 (dexhunter): score each chunk under no_grad, adapt on that same chunk, let the next chunk see the updated weights, and score the last chunk without ever adapting on it. No n-gram cache, no SLOT, no pre-quant TTT. Record-track checks from submission.json:
Verdict: LOOKS CLEAN on compliance. TTT is legal score-first per #1413.

Recommendation to @cocohearts @valerio-oai @0hq @yuzhougu-oai @notapplica: compliance is clean. One note: this is a single-seed submission —

Thanks for the thorough fix documentation @aryan-cs.

Re-audit by @MatoTeziTanka. CPU gauntlet on CT2038: IMPORT_OK, MODEL_OK, FORWARD_OK. Inner lzma payload decompressed (476 lines) and compliance-audited: legal score-first TTT (lines 360-368), no n-gram, no SLOT. submission.json verified: 15.97MB artifact, 1 seed.
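For readers unfamiliar with the score-first shape the audit checks for, here is a minimal sketch of that eval loop (score under no_grad with the current weights, then adapt on the same chunk, with the final chunk scored but never adapted on). Function and hyperparameter names are illustrative, not the submission's actual code:

```python
import torch
import torch.nn.functional as F

def score_first_ttt(model, chunks, lr=1e-3):
    """Sketch of a legal score-first TTT eval loop (illustrative only).

    For each (x, y) chunk: record the loss with the CURRENT weights before
    any adaptation on that chunk, then take one SGD step on the same chunk
    so the NEXT chunk sees updated weights. The last chunk is scored but
    never adapted on, so no score ever reflects training on its own chunk.
    """
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    total_loss, total_tokens = 0.0, 0
    for i, (x, y) in enumerate(chunks):
        # 1) score the chunk under no_grad, before adapting on it
        with torch.no_grad():
            logits = model(x)
            loss = F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1))
        total_loss += loss.item() * y.numel()
        total_tokens += y.numel()
        # 2) adapt on the same chunk -- except the last, which is never adapted on
        if i < len(chunks) - 1:
            opt.zero_grad()
            logits = model(x)
            F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1)).backward()
            opt.step()
    return total_loss / total_tokens
```

The legality hinges on the ordering inside the loop: the no_grad scoring pass always runs before the gradient step for that chunk.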
Record: SP8192 + QK-Gain 5 + Legal Score-First TTT
val_bpb: 1.0842 | 15.99 MB
This submission validates a strong configuration using the SP8192 tokenizer, QK_GAIN_INIT=5, and legal score-first TTT. The PR contains the tracked submission package:
records/track_10min_16mb/2026-04-08_PR1413_SP8192_QK5_LegalTTT_1.0842

Results
- val_bpb: 1.09054898
- val_bpb (legal_ttt_exact): 1.08418021
- model artifact: 15,968,912 bytes
- submission package: 15,987,393 bytes

Main Changes
This submission uses the following configuration:

- SP8192 tokenizer
- QK_GAIN_INIT=5
- legal score-first TTT

The resulting run achieves a strong exact score while staying within the tracked submission format.
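The PR does not show how QK_GAIN_INIT is wired into the model, so the following is a hedged sketch under one plausible reading: a learnable scalar gain on the pre-softmax q·k attention logits, initialized to the configured value (5). The class name and single-head layout are illustrative, not the submission's actual architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GainedQKAttention(nn.Module):
    """Illustrative single-head causal attention with a learnable QK gain.

    Assumption (not confirmed by the PR): QK_GAIN_INIT initializes a
    learnable scalar that multiplies the scaled q.k attention logits.
    """
    def __init__(self, dim, qk_gain_init=5.0):
        super().__init__()
        self.q = nn.Linear(dim, dim, bias=False)
        self.k = nn.Linear(dim, dim, bias=False)
        self.v = nn.Linear(dim, dim, bias=False)
        self.gain = nn.Parameter(torch.tensor(qk_gain_init))
        self.scale = dim ** -0.5

    def forward(self, x):
        q, k, v = self.q(x), self.k(x), self.v(x)
        att = (q @ k.transpose(-2, -1)) * self.scale * self.gain
        # causal mask: each position attends only to itself and the past
        T = x.size(-2)
        mask = torch.tril(torch.ones(T, T, dtype=torch.bool, device=x.device))
        att = att.masked_fill(~mask, float('-inf'))
        return F.softmax(att, dim=-1) @ v
```

A gain initialized above 1 sharpens the initial attention distribution; whether that is the intended effect here is an inference, not something the PR states.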
Architecture

- QK_GAIN_INIT: 5
- Track: track_10min_16mb
- Artifact: final_model.int6.ptz

Submission Package
Included files:

- README.md
- submission.json
- train_gpt.py
- train_and_exact_log.txt
- final_model.int6.ptz

Notes
This PR submits the validated run above as a clean tracked record package.
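The artifact name `final_model.int6.ptz` suggests a 6-bit quantized, compressed checkpoint, though the actual packing format is not documented in this PR. As a hedged illustration of the idea only, here is a toy symmetric 6-bit round trip (function names and the per-tensor-scale scheme are assumptions, not the submission's format):

```python
import torch

def quantize_int6(w):
    """Toy symmetric 6-bit quantizer (illustrative; not the .int6.ptz format).

    Stores weights as integers in [-32, 31] plus one per-tensor scale.
    """
    scale = w.abs().max() / 31.0
    q = torch.clamp(torch.round(w / scale), -32, 31).to(torch.int8)
    return q, scale

def dequantize_int6(q, scale):
    """Reconstruct approximate float weights from the 6-bit codes."""
    return q.float() * scale
```

With round-to-nearest, the per-weight reconstruction error of this scheme is bounded by half the scale, which is what makes low-bit storage viable for a 16 MB size budget.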