fix: PremortemTask — minimal schema, IDs assigned in code#181
Open
82deutschmark wants to merge 1 commit into PlanExeOrg:main
Member
One of the fragile issues I struggled with: LLMs often do weird things with the assumption_id.

    user_prompt_list = [
        user_prompt,
        "Generate 3 new assumptions that are thematically different from the previous ones. Start assumption_id at A4.",
        "Generate 3 new assumptions that are thematically different from the previous ones and covers different archetypes. Start assumption_id at A7.",
    ]

Replacing it with

The assumptions may reference prior assumptions (A5 may reference A4, for example), so it's important that they are enumerated correctly.
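The numbering concern above can be moved out of the prompt entirely. A minimal sketch, assuming each round returns a plain list of assumption texts without IDs; `number_assumptions` and the `statement` key are hypothetical names for illustration, not code from premortem.py:

```python
def number_assumptions(rounds: list[list[str]]) -> list[dict[str, str]]:
    """Assign A1, A2, ... across all rounds, in generation order."""
    numbered: list[dict[str, str]] = []
    for texts in rounds:
        for text in texts:
            # The next ID depends only on how many items are already numbered,
            # so round 2 starts at A4 and round 3 at A7 without asking the LLM.
            numbered.append(
                {"assumption_id": f"A{len(numbered) + 1}", "statement": text}
            )
    return numbered
```

Because the IDs are derived from list position, a later assumption can only ever reference an ID that already exists, which keeps references such as A5 pointing at A4 consistent.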
…ibility

PremortemAnalysis required the LLM to emit a deeply nested schema in one call: 3 AssumptionItem + 3 FailureModeItem, 11+ required fields each, with linked cross-reference IDs. Local small models (Qwen 3.5-35B, GLM 4.7 Flash) echoed the schema structure back instead of producing values, exhausting all retries.

Fix: decompose into one independent LLM call per archetype using ArchetypeNarrative (6 plain text fields, no IDs). Code assembles AssumptionItem + FailureModeItem from the narrative and assigns all IDs and cross-references.

Changes (3 hunks):
1. Add ArchetypeNarrative schema (after PremortemAnalysis). Includes an 'archetype' field so the LLM can adapt the category name to the specific project rather than being locked to hardcoded labels.
2. Rewrite execute() to run num_rounds × 3 archetypes (3×3=9 calls in ALL_DETAILS, 1×3=3 in FAST mode), restoring the original 9+9 assumption/failure-mode volume. Archetype suggestions guide the LLM; the returned narrative.archetype is used in the output (LLM may rename/adapt it per project). Failed archetype calls are skipped gracefully; first call failure raises.
3. Fix _calculate_risk_level_verbose: return 'Not Scored' when likelihood or impact is None (was rendering 'Likelihood None/5, Impact None/5').

Validated: PremortemTask PASSED on GLM 4.7 Flash (HVT_minimal run).
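The assembly step and the risk-level fix described in this commit message could look roughly like the sketch below. It uses a stdlib dataclass so it runs on its own. The narrative field names (`assumption`, `failure_story`) and the helper names `assemble` and `risk_level_verbose` are assumptions made for illustration; only `archetype`, `assumption_id`, `failure_mode_index`, `root_cause_assumption_id`, and the 'Not Scored' behaviour come from the PR text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArchetypeNarrative:
    # Hypothetical subset of the plain-text fields; no IDs are requested from the LLM.
    archetype: str
    assumption: str
    failure_story: str


def assemble(narratives: list[ArchetypeNarrative]) -> tuple[list[dict], list[dict]]:
    """Build assumption and failure-mode records, assigning every ID in code."""
    assumptions, failure_modes = [], []
    for index, narrative in enumerate(narratives, start=1):
        assumption_id = f"A{index}"
        assumptions.append(
            {"assumption_id": assumption_id, "statement": narrative.assumption}
        )
        failure_modes.append(
            {
                "failure_mode_index": index,
                # The cross-reference is set here, so it cannot point at a missing ID.
                "root_cause_assumption_id": assumption_id,
                "archetype": narrative.archetype,
                "story": narrative.failure_story,
            }
        )
    return assumptions, failure_modes


def risk_level_verbose(likelihood: Optional[int], impact: Optional[int]) -> str:
    """Return 'Not Scored' instead of rendering None values into the label."""
    if likelihood is None or impact is None:
        return "Not Scored"
    return f"Likelihood {likelihood}/5, Impact {impact}/5"
```

Pairing each failure mode with the assumption built from the same narrative is one simple way to keep the two lists aligned; the actual change in premortem.py may link them differently.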
Local models (Qwen, GLM) failed PremortemTask by echoing schema instructions instead of JSON. Root cause: schema required counting, ID cross-linking, and 11-field completeness in one call.
Fix: replace complex schema with 5-field ArchetypeNarrative (narrative only). Program assigns assumption_id, failure_mode_index, root_cause_assumption_id. Per-archetype decomposition with up to 5 retries; failed archetypes skipped gracefully.
Only premortem.py changed. Confirmed passing on GLM 4.7 Flash (local).
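The per-archetype retry and skip behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the code in premortem.py: `call_with_retries` and `run_archetypes` are hypothetical names, `generate` stands in for the LLM call, and "up to 5 retries" is read here as at most 5 attempts per archetype.

```python
from typing import Callable, Optional


def call_with_retries(call: Callable[[], str], max_attempts: int = 5) -> Optional[str]:
    """Run call() until it succeeds or max_attempts is reached; None on failure."""
    for _ in range(max_attempts):
        try:
            return call()
        except Exception:
            # Broad catch is a simplification for this sketch; real code would
            # likely narrow it to parsing and validation errors.
            continue
    return None


def run_archetypes(archetypes: list[str], generate: Callable[[str], str]) -> list[str]:
    """One independent call per archetype; later failures are skipped."""
    results = []
    for position, archetype in enumerate(archetypes):
        narrative = call_with_retries(lambda a=archetype: generate(a))
        if narrative is None:
            if position == 0:
                # A failure on the very first call is treated as fatal.
                raise RuntimeError(f"first archetype call failed: {archetype}")
            continue  # skip this archetype and keep the rest of the run
        results.append(narrative)
    return results
```

Keeping the calls independent means one archetype that never parses costs only its own retries and does not discard narratives that already succeeded.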