FIBO PROGRESS 2 SHOWCASE > Next-Gen Agentic Orchestration for Tabletop RPGs
This project features a Multi-Agent Hybrid Architecture designed to solve the "AI Hallucination" problem in TTRPGs. By implementing a Two-Path Orchestrator, the system autonomously routes player intent between a Deterministic Python Rules Engine (for mechanical precision) and an LLM-based Creative Arbiter (for improvisational logic). Through a rigorous Multi-Agent Handshake, the engine ensures that all game state mutations are grounded in hard-coded logic while maintaining the narrative flexibility of Large Language Models.
- Two-Path Architecture: Automatically routes player intent to either the Rules Engine (for standard actions) or the LLM Arbiter (for creative improv).
- Stateless Rules Engine: Deterministic Python logic for initiative, dice rolling, range checks, and damage calculation.
- LLM Arbiter: A "Referee" AI that judges creative actions, assigns Difficulty Classes (DC), and applies symbolic status effects (e.g., STUNNED, RESTRAINED).
- Tactical Zone Combat: Grid-less tactical movement using NEAR, MID, and FAR zones with range-based disadvantage.
- Initiative Queue: A dynamic turn-order system where every character (Player & Enemy) rolls initiative at the start of combat.
- Contextual Short-Term Memory: Utilizes an efficient $\mathcal{O}(1)$ collections.deque sliding window to inject recent gameplay events (max 10) directly into the Arbiter and Narrator LLM prompts, ensuring contextual continuity without wasting API tokens on the stateless routing layer.
- Continuous Campaign Record: Background process that permanently logs an irreversible, real-time transcript of player inputs, DM generations, and hidden internal Python math [SYSTEM] checkpoints to a .txt file for future RAG summarization models.
- Dynamic Sequence Actions: Supports complex commands like "I shoot then run away" or "I run in then attack", executing them in the order specified by the user.
- Inventory Engine: Auto-looting, dynamic disposable items, and rigorous LLM-categorized consumable mechanics.
- Narrative State Transitions: Talk your way out of fights with Diplomacy (PACIFIED state), or use tactical math-based fleeing mechanics.
- QoL Features: "Idiot-proof" automated API key wizard and developer debug toggles to expose underlying AI processing.
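The short-term memory feature above can be illustrated with a minimal sketch. The class name ShortTermMemory and its methods are hypothetical and are not taken from the project source; only the use of collections.deque and the 10-event cap come from the feature list.

```python
from collections import deque

MAX_HISTORY_EVENTS = 10  # the feature list mentions "max 10"; the constant name is assumed


class ShortTermMemory:
    """Fixed-size sliding window over recent gameplay events (hypothetical helper)."""

    def __init__(self, maxlen: int = MAX_HISTORY_EVENTS) -> None:
        # A bounded deque discards its oldest entry automatically on append.
        self._events: deque[str] = deque(maxlen=maxlen)

    def record(self, event: str) -> None:
        self._events.append(event)

    def as_prompt_context(self) -> str:
        # Newline-joined history, oldest first, ready to paste into a prompt.
        return "\n".join(self._events)


memory = ShortTermMemory(maxlen=3)
for turn in range(1, 6):
    memory.record(f"turn {turn}: player attacks")
```

With a window of 3, only the last three of the five recorded events survive, so the prompt size stays bounded no matter how long the session runs.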
For the complete mechanical breakdown of the FIBO Lite 5th Edition system, check out the official rulebook:
Prerequisites:
- Python 3.10+
- Gemini API Key (set in .env as GOOGLE_API_KEY)
Launch the Demo:
# Initialize environment (Example using Conda)
conda activate ai_dm_core
# Install dependencies
pip install -r requirements.txt
# Run the dedicated launcher
python demo_day.py
- Clone the Repository
  git clone [repository_url]
  cd AI_Dungeon_Master
- Install Dependencies
  pip install -r requirements.txt
- Get a Free API Key
  - The game requires a Google Gemini API Key to run.
  - Get your free key from Google AI Studio here.
- Set Up the Environment (Pick ONE method)
  - The Automatic Way: Just run the game! (python main.py). The system will detect that you are missing a key, pause the game, and prompt you to paste it in the terminal. It will then automatically create the .env file for you.
  - The Manual Way: Create a new file named .env in the root folder of this project and paste your key inside like this: GEMINI_API_KEY=your_key_here_xyz123
- Configure Settings (Optional)
  - You can tweak the engine's behavior without touching Python code!
  - Open data/settings.json to configure:
    - Memory Queue Size (max_history_events)
    - Default Difficulty Classes (default_dc)
    - AI Creativity (arbiter_temperature, narrator_temperature)
    - Target Fuzzy Matching strictness (fuzzy_match_cutoff)
  - Note: If you delete this file, the engine will safely regenerate it with default values.
- Run the Game
  python main.py
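The "safe regeneration" behavior described for the settings file could look roughly like the sketch below. The function name load_settings and every default value are assumptions for illustration; only the key names come from the list above.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical defaults: only the key names are taken from the README.
DEFAULT_SETTINGS = {
    "max_history_events": 10,
    "default_dc": 12,
    "arbiter_temperature": 0.2,
    "narrator_temperature": 0.8,
    "fuzzy_match_cutoff": 0.6,
}


def load_settings(path: Path) -> dict:
    """Load settings; rewrite the file with defaults if it is missing or corrupt."""
    try:
        loaded = json.loads(path.read_text(encoding="utf-8"))
        if not isinstance(loaded, dict):
            raise ValueError("settings root must be a JSON object")
    except (FileNotFoundError, ValueError):  # JSONDecodeError subclasses ValueError
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(DEFAULT_SETTINGS, indent=2), encoding="utf-8")
        return dict(DEFAULT_SETTINGS)
    # Keys the user removed fall back to their defaults.
    return {**DEFAULT_SETTINGS, **loaded}


with tempfile.TemporaryDirectory() as tmp:
    settings_path = Path(tmp) / "settings.json"
    settings = load_settings(settings_path)  # the file does not exist yet
    regenerated = settings_path.exists()
```

Merging the loaded values over the defaults means a partially edited file still yields a complete configuration.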
AI_Dungeon_Master/
├── docs/  # [THESIS] Final 2026 graduation thesis reports (WIP)
├── archive/  # [HISTORY] Past iterations and research
│   ├── phase2_demo/  # Old FIBO lab scripts and demo JSONs
│   ├── references/  # Academic research papers and references
│   └── old_docs/  # Past presentations and progress reports
├── main.py  # [ENTRY] Full game entry point
├── LITE_5E_RULES.md  # [RULES] The formal "Lite 5e" rulebook for the AI and Player
├── ARCHITECTURE.md  # [DOCS] High-level system design overview
├── requirements.txt  # [DEPS] Project dependencies
├── src/
│   ├── agents/  # [AGENTS] Specialized LLM agents (post-refactor)
│   │   ├── base.py  # BaseLLMProvider: shared API setup & model init
│   │   ├── arbiter_agent.py  # ArbiterAgent: action validation & item categorization
│   │   ├── narrator_agent.py  # NarratorAgent: combat narration & outcome description
│   │   ├── campaign_agent.py  # CampaignAgent: recap & cold-open prologue generation
│   │   └── character_agent.py  # CharacterAgent: Zero-Hallucination character & world lore
│   ├── engine/  # [ORCHESTRATOR] Pre-game flow & main combat loop
│   │   ├── startup.py  # Pre-game flow: character creation, lore, prologue
│   │   └── game_loop.py  # Handles turn queue and execution flow
│   ├── ui/  # [CLI] Presentation layer
│   │   ├── character_sheet.py  # Character Sheet & World Lore TUI renderers
│   │   ├── dashboard.py  # Renders HP, ASCII targets, and zones
│   │   └── menu.py  # Main menu, recap menu
│   ├── router/  # [THE BRAIN] Intent Classification & Action Logic
│   │   ├── intent_router.py  # Two-path router (FIXED vs CREATIVE)
│   │   └── intents.py  # Action execution handlers (MOVE/ATTACK/USE)
│   ├── logic/  # [CALCULATOR] Pure Python Mechanics
│   │   ├── rules_engine.py  # Dice, DC checks, damage math
│   │   ├── combat_manager.py  # Initiative queue
│   │   ├── enemy_ai.py  # Enemy turn logic
│   │   ├── dice_roller.py  # Dice rolling utilities
│   │   └── abilities.py  # Ability definitions
│   ├── models/  # [STATE] Single Source of Truth
│   │   ├── character.py  # Character dataclass (stats, lore, conditions)
│   │   ├── game_state.py  # Global state container
│   │   └── toon_converter.py  # TOON serializer for minimal token usage
│   └── services/  # [IO] External APIs & Persistence
│       ├── llm_service.py  # Backward-compatible façade over src/agents/
│       ├── data_manager.py  # JSON save/load system
│       └── rag_service.py  # RAG/Context preparation
├── data/
│   ├── active/  # Live session data (written during gameplay)
│   │   ├── campaign_active.json  # Current save state
│   │   ├── campaign_log.txt  # Continuous transcript
│   │   └── world_lore.txt  # Active world context for the Narrator
│   ├── config/  # Engine configuration (edited by user)
│   │   ├── settings.json  # Editable engine parameters
│   │   ├── settings_backup.json  # Safe default settings fallback
│   │   └── bestiary.json  # Enemy stat templates
│   └── premade/  # Hand-crafted selection templates
│       ├── characters/  # Premade class JSON files (fighter, mage, rogue...)
│       └── lore/  # Premade world lore .txt files
├── archive/progress_2/  # Deprecated files from pre-Phase-3 (kept for history)
├── evaluation/  # [QA] Evaluation & regression suite
│   └── combat/
│       ├── evaluation_runner.py  # 50-scenario regression runner
│       ├── scenario_suite.json  # Structured test scenarios
│       └── results/  # Auto-generated trace logs and metrics CSV
└── tests/  # [QA] Unit tests
    ├── test_rules.py  # RulesEngine pytest coverage
    ├── test_persistence.py  # DataManager save/load parity
    └── test_*.py  # Other scenario and module tests
The system is built as a "Stateless Symbolic Machine" to ensure 100% mechanical consistency.
- Rule-First Decisioning: If an action matches a standard game mechanic (Attack, Move, Item), the AI is bypassed for the calculation. The Python engine handles the math.
- Symbolic Grounding: When the AI allows a creative action (e.g., "I pull the rug"), it must return a Symbolic Side Effect (e.g., target_condition: PRONE). The Python engine then applies this to the live model.
- TOON Serialization: Uses a custom compact format for game state to reduce LLM token usage by up to 50%, ensuring faster response times and lower costs.
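Symbolic Grounding can be sketched as a small validation step between the Arbiter's verdict and the live model. The Condition members beyond those named in this README, the Character fields, and the function apply_side_effect are illustrative assumptions, not the project's actual code.

```python
from dataclasses import dataclass
from enum import Enum


class Condition(Enum):
    # PRONE, STUNNED, BLINDED and RESTRAINED appear in this README; the exact enum is assumed.
    NORMAL = "NORMAL"
    STUNNED = "STUNNED"
    BLINDED = "BLINDED"
    PRONE = "PRONE"
    RESTRAINED = "RESTRAINED"


@dataclass
class Character:
    name: str
    condition: Condition = Condition.NORMAL


def apply_side_effect(target: Character, verdict: dict) -> bool:
    """Apply the Arbiter's symbolic side effect; ignore symbols the engine does not know."""
    symbol = verdict.get("target_condition")
    if symbol not in Condition.__members__:
        return False  # missing or invented condition: state stays untouched
    target.condition = Condition[symbol]
    return True


goblin = Character("Goblin")
applied = apply_side_effect(goblin, {"target_condition": "PRONE"})
rejected = apply_side_effect(goblin, {"target_condition": "ON_FIRE_FOREVER"})
```

Because only enum members can reach the model, an invented condition from the LLM is dropped instead of mutating game state.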
Data & State Management
- External JSON State Persistence: Game state (Party & Enemies) is loaded and saved dynamically via data/campaign.json using the DataManager, avoiding hardcoded stats.
- Bidirectional TOON Serialization (Token-Oriented Object Notation): A custom serialization pipeline (TOONConverter) drastically reduces API overhead. State is compressed into TOON before sending to the Arbiter. Furthermore, all LLM API outputs (including Intent Routers and Arbiter logic) are explicitly forced via system prompts to return 1-line TOON (key:value|key:value), completely eliminating verbose JSON outputs. This achieves ~50% total token reduction and lower latency.
- Decoupled Health vs. Tactical Conditions: The Character model strictly separates Health Status (e.g., Unscathed, Bloodied, Critical) from Tactical Conditions (Enum: NORMAL, STUNNED, BLINDED), ensuring narrative damage doesn't overwrite mechanical penalties.
- Dual-Stream Memory & Context Collapse (State-Dependent Pruning): Upgraded the sliding window into a two-tier system: combat_memory (maxlen=10) and story_memory (maxlen=5), both fully configurable via settings.json. Implemented an automatic interception hook on VICTORY or FLEE states that feeds the raw combat logs to an LLM Summarizer. The AI compresses the mathematical battle into a single narrative sentence, pushes it to story_memory, and flushes the combat queue, preventing token bloat.
- Static Lore Injection (Static RAG): Implemented a fail-safe retrieval system that loads world-building context from data/world_lore.txt. This allows for instant "World Flavor" shifts without code changes. The engine handles missing lore files with a hardcoded fallback.
- Robust Backend Configuration (settings.json): Abstracted hardcoded variables into an open-source-friendly JSON config file. Exposes core engine parameters and LLM API settings. Built a bulletproof boot sequence in DataManager with Python fallbacks to repair missing config files.
- Persistent Campaign Journal (System Log): Implemented an append-only logging system that permanently records every player input and AI output in real-time to a local file. Includes an auto-reset mechanism triggered during new game boots, creating a parseable script for future save/load summarization.
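The 1-line TOON shape (key:value|key:value) mentioned above can be illustrated with a small round-trip sketch. The functions toon_encode and toon_decode are hypothetical stand-ins, not the real TOONConverter API, and this sketch assumes flat data whose keys and values never contain the "|" separator.

```python
def toon_encode(fields: dict[str, object]) -> str:
    """Serialize a flat dict into one line of key:value pairs joined by '|'."""
    return "|".join(f"{key}:{value}" for key, value in fields.items())


def toon_decode(line: str) -> dict[str, str]:
    """Parse a one-line TOON string back into a dict of string values."""
    result: dict[str, str] = {}
    for pair in line.strip().split("|"):
        if not pair:
            continue  # tolerate a trailing or doubled separator
        key, sep, value = pair.partition(":")  # split on the first ':' only
        if not sep:
            raise ValueError(f"malformed TOON pair: {pair!r}")
        result[key.strip()] = value.strip()
    return result


encoded = toon_encode({"intent": "CREATIVE", "dc": 14, "target_condition": "BLINDED"})
decoded = toon_decode(encoded)
```

Decoded values come back as strings, so a caller would still need to cast numeric fields such as dc before using them in rule math.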
The Intent Router (Two-Path Architecture)
- Path A (Fixed Rules Routing): Standard RPG mechanics (Attack, Move) bypass the LLM for calculation, sending the action directly to the Python RulesEngine to mathematically guarantee zero AI hallucinations.
- Path B (Creative Improv Routing): Complex user prompts are intelligently routed to the LLMService (Arbiter), which judges logical feasibility, assigns a DC (Difficulty Class), and automatically outputs a Symbolic Side Effect (e.g., BLINDED).
- Dynamic Action Sequencing (FIXED_COMBO): The LLM parses multi-step user intents and extracts an action_order array. The game engine dynamically executes the sequence exactly as the user typed it.
- Action Fairness & Multi-Agent Economy Guard: The engine strictly enforces a 5e-style action economy by tracking has_acted and has_moved flags on the Character model, refreshed via reset_turn(). If an Arbiter denies a creative request, the turn is refunded. However, invalid mechanical requests (e.g., double-attacks, out-of-range moves) are deterministically denied and the turn consumed, ensuring the AI cannot be exploited or cheat.
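The economy guard can be sketched as follows. The flag names has_acted and has_moved and the method reset_turn() come from the description above; the functions try_attack and resolve_creative and their return strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Character:
    name: str
    has_acted: bool = False
    has_moved: bool = False

    def reset_turn(self) -> None:
        """Refresh the action economy at the start of a new turn."""
        self.has_acted = False
        self.has_moved = False


def try_attack(actor: Character, in_range: bool) -> str:
    """Path A: invalid mechanical requests are denied and still cost the action."""
    if actor.has_acted:
        return "DENIED"  # double attack
    actor.has_acted = True
    return "OK" if in_range else "DENIED_TURN_CONSUMED"


def resolve_creative(actor: Character, arbiter_allows: bool) -> str:
    """Path B: an Arbiter denial refunds the turn by leaving the flag unset."""
    if actor.has_acted:
        return "DENIED"
    if not arbiter_allows:
        return "REFUNDED"
    actor.has_acted = True
    return "OK"


hero = Character("Hero")
first = resolve_creative(hero, arbiter_allows=False)   # refunded, flag untouched
second = try_attack(hero, in_range=False)              # denied, action consumed
third = try_attack(hero, in_range=True)                # denied, already acted
hero.reset_turn()
fourth = try_attack(hero, in_range=True)               # allowed on the new turn
```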
Combat & Rules Engine
- Stateless Rules Engine (RulesEngine): Pure Python math logic handles all 1d20 dice rolls, AC (Armor Class) checks, Critical Hit doubling logic, and Stat modifiers (PHYS/MENT/SOC). [Newly Expanded] Integrated resolve_spell, which utilizes the MENT stat and a native 1d10 damage system, and resolve_item, which wraps consumable logic into standardized dictionary outputs for perfect evaluation tracing.
- Individual Rolling Initiative Queue: Upgraded from legacy "Side vs. Side" turns to a granular, individual turn order. Every combatant rolls 1d20 + PHYS at the start of combat. The loop acts seamlessly, prompting players, triggering EnemyAI, bypassing DEAD characters, and incrementing rounds.
- 3-Tier Intelligent Target Selection: LLM ID Extraction identifies the exact hidden ID (Tier 1). Fuzzy Spell Matching utilizes difflib.get_close_matches to catch typos (Tier 2). Auto-Fallback defaults to the first active enemy to prevent wasted inputs (Tier 3).
- Enemy AI Tactics (EnemyAI): A lightweight AI that targets the nearest valid opponent and executes a single turn, narrating the sequence automatically on its turn.
- Mechanical Status Effects: Conditions have actual engine consequences. STUNNED characters forfeit their turn, while BLINDED triggers disadvantage mechanics inside the RulesEngine.
Spatial & Movement Mechanics (Zones)
- Tactical Zone Tracking: Grid-less combat utilizing distinct range zones (NEAR, MID, FAR).
- Range Penalties: Using melee weapons outside NEAR range automatically triggers "Out of Range" failures, forcing tactical positioning.
- Movement Enforcement (1-Zone Rule): The engine prevents teleportation, restricting movement to exactly 1 adjacent zone per turn and resolving incorrect distance requests.
- AI Gap-Closing: Melee-equipped enemies are programmed to automatically spend their turn moving one zone closer if they are out of range of the player.
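A minimal sketch of the zone rules above, assuming each enemy's zone expresses its range from the player and that an over-long move request is resolved by clamping it to one step. The function names are hypothetical.

```python
ZONES = ("NEAR", "MID", "FAR")  # ordered from closest to farthest


def move_one_zone(current: str, requested: str) -> str:
    """Move at most one adjacent zone toward the requested zone (no teleporting)."""
    cur, req = ZONES.index(current), ZONES.index(requested)
    if req == cur:
        return current
    step = 1 if req > cur else -1
    return ZONES[cur + step]


def melee_in_range(target_zone: str) -> bool:
    """Melee attacks only connect at NEAR range."""
    return target_zone == "NEAR"


def enemy_turn(enemy_zone: str) -> tuple[str, str]:
    """Melee enemy: attack when in range, otherwise spend the turn closing the gap."""
    if melee_in_range(enemy_zone):
        return "ATTACK", enemy_zone
    return "MOVE", move_one_zone(enemy_zone, "NEAR")


clamped = move_one_zone("NEAR", "FAR")  # a two-zone request only advances to MID
closing = enemy_turn("FAR")
striking = enemy_turn("NEAR")
```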
UI & Narrative Generation
- LLM Generative Narration: The system translates raw, calculated Python logs into immersive, D&D-style second-person narration.
- Immersive CLI Dashboard: A cleanly formatted terminal UI that hides raw enemy HP numbers to prevent metagaming, displaying visual health descriptors and exact player stats instead.
- Developer Debug Mode: A toggle command that exposes the raw LLM JSON outputs, parsed intents, and true Python math logs to prove the system works.
- Demo Day Launcher (demo_day.py): A dedicated, crash-resistant script with ASCII art, an interactive command loop, and an auto-reset function (restart) to cleanly restore state.
- "Idiot-Proof" Onboarding (QoL): An automated boot sequence that intercepts missing GEMINI_API_KEY errors, prompts the user via CLI, and generates the .env file to prevent crashes.
Item & Inventory Mechanics
- Symbolic Disposable Items (Path B): Complex narrative item usage is routed to the Arbiter. If the AI determines the item is destroyed, it returns a consumed_item key, prompting the engine to .pop() it from the inventory.
- Automated Victory Looting (Auto-Loot): Upon triggering the VICTORY state, the engine extracts item strings from defeated enemies, transfers them to the player, empties enemy pockets, saves the game, and prints a formatted UI summary.
- Hardcoded Consumable Logic (Path A - Mechanics): Standard items bypass the LLM entirely to guarantee zero hallucinations via an ITEM_EFFECTS dictionary. HEAL items restore HP, CURE items revert status conditions, and DAMAGE items apply fixed mechanical damage. [Enhanced Security] The USE command performs a secure inventory verification, asserting the item's existence in player.inventory before executing .remove(), neutralizing hallucinated item usage.
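The Path A consumable flow can be sketched as a lookup guarded by an inventory check. The item names, amounts, dictionary layout, and the function use_item are hypothetical; only the ITEM_EFFECTS name and the HEAL, CURE, and DAMAGE categories come from the description above.

```python
# Hypothetical effect table: real item names and amounts are not given in this README.
ITEM_EFFECTS = {
    "healing potion": {"type": "HEAL", "amount": 8},
    "antidote": {"type": "CURE", "amount": 0},
    "fire bomb": {"type": "DAMAGE", "amount": 6},
}


def use_item(player: dict, item_name: str) -> dict:
    """Verify ownership first, then apply the hardcoded effect and consume the item."""
    if item_name not in player["inventory"]:
        return {"success": False, "reason": "NOT_IN_INVENTORY"}
    effect = ITEM_EFFECTS.get(item_name)
    if effect is None:
        return {"success": False, "reason": "UNKNOWN_ITEM"}
    player["inventory"].remove(item_name)
    if effect["type"] == "HEAL":
        player["hp"] = min(player["max_hp"], player["hp"] + effect["amount"])
    elif effect["type"] == "CURE":
        player["condition"] = "NORMAL"
    # DAMAGE is reported back so the caller can apply it to the chosen target.
    return {"success": True, "type": effect["type"], "amount": effect["amount"]}


player = {
    "hp": 5,
    "max_hp": 10,
    "condition": "BLINDED",
    "inventory": ["healing potion", "antidote"],
}
healed = use_item(player, "healing potion")
imagined = use_item(player, "elixir of godhood")  # never owned, so nothing happens
cured = use_item(player, "antidote")
```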
Narrative State Transitions
- Diplomacy & Pacification (Path B): Players can dynamically talk their way out of fights. The LLM Arbiter can assign a PACIFIED status. The EnemyAI recognizes this, forfeits its turn, and the game loop correctly counts them as "defeated" to trigger a VICTORY.
- Tactical Fleeing Mechanics (Path A): The Intent Router parses "flee" commands as FIXED actions. The RulesEngine resolves a contested 1d20 + PHYS check. Enemies receive a Proximity Penalty bonus to their roll based on Zone distance (+5 if Same Zone, +2 if Adjacent). A successful player roll exits combat, while failures rightfully consume the turn.
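The contested flee check can be sketched as below. The +5 and +2 bonuses come from the description above; the function resolve_flee, the single-enemy simplification, the zero bonus beyond an adjacent zone, and ties going to the enemy are assumptions. ScriptedDice is a test double that replaces random rolls so the arithmetic is easy to follow.

```python
import random

# Enemy bonus by zone distance: same zone +5, adjacent +2, otherwise assumed 0.
PROXIMITY_BONUS = {0: 5, 1: 2}


def resolve_flee(player_phys: int, enemy_phys: int, zone_distance: int, rng=random) -> dict:
    """Contested 1d20 + PHYS check; the player escapes only by beating the enemy total."""
    player_total = rng.randint(1, 20) + player_phys
    enemy_total = rng.randint(1, 20) + enemy_phys + PROXIMITY_BONUS.get(zone_distance, 0)
    return {
        "player_total": player_total,
        "enemy_total": enemy_total,
        "escaped": player_total > enemy_total,  # ties favor the enemy (assumption)
    }


class ScriptedDice:
    """Returns pre-set rolls in order instead of random ones."""

    def __init__(self, rolls):
        self._rolls = iter(rolls)

    def randint(self, low, high):
        return next(self._rolls)


# Same rolls (player 15, enemy 12), different distances.
same_zone = resolve_flee(2, 1, zone_distance=0, rng=ScriptedDice([15, 12]))
two_zones = resolve_flee(2, 1, zone_distance=2, rng=ScriptedDice([15, 12]))
```

With identical dice, the player fails at point-blank range (17 against 18) but escapes from two zones away (17 against 13), which shows how distance alone changes the outcome.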
Automated Evaluation & Verification
- 50-Scenario Functional Stress Test: Developed a comprehensive evaluation_runner.py that utilizes unittest.mock to patch internal engine methods and capture a deep-dive trace_log.json.
- Grounded Result Metrics: Following the implementation of Permission Guardrails and expanded resolvers, the system achieved 100% Grounding Precision ($P_{ground}$) and 76% State Synchronization ($S_{sync}$) across the scenario suite, supporting the reliability of the multi-agent "Handshake".
Future Work / Missing Features
- Narrative State Transitions:
  - World Exploration Mode: Disabling Initiative and transitioning to a free-form RAG exploration state.
- Lore Expansion: Populating world_lore.txt with more complex situational data to further ground the AI's creative narration.
- Optional Story Summarizer (Load Feature): Utilizing the newly built Campaign Log to let the LLM generate a "Previously on..." summary when players load a saved game.
This project is built upon the foundational research of AI-assisted narrative generation and LLM-based agent architecture. We would like to acknowledge the following papers for their inspiration on our hybrid system:
- Jørgensen et al. (2024), ChatRPG: A Multi-Agent "ReAct" Game Master
- Sakellaridis (2024), LLM-Based Agent as Dungeon Master
- Song et al. (2024), Tool-Assisted AI DM: Function Calling & External Tools
Note
Full academic PDFs can be found in the <samp>archive/references/</samp> directory.