Ask natural language questions about D&D 5th Edition rules. Semantic search retrieves the most relevant passages from the Dungeon Master's Guide, Player's Handbook, and Monster Manual, then Claude AI synthesizes a cited answer grounded exclusively in the source text.
Question
│
▼
sentence-transformers (all-MiniLM-L6-v2)
│ 384-dim normalized embedding
▼
sqlite-vec cosine similarity search
│ top-5 chunks across DMG + PHB + MM
▼
Anthropic Claude Haiku 3.5
│ answer grounded in retrieved context
▼
Streaming response (SSE) + source citations
This is the standard RAG (Retrieval-Augmented Generation) pattern: retrieve relevant context first, then generate an answer conditioned on that context. The model is instructed to answer only from the provided passages, which sharply curbs hallucination and makes every answer verifiable against a cited page.
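The retrieve-then-generate handoff boils down to a prompt builder. A minimal sketch (`build_prompt` is a hypothetical helper name; the real wording lives in `services/llm.py`):

```python
def build_prompt(question, chunks):
    """Assemble a grounded RAG prompt: retrieved passages first,
    then the question, with an instruction to answer only from them."""
    passages = "\n\n".join(
        f"[{c['source']} p.{c['page']}] {c['text']}" for c in chunks
    )
    return (
        "Answer using ONLY the passages below and cite them as [SOURCE p.N]. "
        "If the passages don't cover the question, say so.\n\n"
        f"Passages:\n{passages}\n\nQuestion: {question}"
    )
```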
| Component | Technology | Why |
|---|---|---|
| Web framework | Flask Blueprint | Integrated with existing portfolio hub |
| Embeddings | all-MiniLM-L6-v2 (sentence-transformers) | Local, no API key, 80MB, 384-dim, strong semantic quality |
| Vector store | sqlite-vec | Single .db file, zero infrastructure, fits in 2GB RAM |
| LLM | Anthropic Claude Haiku 3.5 | Fast, cheap (~$0.001/query), 200K context window |
| PDF processing | pypdf | Lightweight, page-aware extraction |
| Streaming | Flask SSE + Anthropic streaming SDK | Token-by-token response for responsive UX |
| Status dashboard | Anthropic Status API proxy | Live Index, Claude API, and synthesized RAG Status badges |
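The streaming row above maps to a small generator: the Flask route wraps an iterator of tokens (e.g. the Anthropic SDK's `stream.text_stream`) in Server-Sent Events frames. A sketch, with a hypothetical `sse_stream` name:

```python
import json

def sse_stream(tokens):
    """Wrap any iterator of text tokens (e.g. Anthropic's stream.text_stream)
    in Server-Sent Events frames, closing with a [DONE] sentinel."""
    for tok in tokens:
        yield f"data: {json.dumps({'text': tok})}\n\n"
    yield "data: [DONE]\n\n"

# In the Flask route, roughly:
#   return Response(sse_stream(stream.text_stream), mimetype="text/event-stream")
```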
| Source | Label | Content |
|---|---|---|
| Dungeon Master's Guide (5e) | DMG | World-building, encounter design, magic items, monster creation |
| Player's Handbook (5e) | PHB | Character classes, spells, combat rules, equipment |
| Monster Manual (5e) | MM | Monster stat blocks, lore, creature abilities, encounter guidance |
Combined: ~8,400 chunks across all three books.
- Per-page chunking — each PDF page is treated as a unit, preserving natural section boundaries
- Chunk size — 1,500 characters with 300-character overlap for long pages
- Metadata — each chunk stores its source label (DMG/PHB/MM) and page number for citations
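The chunking rules above can be sketched as a small helper (hypothetical name; the real version lives in `services/pdf_processor.py`):

```python
def chunk_page(text, size=1500, overlap=300):
    """Split one PDF page into overlapping character chunks.
    Pages at or under the size limit pass through as a single chunk."""
    if len(text) <= size:
        return [text]
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
        start += size - overlap  # step back 300 chars so chunks overlap
    return chunks
```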
Why local embeddings instead of an API?
all-MiniLM-L6-v2 loads once into ~80MB of RAM and runs at ~50ms/query on CPU. This eliminates a second API dependency, costs nothing per query, and adds no round-trip latency to the embedding step.
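The load-once behavior is a lazy wrapper along these lines (a sketch of the pattern, not the exact `services/embeddings.py` code):

```python
class Embedder:
    """Load the ~80MB model once, on first use, so app startup stays fast."""
    _model = None

    def encode(self, texts):
        if self._model is None:
            # Deferred import keeps sentence-transformers off the startup path
            from sentence_transformers import SentenceTransformer
            self._model = SentenceTransformer("all-MiniLM-L6-v2")
        # normalize_embeddings=True makes dot product equal cosine similarity
        return self._model.encode(texts, normalize_embeddings=True)
```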
Why sqlite-vec instead of ChromaDB or FAISS? The knowledge base is ~8,400 vectors — small enough that a single SQLite file is the right tool. No separate process, no serialization complexity, survives restarts, trivial to back up.
Why top-5 chunks instead of more? At 1,500 chars/chunk, top-5 puts ~7,500 characters of context into the prompt — well within Haiku's 200K token window while keeping cost and latency low. More chunks can be passed without issue if needed.
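The budget arithmetic, using a rough 4-characters-per-token heuristic for English text:

```python
chunks, chunk_chars = 5, 1500
context_chars = chunks * chunk_chars   # 7,500 characters of retrieved context
approx_tokens = context_chars // 4     # ~1,875 tokens, a tiny fraction of 200K
```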
Why Claude Haiku and not Sonnet? For a well-structured RAG prompt where the answer is in the context, Haiku performs comparably to Sonnet at 10x lower cost. Sonnet is worth the upgrade for reasoning-heavy tasks; this is primarily a retrieval + synthesis task.
Live status dashboard
Three metric badges show the current state at a glance: Index (chunk count), Claude (live API status fetched via a backend proxy to status.anthropic.com/api/v2/status.json, avoiding CORS), and RAG Status (synthesized — green only when both the index is ready and the Claude API is fully operational, yellow for degraded, red if either is down). The proxy polls every 60 seconds.
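The synthesis rule is small enough to show in full. A sketch with a hypothetical function name; `api_indicator` is the Statuspage `indicator` field, where `"none"` means fully operational:

```python
def synthesize_status(index_ready, api_indicator):
    """Fold index readiness and the Anthropic Statuspage indicator
    into one RAG badge color: green / yellow / red."""
    if not index_ready or api_indicator in ("major", "critical"):
        return "red"
    if api_indicator != "none":  # e.g. 'minor' or 'maintenance' -> degraded
        return "yellow"
    return "green"
```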
rag_chatbot/
├── __init__.py # Blueprint, routes, background indexing thread
├── services/
│ ├── pdf_processor.py # PDF → overlapping chunks with page metadata
│ ├── embeddings.py # Lazy-loaded sentence-transformers wrapper
│ ├── vectorstore.py # sqlite-vec schema, insert, KNN search
│ └── llm.py # Anthropic streaming + 529 retry logic
├── data/
│ ├── dmg.pdf # D&D 5e Dungeon Master's Guide (not committed)
│ ├── phb.pdf # D&D 5e Player's Handbook (not committed)
│ └── vectors.db # Built on first startup (not committed)
├── templates/
│ └── rag_chatbot/
│ └── index.html
└── static/
└── script.js
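The 529 retry logic noted next to `llm.py` amounts to exponential backoff with jitter. A dependency-free sketch; `Overloaded` stands in for the Anthropic SDK's `APIStatusError` with `status_code == 529`:

```python
import random
import time

class Overloaded(Exception):
    """Stand-in for an Anthropic 529 (overloaded_error) response."""

def with_retry(call, max_attempts=4, base=0.5, sleep=time.sleep):
    """Run call(); on Overloaded, wait 0.5s, 1s, 2s (+ jitter) between retries."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Overloaded:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base * 2 ** attempt + random.random() * 0.1)
```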
1. Add your PDF files to `data/`:
   - `data/dmg.pdf` — D&D 5e Dungeon Master's Guide
   - `data/phb.pdf` — D&D 5e Player's Handbook
2. Set your Anthropic API key in `.env` at the project root: `ANTHROPIC_API_KEY=sk-ant-...`
3. Start the server — indexing runs automatically in the background on first launch: `python app.py`
The app detects which PDFs are present and indexes them. If you add a new PDF later, delete `data/vectors.db` and restart to trigger re-indexing. To add more books, add an entry to `_SOURCES` in `__init__.py` and drop the PDF in `data/`.
Edit _SOURCES in __init__.py:
_SOURCES = [
{'filename': 'dmg.pdf', 'label': 'DMG'},
{'filename': 'phb.pdf', 'label': 'PHB'},
{'filename': 'xgte.pdf', 'label': "Xanathar's"}, # add new books here
]

Then delete `data/vectors.db` and restart. The app re-indexes everything automatically.