Usage Guide

Author: Markus van Kempen (markus.van.kempen@gmail.com) Last Updated: January 24, 2026

A comprehensive guide to using the SQL Query Generator system.

Quick Start
Using the Streamlit UI
Programmatic Usage
Query Examples
Agent Modes
Self-Improving System
Troubleshooting

Quick Start

Prerequisites

Python 3.10+
IBM WatsonX account with API credentials
SQLite (included with Python)

Installation

# Clone and setup
cd cs-sqltest-agentskills
python3 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

# Configure credentials
cp .env.example .env
# Edit .env with your WatsonX credentials

# Initialize database
python scripts/init_database.py

# Run the app
streamlit run app.py

Environment Configuration

Create a .env file:

# Required
WATSONX_API_KEY=your_api_key_here
WATSONX_URL=https://us-south.ml.cloud.ibm.com
WATSONX_PROJECT_ID=your_project_id_here

# Optional
DATABASE_PATH=./data/database.db
LEARNING_DB_PATH=./data/learning.db
WATSONX_MODEL_ID=ibm/granite-4-h-small

Using the Streamlit UI

Interface Overview

┌────────────────────────────────────────────────────────────────────┐
│  SQL Query Generator                                                │
├────────────────────────────────────────────────────────────────────┤
│  Sidebar:                           Main Area:                      │
│  ┌─────────────────┐               ┌──────────────────────────────┐│
│  │ 🤖 LLM Model    │               │ Chat History                 ││
│  │ [Granite 4 ▼]   │               │                              ││
│  │                 │               │ 👤 show laptop sales in USA  ││
│  │ 🔧 Agent Mode   │               │                              ││
│  │ [BeeAI Agent ▼] │               │ 🤖 Found 5 results:          ││
│  │                 │               │ ┌────────────────────────┐   ││
│  │ 🔍 Debug Info   │               │ │ product_name | country │   ││
│  │ [Expanded]      │               │ │ Laptop Pro   | USA     │   ││
│  │                 │               │ │ ...          | ...     │   ││
│  │ Processing:     │               │ └────────────────────────┘   ││
│  │ • Query: 50ms   │               │                              ││
│  │ • LLM: 800ms    │               │ 👍 👎  Rate this result      ││
│  │ • SQL: 5ms      │               │                              ││
│  └─────────────────┘               └──────────────────────────────┘│
│                                                                     │
│  ┌───────────────────────────────────────────────────────────────┐ │
│  │ Enter your query...                                    [Send] │ │
│  └───────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────┘

Step-by-Step Usage

Select a Model (sidebar)
- Granite 4 Small - Recommended for most queries
- Llama 3.3 70B - Best for complex reasoning
- Mistral Large - Good multilingual support
Select Agent Mode (sidebar)
- Direct SQL - Fastest, simple queries
- LangChain Agent - Best for complex queries
- BeeAI Agent - Most reliable
- Self-Improving - Learns from feedback
Enter Your Query
- Type natural language in the chat input
- Press Enter or click Send
Review Results
- View generated SQL in debug panel
- See query results in table format
- Check processing times
Provide Feedback
- Click 👍 if results are correct
- Click 👎 if results are wrong
- Feedback improves future queries

Follow-up Queries

The system understands conversational context:

You: show laptop sales
Bot: [Shows laptop sales data]

You: same for USA
Bot: [Shows laptop sales filtered to USA]

You: top 5 by revenue
Bot: [Shows top 5 USA laptop sales by revenue]

Programmatic Usage

Basic SQL Generation

from watsonx_client import WatsonXClient
from schema_loader import get_schema_string

# Initialize client
client = WatsonXClient(model_id="ibm/granite-4-h-small")

# Generate SQL
sql = client.generate_sql_query(
    user_query="show all products under $100",
    schema_info=get_schema_string()
)

print(f"Generated SQL: {sql}")
# Output: SELECT * FROM products WHERE price < 100

Using Agents with Execution

from beeai_agent import BeeAISQLAgent

# Initialize agent
agent = BeeAISQLAgent(
    model_id="ibm/granite-4-h-small",
    db_path="./data/database.db"
)

# Run query with execution
result = agent.run_query("show laptop sales by country")

if result['success']:
    print(f"SQL: {result['sql']}")
    print(f"Rows: {result['row_count']}")
    for row in result['result']:
        print(row)
else:
    print(f"Error: {result['error']}")

Using the Learning Store

from learning_store import get_learning_store

store = get_learning_store()

# Find similar patterns
patterns = store.find_similar_patterns("show laptop revenue", limit=3)
for p in patterns:
    print(f"Similar: {p['user_query']} (similarity: {p['similarity']:.2f})")

# Store a successful pattern
pattern_id = store.store_success_pattern(
    user_query="show laptop sales",
    generated_sql="SELECT * FROM sales WHERE product_name LIKE '%laptop%'",
    execution_time_ms=50
)

# Record feedback
store.record_feedback(pattern_id, is_positive=True)

Query Classification

from query_classifier import classify_query

# Database query
cls, msg = classify_query("show all products")
print(f"Classification: {cls}")  # 'database'

# General question
cls, msg = classify_query("what time is it")
print(f"Classification: {cls}")  # 'general'

# Unknown entity
cls, msg = classify_query("show BMW sales")
print(f"Classification: {cls}")  # 'unknown_entity'
print(f"Message: {msg}")  # "I couldn't find 'BMW' in the database..."

Context-Aware Queries

from context_manager import get_conversation_context

ctx = get_conversation_context()

# First query
ctx.add_query(
    query="show laptop sales",
    query_type="database",
    sql="SELECT * FROM sales WHERE product_name LIKE '%laptop%'",
    result_count=10
)

# Follow-up query
expanded = ctx.resolve_follow_up("same for USA")
print(f"Expanded: {expanded}")  # "show laptop sales in USA"

Query Examples

Product Queries

Natural Language	Generated SQL
"show all products"	`SELECT * FROM products`
"list laptop products"	`SELECT * FROM products WHERE product_name LIKE '%laptop%'`
"products under $100"	`SELECT * FROM products WHERE price < 100`
"how many products"	`SELECT COUNT(*) FROM products`

Sales Queries

Natural Language	Generated SQL
"total revenue"	`SELECT SUM(total_amount) FROM sales`
"sales by country"	`SELECT country, SUM(total_amount) FROM sales GROUP BY country`
"laptop sales in USA"	`SELECT * FROM sales WHERE product_name LIKE '%laptop%' AND country = 'USA'`
"top 5 products by sales"	`SELECT product_name, SUM(total_amount) FROM sales GROUP BY product_name ORDER BY SUM(total_amount) DESC LIMIT 5`

Customer Queries

Natural Language	Generated SQL
"list all customers"	`SELECT * FROM customers`
"customers from USA"	`SELECT * FROM customers WHERE country = 'USA'`
"how many customers in Europe"	`SELECT COUNT(*) FROM customers WHERE region = 'Europe'`

Complex Queries

Natural Language	Generated SQL
"revenue per country for laptops"	`SELECT country, SUM(total_amount) FROM sales WHERE product_name LIKE '%laptop%' GROUP BY country`
"customers who bought smartphones"	`SELECT DISTINCT c.* FROM customers c JOIN orders o ON c.customer_id = o.customer_id JOIN products p ON o.product_id = p.product_id WHERE p.product_name LIKE '%smartphone%'`

Agent Modes

Mode Comparison

Mode	Speed	Accuracy	Learning	Best For
Direct SQL	⚡⚡⚡	⭐⭐	✅	Simple queries, speed-critical
LangChain	⚡⚡	⭐⭐⭐	✅	Complex queries, auto-retry
BeeAI	⚡⚡	⭐⭐⭐	✅	Reliability, native WatsonX
Self-Improving	⚡	⭐⭐⭐⭐	✅✅✅	Continuous improvement

When to Use Each Mode

Direct SQL (watsonx_client.py)

Dashboard queries with known patterns
High-volume, simple queries
When you need fastest response

LangChain Agent (langchain_agent.py)

Complex joins and aggregations
Queries that might need retry
When schema changes frequently

BeeAI Agent (beeai_agent.py)

Production deployments
When using other BeeAI features
Best balance of speed/accuracy

Self-Improving (self_improving_agent.py)

Learning from user patterns
Building a knowledge base
When feedback is available

Self-Improving System

How It Works

Query Embedding: Your query is converted to a 384-dimensional vector
Pattern Search: Find similar successful queries in the learning store
Few-Shot Prompting: Include similar examples in the LLM prompt
Error Avoidance: Include common mistakes to avoid
Result Storage: Store successful patterns for future use
Feedback Loop: User 👍/👎 improves pattern quality

Viewing Learning Statistics

from learning_store import get_learning_store

store = get_learning_store()
stats = store.get_stats()

print(f"Successful patterns: {stats['total_patterns']}")
print(f"Error patterns: {stats['total_errors']}")
print(f"Average rating: {stats['avg_rating']:.2f}")

Resetting the Learning Store

import os
os.remove("./data/learning.db")
# The store will reinitialize on next use

Troubleshooting

Common Issues

"Missing WatsonX credentials"

Solution: Check your .env file contains:
- WATSONX_API_KEY
- WATSONX_URL
- WATSONX_PROJECT_ID

"Database not found"

# Reinitialize the database
python scripts/init_database.py

"SQL Error: no such column"

This usually means the LLM used a wrong column name.
Solution: Try rephrasing your query or use a more capable model (Llama 70B)

"Query returns no results"

The system will analyze why:
- Check if the entity exists: "show BMW sales" → "BMW not in database"
- Check filters: The data might not have matches for your criteria

Slow Response Times

Solutions:
1. Use a faster model (Granite 4 Small vs Llama 70B)
2. Use Direct SQL mode for simple queries
3. Check if embedding cache is working (see debug panel)

Debug Information

Enable detailed logging:

from logging_config import setup_logging
setup_logging(level="DEBUG")

View processing steps in UI:

Expand "🔍 Debug Info" in sidebar
See timing breakdown for each step
View the actual LLM prompt sent
See similar patterns found

Getting Help

Check ARCHITECTURE.md for system design
Check API.md for module documentation
Check COMPARISON_RESULTS.md for test results

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Usage Guide

Table of Contents

Quick Start

Prerequisites

Installation

Environment Configuration

Using the Streamlit UI

Interface Overview

Step-by-Step Usage

Follow-up Queries

Programmatic Usage

Basic SQL Generation

Using Agents with Execution

Using the Learning Store

Query Classification

Context-Aware Queries

Query Examples

Product Queries

Sales Queries

Customer Queries

Complex Queries

Agent Modes

Mode Comparison

When to Use Each Mode

Self-Improving System

How It Works

Viewing Learning Statistics

Resetting the Learning Store

Troubleshooting

Common Issues

"Missing WatsonX credentials"

"Database not found"

"SQL Error: no such column"

"Query returns no results"

Slow Response Times

Debug Information

Getting Help

FilesExpand file tree

USAGE.md

Latest commit

History

USAGE.md

File metadata and controls

Usage Guide

Table of Contents

Quick Start

Prerequisites

Installation

Environment Configuration

Using the Streamlit UI

Interface Overview

Step-by-Step Usage

Follow-up Queries

Programmatic Usage

Basic SQL Generation

Using Agents with Execution

Using the Learning Store

Query Classification

Context-Aware Queries

Query Examples

Product Queries

Sales Queries

Customer Queries

Complex Queries

Agent Modes

Mode Comparison

When to Use Each Mode

Self-Improving System

How It Works

Viewing Learning Statistics

Resetting the Learning Store

Troubleshooting

Common Issues

"Missing WatsonX credentials"

"Database not found"

"SQL Error: no such column"

"Query returns no results"

Slow Response Times

Debug Information

Getting Help