Build end-to-end data products on Databricks with AI-assisted development
This template is a complete data product builder powered by vibe coding. Start from a raw customer schema or existing data, and build your way through the full Databricks stack — Databricks Apps with Lakebase, medallion architecture (Bronze, Silver, Gold), semantic layer, Genie Spaces, ML pipelines, and GenAI agents — all guided by 50 agent skills and your AI coding assistant.
Vibe Coding is an AI-assisted development approach where you collaborate with AI tools (like Cursor, GitHub Copilot, Claude Code, Windsurf, or similar) to rapidly build, iterate, and deploy production-quality data products. Instead of writing every line from scratch, you describe what you want and let the AI handle the implementation — guided by structured agent skills that encode best practices.
Workshop participants: See PRE-REQUISITES.md for the full setup checklist (workspace access, CLI, IDE, and authentication).
| Tool | Required | Installation |
|---|---|---|
| Databricks Workspace | Yes | Access to a Databricks workspace with Unity Catalog |
| Databricks CLI | Yes | `curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh \| sh` |
| AI-Powered IDE | Yes | Cursor, Windsurf, VS Code + Copilot, or similar |
| Python 3.10+ | Yes | python.org |
| Git | Yes | git-scm.com |
```bash
git clone https://github.com/databricks-solutions/vibe-coding-workshop-template.git my-project
cd my-project
```

Get a FastAPI backend with Lakebase (managed PostgreSQL) running on Databricks Apps:
```bash
cd apps_lakebase
./scripts/setup.sh             # Configure auth + install deps
./scripts/deploy.sh --create   # Deploy to Databricks Apps
```

After deployment, open your app URL in the browser.
Take a raw schema CSV through the full medallion architecture to production AI agents — one prompt per stage:
- Drop your schema CSV into `data_product_accelerator/context/`
- Open your AI coding assistant and prompt:

  ```
  I have a customer schema at @data_product_accelerator/context/Wanderbricks_Schema.csv.
  Please design the Gold layer using @data_product_accelerator/skills/gold/00-gold-layer-design/SKILL.md
  ```

- Follow the 9-stage pipeline guide — one prompt per stage, one new conversation per stage.
Both paths work together. Build your data pipeline first, then deploy a Databricks App on top of it — or start with the app and add data products incrementally.
```
vibe-coding-workshop-template/
│
├── README.md # This file
├── QUICKSTART.md # Quick-start guide with two pathways
├── AGENTS.md # AI assistant routing (universal entry point)
├── PRE-REQUISITES.md # Workshop prerequisites checklist
├── CONTRIBUTING.md # Contribution guidelines
├── LICENSE.md # License
├── SECURITY.md # Security policy
├── env.example # Environment variable template
│
├── apps_lakebase/ # Databricks App (FastAPI + Lakebase)
│ ├── app.yaml # Databricks App entry point config
│ ├── pyproject.toml # Python dependencies
│ ├── server/ # FastAPI backend
│ │ ├── app.py # Main application
│ │ └── routers/
│ │ ├── health.py # Health/readiness endpoints
│ │ └── api.py # API endpoints (workspace, catalogs, query)
│ ├── scripts/ # Development & deployment scripts
│ │ ├── setup.sh # Configure auth & install deps
│ │ ├── deploy.sh # Deploy to Databricks Apps
│ │ ├── watch.sh # Local dev server (hot reload)
│ │ ├── run_local.sh # Test locally before deploying
│ │ ├── app_status.sh # Check deployed app status
│ │ ├── setup-lakebase.sh # Set up Lakebase tables
│ │ └── lakebase_manager.py # Lakebase connectivity & permissions
│ └── lakebase/
│ └── README.md # Lakebase DDL/DML reference
│
├── data_product_accelerator/ # 50 Agent Skills for End-to-End Data Products
│ ├── AGENTS.md # Detailed skill routing table
│ ├── QUICKSTART.md # One-prompt-per-stage guide (9 stages)
│ ├── README.md # Accelerator overview
│ ├── context/ # Schema CSV inputs (starting point)
│ │ ├── Wanderbricks_Schema.csv
│ │ └── booking_app_schema.csv
│ ├── skills/ # 50 skills across 12 domains
│ │ ├── admin/ # Skill creation, auditing (4)
│ │ ├── bronze/ # Bronze layer + Faker data (2)
│ │ ├── common/ # Cross-cutting shared skills (8)
│ │ ├── exploration/ # Ad-hoc notebooks (1)
│ │ ├── genai-agents/ # GenAI agent patterns (9)
│ │ ├── gold/ # Gold design + implementation (9)
│ │ ├── ml/ # MLflow pipelines (1)
│ │ ├── monitoring/ # Monitors, dashboards, alerts (5)
│ │ ├── planning/ # Project planning (1)
│ │ ├── semantic-layer/ # Metric Views, TVFs, Genie (6)
│ │ ├── silver/ # DLT pipelines, DQ rules (3)
│ │ └── skill-navigator/ # Master routing system (1)
│ └── docs/ # Framework design documentation
│
└── agentic-framework/ # Multi-Agent Build Framework
├── agents/ # Agent prompts for building multi-agent systems
│ ├── prd-analyzer.md # Parse PRDs, map to agent capabilities
│ ├── skill-scaffolder.md # Create new Agent Skills (SKILL.md)
│ ├── tool-builder.md # Build runtime Python tools
│ ├── agent-tester.md # Configure agent behavior tests
│ ├── agent-ui-wiring-prompt.md # Guide agent-to-UI wiring
│ ├── multi-agent-build-prompt.md # Orchestrator build with Foundation Models
│ ├── databricks-deployer.md # Deployment guidance
│ └── prd-template.md # PRD template
└── skills/
└── foundation-model-agent-loop/
└── SKILL.md # Tool-calling loop with Foundation Models
```

This template supports a unified workflow from raw data to production data products:
```
Raw Schema CSV or Existing Data
│
├─► Gold Design — dimensional model, ERDs, YAML schemas
├─► Bronze Layer — source tables + test data (Faker)
├─► Silver Layer — DLT pipelines + data quality expectations
├─► Gold Layer — tables, MERGE scripts, FK constraints
├─► Semantic Layer — Metric Views, TVFs, Genie Spaces
├─► Observability — Lakehouse Monitors, AI/BI Dashboards, SQL Alerts
├─► ML Pipelines — MLflow experiments, training, inference
├─► GenAI Agents — ResponsesAgent, evaluation, deployment
│
└─► Databricks App — FastAPI + Lakebase, deployed on Databricks Apps
```

Each stage is driven by a single prompt to your AI coding assistant. The 50 agent skills in `data_product_accelerator/` encode production-tested patterns so you get governed, high-quality output at every step.
The data_product_accelerator/ directory contains 50 agent skills organized by domain that guide your AI assistant through the entire pipeline:
| Domain | Skills | Focus |
|---|---|---|
| Gold | 9 | Dimensional modeling, ERDs, YAML schemas, MERGE scripts |
| GenAI Agents | 9 | ResponsesAgent, evaluation, deployment |
| Common | 8 | Asset Bundles, naming, constraints, imports |
| Semantic Layer | 6 | Metric Views, TVFs, Genie Spaces |
| Monitoring | 5 | Lakehouse Monitors, dashboards, SQL alerts |
| Admin | 4 | Skill creation, auditing, docs |
| Silver | 3 | DLT pipelines, expectations, DQX |
| Bronze | 2 | Bronze tables, Faker data generation |
| ML | 1 | MLflow pipelines |
| Planning | 1 | Project planning |
| Exploration | 1 | Ad-hoc notebooks |
| Skill Navigator | 1 | Master routing system |
See data_product_accelerator/QUICKSTART.md for the step-by-step 9-stage guide.
The apps_lakebase/ directory contains a production-ready FastAPI backend with Lakebase (managed PostgreSQL) integration, deployable to Databricks Apps.
Key features:
- FastAPI with structured logging and CORS
- Health, readiness, and liveness endpoints
- Unity Catalog integration (catalogs, schemas, tables, SQL queries)
- Lakebase PostgreSQL connectivity
- Optional frontend serving (React/Vite from `client/build`)
- Deploy scripts for the Databricks Apps platform
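As a rough illustration of the Lakebase connectivity item above, here is a minimal sketch that assumes the Postgres connection details are exposed as standard `PG*` environment variables and that the `psycopg` driver is installed; the template's own `setup-lakebase.sh` and `lakebase_manager.py` handle the real credential wiring, so treat every name below as a placeholder:

```python
# Hedged sketch: connect to a Lakebase (managed PostgreSQL) instance.
# PGHOST / PGDATABASE / PGUSER / PGPASSWORD are assumed env vars, not the
# template's actual configuration.
import os

import psycopg

conninfo = (
    f"host={os.environ['PGHOST']} "
    f"dbname={os.environ.get('PGDATABASE', 'databricks_postgres')} "
    f"user={os.environ['PGUSER']} "
    f"password={os.environ['PGPASSWORD']} "
    "sslmode=require"
)

with psycopg.connect(conninfo) as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT current_database(), current_user")
        print(cur.fetchone())
```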
All commands run from the apps_lakebase/ directory:
| Command | Description |
|---|---|
| `./scripts/setup.sh` | Configure authentication and install dependencies |
| `./scripts/deploy.sh --create` | First deployment (creates the app) |
| `./scripts/deploy.sh` | Update existing app |
| `./scripts/deploy.sh --verbose` | Deploy with detailed output |
| `./scripts/watch.sh` | Start local dev server with hot reload |
| `./scripts/run_local.sh` | Test app locally (production mode) |
| `./scripts/setup-lakebase.sh` | Set up Lakebase tables and permissions |
| `./scripts/app_status.sh` | Check app status and URL |
| Service | URL |
|---|---|
| API | http://localhost:8000 |
| API Documentation | http://localhost:8000/docs |
| Health Check | http://localhost:8000/health |
| Readiness Check | http://localhost:8000/ready |
| Frontend (if enabled) | http://localhost:5173 |
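Once the dev server is running, a quick hedged check of the health endpoints (using `requests`; any HTTP client works, and the exact response bodies depend on the app):

```python
# Hedged sketch: probe the local dev server's health and readiness endpoints.
import requests

for path in ("/health", "/ready"):
    resp = requests.get(f"http://localhost:8000{path}", timeout=5)
    print(path, resp.status_code, resp.text)
```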
The agentic-framework/ directory provides prompts and patterns for building multi-agent systems with Databricks Foundation Models.
Agent prompts (in agentic-framework/agents/):
| Agent | Purpose |
|---|---|
| prd-analyzer | Parse PRDs, map requirements to agent capabilities |
| skill-scaffolder | Create new Agent Skills (SKILL.md) for any domain |
| tool-builder | Build runtime Python tools for agents |
| agent-tester | Configure tests for agent behavior |
| agent-ui-wiring-prompt | Guide agent-to-UI integration |
| multi-agent-build-prompt | Build multi-agent orchestrators with Foundation Models |
| databricks-deployer | Deployment guidance for agents |
| prd-template | PRD template for agent projects |
Foundation Model Agent Loop (in agentic-framework/skills/foundation-model-agent-loop/):
- Pattern for tool-calling loops with Databricks Foundation Models (e.g., `databricks-meta-llama-3-3-70b-instruct`)
- Supports function calling without custom model deployment
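To make the pattern concrete, here is a hedged sketch of a tool-calling loop that talks to a Foundation Model serving endpoint through its OpenAI-compatible API. The environment variables, the model name, and the `get_weather` tool are illustrative assumptions rather than the framework's actual code; see the SKILL.md for the real pattern.

```python
# Hedged sketch: a minimal tool-calling loop against a Databricks Foundation
# Model serving endpoint via its OpenAI-compatible API. Env vars, the model
# name, and the example tool are assumptions, not values from this repo.
import json
import os

from openai import OpenAI

client = OpenAI(
    base_url=f"{os.environ['DATABRICKS_HOST']}/serving-endpoints",  # host incl. https://
    api_key=os.environ["DATABRICKS_TOKEN"],
)


def get_weather(city: str) -> str:
    # Placeholder tool; a real agent would call an actual backend here.
    return f"Sunny in {city}"


tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Amsterdam?"}]
while True:
    response = client.chat.completions.create(
        model="databricks-meta-llama-3-3-70b-instruct",
        messages=messages,
        tools=tools,
    )
    reply = response.choices[0].message
    if not reply.tool_calls:
        print(reply.content)
        break
    # Run each requested tool and feed the result back for the next turn.
    messages.append(reply)
    for call in reply.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": get_weather(**args),
        })
```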
When you run `./scripts/deploy.sh` from `apps_lakebase/`:

```
┌────────────────────────────────────────────────────────────────────┐
│ Deployment Process │
├────────────────────────────────────────────────────────────────────┤
│ │
│ 1. BUILD 2. GENERATE 3. SYNC 4. DEPLOY │
│ ──────────── ──────────── ──────────── ────────── │
│ Frontend requirements.txt Files to Start app │
│ (if exists) from pyproject Workspace runtime │
│ │
│ client/dist/ ──► requirements.txt ──► /Workspace/... ──► App URL │
│ │
│ 5. LAKEBASE (optional) │
│ ────────────────────── │
│ Set permissions + create tables │
│ │
└────────────────────────────────────────────────────────────────────┘
```

| File | Purpose |
|---|---|
| `apps_lakebase/app.yaml` | Defines how Databricks starts your app |
| `apps_lakebase/pyproject.toml` | Python dependencies |
| `apps_lakebase/server/app.py` | Your FastAPI application |
- Go to your Databricks workspace
- Click your username > User Settings > Developer > Access Tokens
- Generate a new token
- Run `./scripts/setup.sh` (from `apps_lakebase/`) and enter your token
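To sanity-check the token outside the app, a hedged snippet with the Databricks SDK; the environment variable names are assumptions, since `setup.sh` may store credentials differently (for example in `.env.local`):

```python
# Hedged sketch: verify a personal access token with the Databricks SDK.
# DATABRICKS_HOST and DATABRICKS_TOKEN are assumed env vars.
import os

from databricks.sdk import WorkspaceClient

w = WorkspaceClient(
    host=os.environ["DATABRICKS_HOST"],
    token=os.environ["DATABRICKS_TOKEN"],
)
print(w.current_user.me().user_name)
```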
- Configure a CLI profile:
  ```bash
  databricks auth login --host https://your-workspace.cloud.databricks.com --profile myprofile
  ```

- Run `./scripts/setup.sh` and select "CLI Profile"
- Enter your profile name
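The SDK can also pick up that CLI profile directly; a hedged one-liner check (the profile name is just the example from the step above):

```python
# Hedged sketch: authenticate the Databricks SDK with a named CLI profile.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient(profile="myprofile")  # profile created by `databricks auth login`
print(w.current_user.me().user_name)
```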
Edit `apps_lakebase/server/routers/api.py`:

```python
from fastapi import APIRouter

router = APIRouter()


@router.get("/hello")
async def hello():
    return {"message": "Hello from Databricks!"}


@router.get("/data/{table_name}")
async def get_data(table_name: str):
    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()
    # Query your data...
    return {"table": table_name}
```
- Edit `apps_lakebase/pyproject.toml`:

  ```toml
  dependencies = [
      "fastapi>=0.109.0",
      "your-new-package>=1.0.0",  # Add here
  ]
  ```

- Deploy:

  ```bash
  cd apps_lakebase
  ./scripts/deploy.sh
  ```
```bash
# Make sure to use --create on first deployment
cd apps_lakebase
./scripts/deploy.sh --create
```

```bash
# Reconfigure authentication
rm .env.local
./scripts/setup.sh
```

```bash
# Verify CLI is working
databricks --version
databricks current-user me
```

- Get your app URL: `./scripts/app_status.sh`
- Open `<app-url>/logz` in your browser (requires auth)
```bash
# Test locally before deploying
cd apps_lakebase
./scripts/run_local.sh
# Open http://localhost:8000
```

- PRE-REQUISITES.md — Workshop prerequisites checklist
- Data Product Accelerator QUICKSTART — 9-stage pipeline guide
- Databricks Apps Documentation
- Databricks CLI Installation
- Databricks SDK for Python
- FastAPI Documentation
- Agent Skills (SKILL.md) Format
- Unity Catalog
- Delta Live Tables
- Metric Views
This is a Git template repository. To use it:
- Click "Use this template" on GitHub, or clone directly
- Choose your starting point:
  - Deploy an app: `cd apps_lakebase && ./scripts/setup.sh && ./scripts/deploy.sh --create`
  - Build a data product: Drop a schema CSV in `data_product_accelerator/context/` and follow the 9-stage guide
  - Build agents: Use the prompts in `agentic-framework/agents/` to scaffold multi-agent systems
- Iterate with your AI coding assistant — the agent skills handle the patterns

Ready to build? Let's go!
```bash
git clone https://github.com/databricks-solutions/vibe-coding-workshop-template.git my-project
cd my-project
```