A streamlined Docker Compose stack for running a local AI environment with Ollama, Kokoro TTS, and Open WebUI.
- Ollama: Run large language models (like Llama 3.2) locally.
- Kokoro TTS: High-quality, fast text-to-speech engine.
- Open WebUI: A comprehensive web interface for interacting with your AI models (currently optional/commented out).
- Timezone Support: Pre-configured for America/Los_Angeles.
- Docker
- Docker Compose
- (Optional) NVIDIA GPU with NVIDIA Container Toolkit for hardware acceleration.
- Clone the repository:

  ```sh
  git clone <repository-url>
  cd ai-stack
  ```

- Configure Environment: Copy the example environment file and adjust values as needed.

  ```sh
  cp .env.example .env
  ```

  Note: Ensure you set your `HF_TOKEN` if using Open WebUI.

- Start the Stack:

  ```sh
  docker compose up -d
  ```
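For reference, a minimal `.env` might look like the following. The variable names `TZ`, `LLM_MODEL`, and `HF_TOKEN` are the ones this README mentions; the values shown are illustrative placeholders, not defaults from the repository.

```env
# Timezone applied to all services
TZ=America/Los_Angeles

# Model pulled on startup by the ollama-init service
LLM_MODEL=llama3.2

# Hugging Face token (required if using Open WebUI)
HF_TOKEN=your-hf-token-here
```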
| Service | Host Port | Internal Port | Description |
|---|---|---|---|
| Ollama | 11434 | 11434 | LLM Backend |
| Kokoro TTS | 8880 | 8880 | Text-to-Speech API |
| Open WebUI | 3000 | 8080 | Web Interface (Optional) |
The stack is configured to pull the model specified by `LLM_MODEL` (default: `llama3.2`) on startup if using the `ollama-init` service.
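As a sketch of how such an init service is commonly wired up (the actual service definition in this repository's `docker-compose.yaml` may differ), a one-shot `ollama-init` container could look like:

```yaml
  # Hypothetical sketch: pulls the configured model against the running
  # Ollama service, then exits.
  ollama-init:
    image: ollama/ollama
    depends_on:
      - ollama
    environment:
      # Point the ollama CLI at the ollama service instead of localhost
      - OLLAMA_HOST=http://ollama:11434
    entrypoint: ["/bin/sh", "-c", "ollama pull ${LLM_MODEL:-llama3.2}"]
    restart: "no"
```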
To enable GPU support for Ollama, uncomment the `deploy` section in `docker-compose.yaml`:

```yaml
# ...
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
# ...
```

All services are configured to use the America/Los_Angeles timezone by default. You can change this in the `environment` section of each service in `docker-compose.yaml`.
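For example, to switch a single service to UTC, the timezone entry under `environment` might be changed like this (shown here for the `ollama` service; the same pattern applies to the others):

```yaml
  ollama:
    # ...
    environment:
      - TZ=UTC  # was America/Los_Angeles
```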