
TaaS — Token-as-a-Service

CloudSigma's sovereign cloud LLM inference platform. OpenAI-compatible API with local data residency.

Architecture

Clients (OpenAI SDK compatible)
         │
    Caddy (TLS + API key auth + logging)
         │
    ├── /v1/chat/completions
    ├── /v1/completions  
    ├── /v1/models
    └── /v1/embeddings
         │
    vLLM Backends (GPU inference)
    ├── DeepSeek V3 685B (4×B200, TP=4) — port 8001
    ├── Qwen 2.5 72B (1×B200)           — port 8002
    ├── Qwen2.5-Coder 32B (1×B200)      — port 8003
    └── BGE-M3 Embeddings (CPU)         — port 8004
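The gateway fans each request out to one backend per model. A minimal sketch of that dispatch as a lookup table — the ports come from the diagram above, but the Hugging Face model IDs for the Qwen and BGE entries are assumptions and may differ from the deployed names:

```python
# Sketch of per-model dispatch behind the gateway (ports from the
# architecture diagram; Qwen/BGE model IDs are assumed, not confirmed).
MODEL_BACKENDS = {
    "deepseek-ai/DeepSeek-V3": 8001,
    "Qwen/Qwen2.5-72B-Instruct": 8002,        # assumed ID
    "Qwen/Qwen2.5-Coder-32B-Instruct": 8003,  # assumed ID
    "BAAI/bge-m3": 8004,                      # assumed ID
}

def backend_port(model: str) -> int:
    """Map a request's "model" field to its vLLM backend port."""
    return MODEL_BACKENDS[model]

print(backend_port("deepseek-ai/DeepSeek-V3"))  # → 8001
```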

Quick Start

# Install
pip install vllm

# Start a model
vllm serve deepseek-ai/DeepSeek-V3 \
  --tensor-parallel-size 4 \
  --port 8001

# Use it
curl http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"deepseek-ai/DeepSeek-V3","messages":[{"role":"user","content":"Hello"}]}'
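The same request can be issued from Python with only the standard library. Building the Request object is side-effect free; actually sending it with urlopen requires the vLLM server started above:

```python
import json
import urllib.request

# Python equivalent of the curl call above (standard library only).
body = json.dumps({
    "model": "deepseek-ai/DeepSeek-V3",
    "messages": [{"role": "user", "content": "Hello"}],
}).encode()

req = urllib.request.Request(
    "http://localhost:8001/v1/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# urllib.request.urlopen(req) would send it -- that needs the server running.
print(req.get_method(), req.full_url)
```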

Components

  • deploy/ — Deployment scripts and systemd services
  • gateway/ — Caddy configuration and API key middleware
  • admin/ — Admin CLI for key management
  • docs/ — Architecture and operational docs
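The gateway/ component pairs Caddy with API-key checks. A hypothetical Caddyfile fragment sketching only the path fan-out from the architecture diagram — the hostname is a placeholder, and the TLS, auth, and logging directives the real gateway adds are omitted:

```Caddyfile
# Hypothetical sketch -- the real configuration lives in gateway/.
api.example.com {
	@chat path /v1/chat/completions /v1/completions /v1/models
	reverse_proxy @chat localhost:8001
	reverse_proxy /v1/embeddings localhost:8004
}
```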

Status

Phase 1 MVP — In Development
