Cud (the bolus of food that llamas chew) is a local-first, multi-agent framework where simplicity is the ultimate foundation.
Designed to be lightweight, straightforward, and incredibly easy to use, Cud brings the power of autonomous AI agents directly to your local machine.
- **Simplicity First**: No convoluted setups or bloated abstractions. Cud is built to be intuitive, readable, and easy to hack on.
- **Local & Private**: Powered by Ollama, your data, prompts, and memory stay 100% on your machine.
- **Multi-Agent Architecture**: Create a fleet of specialized agents, each with its own persona, memory, and workspace.
- **Transparent**: Agent behavior is defined in plain Markdown (`AGENT.md`). No hidden prompts, no black boxes.
- **Tool-Rich**: Built-in support for persistent shell sessions, surgical filesystem operations, custom skills, and the Model Context Protocol (MCP).
- **Daemon-Ready**: Seamlessly run your agents as background services using `systemd`.
Cud relies on Ollama for local inference.
> **Important**: Tool calling support is required! Because Cud agents interact with your local environment (shell, filesystem, etc.), you must download and use models in Ollama that explicitly support tool calling (e.g., `gpt-oss:20b`, `gemma4:e4b`, `qwen3.6:27b`).
Requires Python 3.11+.
```shell
pipx install git+https://github.com/arrase/cud.git
```

Ensure you have Ollama installed and a tool-calling capable model downloaded:

```shell
ollama run gemma4:e4b
```

Create and configure your agent:

```shell
# Create an agent named "researcher"
cud agent create researcher

# Configure it to use a specific model with tool calling support
cud agent config researcher --model gemma4:e4b

# Set up your Discord token
cud gateway setup researcher discord --token YOUR_BOT_TOKEN

# Start it as a background service
cud gateway start researcher
```

You can also interact with your agent locally through a rich terminal interface:

```shell
# Start the local chat REPL
cud tui researcher
```

Inside the TUI, type `/help` to see available commands, or `/quit` to exit.
When interacting with your agent via the Discord Gateway, you can use the following slash commands:
- `/new`: Start a new Cud session in the current Discord thread (clears context history).
- `/model <model_name>`: Temporarily switch the agent's configured model.
- `/usage`: Show Cud runtime usage summary (agent name, current model, and thread ID).
- `/undo`: Remove the last exchange from the current thread.
- `/reload`: Reload tools and the system prompt (`AGENT.md`) for this agent.
- `/memory view`: View the contents of the agent's long-term `MEMORY.md`.
- `/memory clear`: Clear the agent's long-term `MEMORY.md`.
Every agent lives in `~/.cud/agents/<name>/`. This directory contains its entire "soul":
- `settings.yaml`: Model parameters and tool configurations.
- `AGENT.md`: The system prompt, defining the agent's persona and rules.
- `MEMORY.md`: Long-term memory that the agent can read and update.
- `history.db`: A SQLite-backed checkpointer for conversation state.
- `mcp.json`: MCP server configurations.
- `workspace/`: The dedicated directory where the agent runs commands and edits files.
- `workspace/skills/`: A directory for custom Markdown-defined abilities.
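Taken together, an agent's home directory looks roughly like this (a sketch assembled from the files listed above, using the "researcher" agent from the quick start; the `tasks/` and `conversation_history/` folders are described later in this README):

```
~/.cud/agents/researcher/
├── settings.yaml
├── AGENT.md
├── MEMORY.md
├── history.db
├── mcp.json
└── workspace/
    ├── skills/
    ├── tasks/
    └── conversation_history/
```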
Skills are portable sets of instructions and tools. Just drop a folder with a SKILL.md into an agent's workspace/skills/ directory, and it instantly gains those capabilities.
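As an illustration, a minimal skill might be laid out like this. The `SKILL.md` fields shown are hypothetical, modeled on the `TASK.md` frontmatter later in this README; check your installation for the exact schema.

```markdown
<!-- workspace/skills/changelog-writer/SKILL.md (hypothetical example) -->
---
name: "Changelog Writer"
description: "Turns recent git commits into a tidy changelog entry"
---
When asked to write a changelog:
1. Run `git log --oneline -20` in the workspace shell.
2. Group the commits by theme and summarize each group in one line.
```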
To keep agents fast and prevent them from hitting token limits during long conversations, Cud features automatic context compression.
- **Automatic Summarization**: When a conversation gets too long, older messages are automatically summarized by the LLM.
- **No Data Loss**: The full, uncompressed conversation history is safely offloaded to a Markdown file in the agent's workspace (`workspace/conversation_history/<thread_id>.md`).
- **Manual Compaction**: Agents have access to a `compact_conversation` tool, allowing them to proactively free up context when finishing a large task.
Agents can execute scheduled, periodic tasks autonomously. Tasks are defined as Markdown files (`TASK.md`) located in the agent's `workspace/tasks/<name>/` directory.
- Use YAML frontmatter to configure the `schedule` (a cron expression) and the destination (`channel_id` or `user_id`).
- The rest of the file is the prompt the agent executes.
- Ask your agent to create tasks for you, or edit them manually.
- Use the `/reload` command in Discord to activate changes.
- Check active tasks via the CLI: `cud task list <agent>`.
Example: `workspace/tasks/daily-news/TASK.md`

```markdown
---
name: "Daily Tech News"
description: "Searches for latest AI news and summarizes it"
schedule: "0 9 * * *"
channel_id: 123456789012345678
enabled: true
---
Search the web for the latest AI news.
Summarize the top 3 stories.
Make it sound enthusiastic!
```

Native support for MCP allows you to connect your agents to external tool servers (e.g., Brave Search, GitHub, Google Drive) with a single command.
For HTTP/SSE servers:

```shell
cud mcp add researcher https://mcp-server.example.com/sse --name search
```

For stdio servers (with arguments and environment variables):

```shell
cud mcp add researcher "npx -y @modelcontextprotocol/server-postgres postgresql://localhost/mydb" \
  --name db-search \
  --env POSTGRES_PASSWORD=secret
```

Define specialized subagents that your main agent can delegate tasks to. Each subagent has its own isolated context, skills, and MCP servers, keeping the orchestrator's context clean and focused.
Configure them in `settings.yaml`:

```yaml
subagents:
  - name: "research-agent"
    description: "Delegate here for complex research or web searches."
    system_prompt: "You are a research specialist. Search thoroughly and return concise summaries."
    model: "gemma4:e4b"        # Optional, inherits from main agent
    context_window: 65536      # Optional, inherits from main agent
    skills_paths:
      - "./workspace/skills/research"
    mcp_servers:
      - name: "brave-search"
        command: "npx"
        args: ["-y", "@modelcontextprotocol/server-brave-search"]
        env:
          BRAVE_API_KEY: "${BRAVE_API_KEY}"
  - name: "database-agent"
    description: "Delegate here when the user asks about customer or sales data."
    system_prompt: "You are a database analyst. Query the database and summarize results."
    skills_paths:
      - "./workspace/skills/database"
    mcp_servers:
      - name: "sqlite-server"
        command: "uvx"
        args: ["mcp-server-sqlite", "--db-path", "./data/ventas.db"]
```

- **Context isolation**: Subagent tool calls don't bloat the main agent's context; only the final result is returned.
- **Environment injection**: Use `${VAR_NAME}` in MCP `env` fields to inject secrets from the host environment.
- **Graceful failures**: If a subagent's MCP server fails to load (e.g., missing env vars), it is skipped and the agent continues normally.
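The `${VAR_NAME}` expansion can be pictured with a few lines of Python. This is a sketch of the likely behavior, not Cud's actual code; the helper name `expand_env` is made up for illustration:

```python
import os
import re

def expand_env(value: str) -> str:
    """Replace ${VAR_NAME} placeholders with values from the host environment."""
    return re.sub(r"\$\{(\w+)\}", lambda m: os.environ.get(m.group(1), ""), value)

# Simulate a secret exported in the host environment...
os.environ["BRAVE_API_KEY"] = "sk-demo"

# ...and an MCP `env` field from settings.yaml using a placeholder.
env_field = {"BRAVE_API_KEY": "${BRAVE_API_KEY}"}
resolved = {k: expand_env(v) for k, v in env_field.items()}
print(resolved)  # {'BRAVE_API_KEY': 'sk-demo'}
```

The benefit is that secrets live in the host environment (or a secrets manager) rather than in the checked-in `settings.yaml`.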
Cud is built on a robust, modular stack:
- **Orchestration**: DeepAgents provides the stateful, cyclic reasoning loops.
- **LLM Engine**: Ollama powers the local inference with tool calling capabilities.
To set up a local development environment:
```shell
# Clone the repository
git clone https://github.com/arrase/cud.git
cd cud

# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate

# Install development dependencies
pip install -e .
```

MIT License.