
Goodseed - Architecture & API Reference

Architecture

Storage

Each run is a single SQLite file at ~/.goodseed/projects/<project>/runs/<run_id>.sqlite.

SQLite WAL mode allows the training process to write while the server reads concurrently. On close(), the WAL is checkpointed so the result is one file.
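The pattern, as a minimal sketch (illustrative, not Goodseed's actual code):

import sqlite3

conn = sqlite3.connect("run.sqlite")
conn.execute("PRAGMA journal_mode=WAL")   # readers in other processes are not blocked
conn.execute("CREATE TABLE IF NOT EXISTS run_meta(key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT OR REPLACE INTO run_meta(key, value) VALUES (?, ?)",
             ("status", "running"))
conn.commit()

# On close: fold the WAL back into the main database so one .sqlite file remains.
conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
conn.close()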

Schema:

run_meta(key TEXT PRIMARY KEY, value TEXT)
configs(path TEXT PRIMARY KEY, type_tag TEXT, value TEXT, updated_at TEXT,
        uploaded INTEGER NOT NULL DEFAULT 0)
metric_series(id INTEGER PRIMARY KEY AUTOINCREMENT, path TEXT NOT NULL UNIQUE)
metric_points(series_id INTEGER, step REAL, y REAL, ts INTEGER,
              uploaded INTEGER NOT NULL DEFAULT 0, PRIMARY KEY (series_id, step))
string_series(id INTEGER PRIMARY KEY AUTOINCREMENT, path TEXT NOT NULL UNIQUE)
string_points(series_id INTEGER, step REAL, value TEXT, ts INTEGER,
              uploaded INTEGER NOT NULL DEFAULT 0, PRIMARY KEY (series_id, step))

The uploaded column tracks whether each row has been synced to the remote API. The background sync thread reads rows with uploaded=0, uploads them, and marks them uploaded=1 using optimistic concurrency: the UPDATE's WHERE clause repeats the data values that were read, so a row overwritten between the read and the mark stays at uploaded=0 instead of being falsely marked as synced.
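A sketch of that update (illustrative, not the actual implementation; upload_to_api is a hypothetical stand-in for the real API call):

import sqlite3

conn = sqlite3.connect("run.sqlite")  # assumes the schema above already exists

def upload_to_api(rows):
    ...  # hypothetical: POST the rows to the remote API

# Read a batch of pending metric points.
rows = conn.execute(
    "SELECT series_id, step, y FROM metric_points WHERE uploaded = 0 LIMIT 100"
).fetchall()

upload_to_api(rows)

# Mark a row uploaded only if its value is unchanged since the read;
# a row rewritten in between keeps uploaded=0 and is picked up again.
for series_id, step, y in rows:
    conn.execute(
        "UPDATE metric_points SET uploaded = 1 "
        "WHERE series_id = ? AND step = ? AND y = ? AND uploaded = 0",
        (series_id, step, y),
    )
conn.commit()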

Server

goodseed serve starts a local HTTP server (http.server.ThreadingHTTPServer, no dependencies) that scans the projects directory and serves run data as JSON.

Endpoints:

GET /api/projects                                                  -- list all projects
GET /api/runs                                                      -- list all runs
GET /api/runs?project=<name>                                       -- list runs filtered by project
GET /api/runs/<project>/<run_id>/configs                           -- config key-value pairs
GET /api/runs/<project>/<run_id>/metrics                           -- all metric points
GET /api/runs/<project>/<run_id>/metrics?path=loss                 -- filtered by path
GET /api/runs/<project>/<run_id>/metric-paths                      -- list of metric names
GET /api/runs/<project>/<run_id>/string_series                     -- all string series points
GET /api/runs/<project>/<run_id>/string_series?path=<path>         -- filtered by path
GET /api/runs/<project>/<run_id>/string_series?limit=50&offset=0   -- paginated
GET /api/runs/<project>/<run_id>/string_series?tail=20             -- last N entries

CORS is enabled (Access-Control-Allow-Origin: *) so the frontend at goodseed.ai can connect.
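For example, reading a run's loss series with only the standard library (project and run names are placeholders):

import json
from urllib.request import urlopen

url = "http://localhost:8765/api/runs/myproject/bold-falcon/metrics?path=loss"
with urlopen(url) as resp:   # 8765 is the default port of `goodseed serve`
    points = json.load(resp)
print(points)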

Monitoring

Goodseed automatically captures system metrics and console output in a background thread.

Namespace: monitoring/<8-char-hash>/ where the hash is derived from hostname:pid:tid. This ensures each process (and each thread within it) gets its own namespace.
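The exact hash function is an implementation detail; an illustrative equivalent:

import hashlib
import os
import socket
import threading

# Derive a short, stable ID from hostname:pid:tid (sha256 here is an assumption).
ident = f"{socket.gethostname()}:{os.getpid()}:{threading.get_ident()}"
namespace = "monitoring/" + hashlib.sha256(ident.encode()).hexdigest()[:8]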

Console capture (ConsoleCaptureDaemon, 1s interval):

  • Wraps sys.stdout / sys.stderr with StreamWithMemory, which buffers writes and forwards them to the original stream (see the sketch after this list)
  • Every second, drains buffered lines and logs them as string series at monitoring/<hash>/stdout and monitoring/<hash>/stderr
  • Handles \r (carriage return) for progress bars — overwrites the current line
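A stripped-down sketch of the wrapper (the real StreamWithMemory also handles \r and other details):

import sys

class StreamWithMemory:
    """Illustrative stand-in: remember writes while passing them through."""

    def __init__(self, original):
        self.original = original
        self.buffer = []

    def write(self, text):
        self.buffer.append(text)     # kept for the daemon to drain each second
        self.original.write(text)    # forwarded so output still appears live

    def flush(self):
        self.original.flush()

    def drain(self):
        text, self.buffer = "".join(self.buffer), []
        return text

sys.stdout = StreamWithMemory(sys.stdout)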

Hardware metrics (HardwareMonitorDaemon, 10s interval):

  • CPU and memory utilisation via psutil (optional dependency)
  • NVIDIA GPU via nvidia-smi --query-gpu=... --format=csv (ships with the driver, no pip dependency; see the sketch after this list)
  • AMD GPU via rocm-smi --showuse --showmemuse --showpower --json (ships with the driver, no pip dependency)
  • Metrics: cpu, memory, gpu, gpu_memory, gpu_power (multi-GPU: gpu_0, gpu_1, ...)
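A sketch of the NVIDIA query (field selection and parsing here are illustrative and may differ from the actual daemon):

import subprocess

# Ask the driver for per-GPU utilisation, memory and power as bare CSV numbers.
out = subprocess.run(
    ["nvidia-smi",
     "--query-gpu=utilization.gpu,memory.used,power.draw",
     "--format=csv,noheader,nounits"],
    capture_output=True, text=True, check=True,
).stdout

for i, line in enumerate(out.strip().splitlines()):
    gpu_util, mem_used, power = (float(v) for v in line.split(", "))
    print(f"gpu_{i}: util={gpu_util}% mem={mem_used}MiB power={power}W")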

Traceback capture: On exception inside a with Run(...) as run: block, the formatted traceback is logged as a string series at monitoring/<hash>/traceback and status is set to failed.

Static metadata: Logged as configs at monitoring/<hash>/hostname, monitoring/<hash>/pid, monitoring/<hash>/tid.

Run Status

Runs have a status field in run_meta: running (default), finished, or failed. Status is set when close() is called. The context manager sets failed on exception, finished otherwise.

Data Types

Configs are stored with type tags:

Python Type   type_tag       Stored as
bool          "bool"         "true" / "false"
int           "int"          "42"
float         "float"        "3.14"
str           "str"          "hello"
datetime      "datetime"     ISO 8601 string
None          "null"         ""
set           "string_set"   JSON array ["a","b"]

Metrics are always float.
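An illustrative tagging function matching the table above (the real serializer lives in utils.py and its details may differ):

import json
from datetime import datetime

def tag_config(value):
    """Illustrative: map a Python value to (type_tag, stored_text)."""
    if isinstance(value, bool):   # bool first: bool is a subclass of int
        return "bool", "true" if value else "false"
    if isinstance(value, int):
        return "int", str(value)
    if isinstance(value, float):
        return "float", str(value)
    if isinstance(value, datetime):
        return "datetime", value.isoformat()
    if value is None:
        return "null", ""
    if isinstance(value, set):
        return "string_set", json.dumps(sorted(value))
    return "str", str(value)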

API Reference

goodseed.Run

Run(
    name: str | None = None,            # display name (sys/name)
    description: str | None = None,     # free-form text (sys/description)
    tags: list[str] | None = None,      # tags (sys/tags)
    project: str | None = None,         # "workspace/project" format; default: GOODSEED_PROJECT or "default"
    run_id: str | None = None,          # unique ID; falls back to GOODSEED_RUN_ID env, then auto-generated
    resume_run_id: str | None = None,   # resume an existing run (mutually exclusive with run_id)
    storage: str | Storage | None = None,  # "disabled", "local", or "cloud"; env GOODSEED_STORAGE; default "cloud"
    api_key: str | None = None,         # API key; falls back to GOODSEED_API_KEY env
    read_only: bool = False,            # write methods raise; read behavior depends on storage mode
    goodseed_home: str | Path | None = None,  # override for ~/.goodseed
    log_dir: str | Path | None = None,  # override directory for the .sqlite file
    # Monitoring options (all default to True):
    capture_stdout: bool = True,        # capture print() output
    capture_stderr: bool = True,        # capture stderr output
    capture_hardware_metrics: bool = True,  # CPU, memory, GPU
    capture_traceback: bool = True,     # log traceback on exception
    monitoring_namespace: str | None = None,  # override "monitoring/<hash>"
)
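For example (project name is a placeholder):

import goodseed

run = goodseed.Run(
    name="baseline",
    project="myworkspace/myproject",
    tags=["v1"],
    storage="local",   # keep everything in the local SQLite file
)
run.log_metrics({"loss": 1.0}, step=0)
run.close()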

Automatic namespaces

Each run automatically populates three namespaces:

Namespace     Contents
sys/          Run metadata: id, name, description, tags, creation_time, state
monitoring/   Hardware metrics (CPU, GPU), stdout/stderr streams, tracebacks
source_code/  Git info, diffs

The sys/state field is updated to finished or failed when the run closes.

run["key"] = value

Set a config value, or assign a dictionary/namespace of values under a prefix.

# Scalar values
run["score"] = 0.97
run["model_name"] = "resnet50"

# Dictionary (flattened under key as namespace)
run["parameters"] = {"lr": 0.001, "batch_size": 32}
# Stores: parameters/lr = 0.001, parameters/batch_size = 32

# Nested dictionary
run["parameters"] = {"train": {"max_epochs": 10}}
# Stores: parameters/train/max_epochs = 10

# argparse.Namespace
run["parameters"] = argparse.Namespace(lr=0.01, batch=32)
# Stores: parameters/lr = 0.01, parameters/batch = 32

# Edit sys/ fields
run["sys/name"] = "new-name"
run["sys/description"] = "updated description"

run["key"].log(value, *, step)

Log a value to a series. Numeric values (int, float) create metric series. String values create string series.

# Log metrics
run["train/loss"].log(0.9, step=0)
run["train/loss"].log(0.8, step=1)

# Log with custom step
run["metric"].log(value=acc, step=i)

# Log string series
run["generated_text"].log("hello world", step=0)

step is required.

run["sys/tags"].add(value)

Add tags to the run. Accepts a single string or a list of strings.

run["sys/tags"].add("production")
run["sys/tags"].add(["v2", "bert"])

run.log_configs(data, flatten=True)

Log configuration key-value pairs (batch method).

run.log_configs({"learning_rate": 0.001, "optimizer": "adam"})

# Flatten nested dicts
run.log_configs({"model": {"hidden": 256, "layers": 4}}, flatten=True)
# Stores: "model/hidden" = 256, "model/layers" = 4

run.log_metrics(data, step)

Log metric values at a given step (batch method).

run.log_metrics({"loss": 0.5, "accuracy": 0.85}, step=100)
  • step can be int or float
  • Same step + path overwrites the previous value
  • Values are coerced to float

run.log_strings(data, step)

Log string series values at a given step (batch method).

run.log_strings({"output": "Generated text here..."}, step=100)
  • step can be int or float
  • Same step + path overwrites the previous value
  • Values are coerced to str

run.close(status='finished')

Close the run. When remote sync is enabled, close() blocks until all remaining data is uploaded; it then checkpoints the WAL and closes the database connection.
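For example:

run.close()                    # status defaults to 'finished'
# run.close(status='failed')   # or record the run as failed explicitly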

Resuming a run

run = goodseed.Run(resume_run_id="bold-falcon")

# Continue logging
run["train/loss"].log(0.3, step=123)
run["eval/f1"] = 0.85
run.close()

  • The run must not be currently running (status must be finished or failed)
  • Auto-step state is restored from the existing data
  • run_id and resume_run_id are mutually exclusive

Context Manager

with goodseed.Run(name="exp") as run:
    run.log_metrics({"loss": 0.5}, step=1)
# status='finished' on normal exit, 'failed' on exception

Storage Modes

The storage parameter (or GOODSEED_STORAGE env var) controls where data is stored:

Mode              Local SQLite   Remote Sync   Remote Reads
cloud (default)   Yes            Yes           Yes
local             Yes            No            No
disabled          No             No            No

When read_only=True:

Mode       Writes   Local Reads   Remote Reads
cloud      Raise    No            Yes
local      Raise    Yes           No
disabled   Raise    Raise         Raise

Cloud Sync

When storage="cloud" (the default), a background thread uploads data to the remote API. SQLite in WAL mode serves as the durable queue between training writes and uploader reads.

  • Sync thread: uses threading.Thread + events (see the sketch after this list). It reads unuploaded rows, sends them to the API, and marks them uploaded on confirmed success.
  • Close behavior: run.close() signals the sync thread to drain remaining data and blocks until upload is complete.
  • Shutdown behavior: per-run cleanup is registered with atexit, and context-manager usage (with Run(...)) closes runs automatically. For hard kills (for example SIGKILL), Python cleanup cannot run.
  • Manual upload: if sync was disabled or interrupted, use goodseed upload -p <project> [--run-id <run_id>] to upload remaining data.
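The loop shape, roughly (a sketch under the assumptions above; upload_pending is a hypothetical stand-in for the drain step):

import threading

stop = threading.Event()

def upload_pending():
    ...  # hypothetical: read uploaded=0 rows, POST them, mark uploaded=1

def sync_loop():
    # Wake periodically and drain; make one final pass after close()
    # signals the event, so nothing is left behind.
    while not stop.wait(timeout=2.0):
        upload_pending()
    upload_pending()

thread = threading.Thread(target=sync_loop, daemon=True)
thread.start()

# close() then does roughly:
stop.set()
thread.join()   # blocks until the final drain completes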

Requires GOODSEED_API_KEY environment variable or api_key parameter.

Data Fetching

Read data back from the remote API using storage="cloud" with read_only=True. Read local data using storage="local" with read_only=True.

# Open a read-only handle to an existing run on the server
run = goodseed.Run(project="ws/proj", run_id="bold-falcon", api_key="gsk_...", read_only=True)

# Fetch metric paths and data
paths = run.get_metric_paths()              # ["train/loss", "train/acc"]
data = run.get_metric_data("train/loss")    # {path, downsampled, raw_points: [{step, y}]}

# Fetch string series
spaths = run.get_string_paths()             # ["notes", ...]
sdata = run.get_string_data("notes")        # [{step, value}]

# Fetch configs
configs = run.get_configs()                 # [{path, type_tag, value, updated_at}]

run.get_metric_paths() -> list[str]

Returns list of metric path strings available on the remote.

run.get_metric_data(path, *, step_min=None, step_max=None, max_points=None) -> dict

Returns dict with keys: path, downsampled, raw_points (list of {step, y}) or buckets (when downsampled).

run.get_string_paths() -> list[str]

Returns list of string series path strings available on the remote.

run.get_string_data(path, *, limit=None, offset=None) -> list[dict]

Returns list of dicts with keys step and value.

run.get_configs() -> list[dict]

Returns list of dicts with keys path, type_tag, value, updated_at.

Environment Variables

Variable           Default       Description
GOODSEED_HOME      ~/.goodseed   Base directory for all data
GOODSEED_PROJECT   default       Default project name (workspace/project format for cloud storage)
GOODSEED_RUN_ID    (unset)       Default run ID (overridden by run_id argument)
GOODSEED_API_KEY   (unset)       API key for cloud storage
GOODSEED_STORAGE   cloud         Storage mode: disabled, local, or cloud

CLI Reference

goodseed serve [dir] [--port PORT]

Start the local HTTP server. dir defaults to ~/.goodseed/projects, port defaults to 8765.

goodseed list [dir] [--project NAME]

List projects, or runs within a specific project when --project / -p is provided.

goodseed upload -p <project> [--run-id RUN_ID] [--api-key KEY]

Upload unuploaded data from local run databases to the remote API. When --run-id is provided, uploads only that run; when omitted, uploads all runs in the project. Runs synchronously in the foreground. Useful for uploading data from runs where storage="local" was used, or where sync was interrupted.
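For example, to push a single run recorded with storage="local" (project and run names are placeholders):

goodseed upload -p myproject --run-id bold-falcon --api-key gsk_...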

File Layout

goodseed/
  src/goodseed/
    __init__.py       # exports Run, GitRef, Storage
    run.py            # Run class
    storage.py        # LocalStorage (SQLite read/write)
    server.py         # HTTP server and read-only query functions
    config.py         # Environment config and path helpers
    utils.py          # Name generation, serialization, flattening
    cli.py            # CLI entry point
    sync.py           # Background sync thread and upload_run()
    _sync_legacy.py   # Legacy Supabase sync (not used in current workflow)
    monitoring/
      __init__.py
      daemon.py             # MonitoringDaemon base class (background thread)
      console_capture.py    # StreamWithMemory, ConsoleCaptureDaemon
      hardware.py           # HardwareMonitorDaemon (CPU, memory, NVIDIA/AMD GPU)
  tests/
    conftest.py             # disables monitoring by default in tests
    test_storage.py
    test_run.py
    test_utils.py
    test_integration.py     # full workflow + HTTP server + monitoring tests
    test_cli.py
  examples/
    mlp.py                  # PyTorch MLP on synthetic data (requires torch, sklearn)