feat(coder/modules/agentapi): add state persistence by mafredri · Pull Request #736 · coder/registry

mafredri · 2026-02-19T17:20:28Z

AgentAPI can save and restore conversation state across workspace restarts.
The base module exports env vars (AGENTAPI_STATE_FILE, AGENTAPI_SAVE_STATE,
AGENTAPI_LOAD_STATE, AGENTAPI_PID_FILE) that the binary reads directly.
No consumer module start scripts need changes.

New variables:

- enable_state_persistence (bool, default true)
- state_file_path (string, defaults to $HOME/<module_dir_name>/state.json)
- pid_file_path (string, defaults to $HOME/<module_dir_name>/agentapi.pid)
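
As a rough illustration of the defaults listed above, a start script could resolve the two paths and export the variables the binary reads. This is only a sketch: the `ARG_*` input names, the fallback value for `MODULE_DIR_NAME`, and the `"true"` values are assumptions, not the module's actual code.

```bash
#!/usr/bin/env bash

# Hypothetical inputs; the real module receives its values from Terraform.
MODULE_DIR_NAME="${MODULE_DIR_NAME:-.agentapi-module}"
ARG_ENABLE_STATE_PERSISTENCE="${ARG_ENABLE_STATE_PERSISTENCE:-true}"
ARG_STATE_FILE_PATH="${ARG_STATE_FILE_PATH:-}"
ARG_PID_FILE_PATH="${ARG_PID_FILE_PATH:-}"

module_path="$HOME/${MODULE_DIR_NAME}"

# Empty inputs fall back to the documented defaults under the module dir.
state_file="${ARG_STATE_FILE_PATH:-$module_path/state.json}"
pid_file="${ARG_PID_FILE_PATH:-$module_path/agentapi.pid}"

# Assumption: the PID file is exported whether or not persistence is enabled.
export AGENTAPI_PID_FILE="$pid_file"

if [ "$ARG_ENABLE_STATE_PERSISTENCE" = "true" ]; then
  export AGENTAPI_STATE_FILE="$state_file"
  # Assumption: the binary accepts "true" for these two switches.
  export AGENTAPI_SAVE_STATE="true"
  export AGENTAPI_LOAD_STATE="true"
fi
```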

State persistence requires agentapi >= v0.12.0. A shared version_at_least
function in scripts/lib.sh gates both the env var exports in main.sh and
SIGUSR1 in the shutdown script. Old binaries get a warning and graceful
skip instead of breakage.
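
For readers unfamiliar with this kind of gate, here is a minimal sketch of what a `version_at_least` helper could look like. The function name comes from the PR, but the argument order (minimum first, actual second) and the body are assumptions; the real implementation in scripts/lib.sh may differ.

```bash
#!/usr/bin/env bash

# version_at_least MIN ACTUAL
# Returns 0 when ACTUAL >= MIN, comparing major.minor.patch numerically.
# A leading "v" is ignored and missing components count as 0.
version_at_least() {
  local min="${1#v}" actual="${2#v}"
  local min_major min_minor min_patch act_major act_minor act_patch
  local a m

  IFS=. read -r min_major min_minor min_patch << EOF
$min
EOF
  IFS=. read -r act_major act_minor act_patch << EOF
$actual
EOF

  # Walk the (actual, minimum) pairs from most to least significant.
  set -- "$act_major" "$min_major" "$act_minor" "$min_minor" "$act_patch" "$min_patch"
  while [ "$#" -ge 2 ]; do
    # Keep only the leading digits, so "0-rc1" is compared as "0".
    a="${1%%[!0-9]*}"
    m="${2%%[!0-9]*}"
    if [ "${a:-0}" -gt "${m:-0}" ]; then return 0; fi
    if [ "${a:-0}" -lt "${m:-0}" ]; then return 1; fi
    shift 2
  done
  return 0
}

# Usage: warn and skip instead of failing on an older binary.
detected_version="0.11.2" # hypothetical value for illustration
if version_at_least 0.12.0 "$detected_version"; then
  echo "State persistence supported"
else
  echo "Warning: agentapi $detected_version is older than v0.12.0, skipping state persistence"
fi
```

A numeric comparison matters here because a plain string comparison would rank 0.9.10 above 0.12.0.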

Shutdown script now performs a three-phase shutdown:

1. SIGUSR1 to trigger state save (gated on version + persistence enabled)
2. Log snapshot capture (existing behavior, now fault-tolerant via subshell)
3. SIGTERM for graceful termination with wait loop
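
The three phases above can be sketched roughly as follows. This is not the module's shutdown script: `shutdown_process` and `capture_snapshot` are hypothetical names, and the snapshot step is a placeholder.

```bash
#!/usr/bin/env bash

# Hypothetical placeholder for the log snapshot step.
capture_snapshot() {
  return 0
}

# shutdown_process PID SAVE_STATE [TIMEOUT_SECONDS]
shutdown_process() {
  local pid="$1" save_state="$2" timeout="${3:-5}"
  local waited=0

  # Phase 1: ask the process to save state, only when the gate allows it.
  if [ "$save_state" = "true" ]; then
    kill -USR1 "$pid" 2> /dev/null || true
    # Give the save a moment to start before moving on.
    sleep 1
  fi

  # Phase 2: run the snapshot in a subshell so a failure cannot abort shutdown.
  (capture_snapshot) || echo "Log snapshot capture failed, continuing shutdown"

  # Phase 3: request graceful termination, then poll until the process exits.
  kill -TERM "$pid" 2> /dev/null || true
  while kill -0 "$pid" 2> /dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "Warning: process (pid $pid) still running after ${timeout}s"
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

Calling `shutdown_process "$pid" true` would run all three phases; passing `false` skips the SIGUSR1 step, which is what an old binary or disabled persistence would need.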

Also bumps agentapi module version to 2.2.0 and claude-code to 4.7.6.

Closes coder/internal#1257
Refs coder/internal#1256
Refs #696

johnstcn · 2026-02-19T17:26:18Z

registry/coder/modules/agentapi/scripts/agentapi-shutdown.sh

+    log "Sending SIGUSR1 to AgentAPI (pid $agentapi_pid) to save state"
+    kill -USR1 "$agentapi_pid" || true
+    # Allow time for state save to complete before proceeding.
+    sleep 1


If we know the path at which the state file should be written, we can sleep a bit longer and validate that it was written.

It could be an old one, so we'd have to check the timestamp.

But really, we don't need to confirm here that it was written, since the TERM + wait covers that. The SIGUSR1 is more of an early trigger in case the workspace gets killed before we complete all the work here. The sleep is there to let the serialization start before we grab the messages via the API call below.
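
For reference, the timestamp check discussed in this thread (which the PR does not adopt) could look something like the sketch below. `state_saved_since` is a hypothetical helper; it compares the state file's modification time against a marker file created just before the signal is sent.

```bash
#!/usr/bin/env bash

# state_saved_since MARKER STATE_FILE
# Succeeds only if STATE_FILE exists and was modified after MARKER,
# which rules out a stale file left over from an earlier run.
state_saved_since() {
  local marker="$1" state_file="$2"
  [ -f "$state_file" ] && [ "$state_file" -nt "$marker" ]
}
```

A caller would `touch` the marker, send SIGUSR1, wait, and then call `state_saved_since "$marker" "$state_file"`. On filesystems with coarse timestamps, a save that lands in the same second as the marker may not register as newer.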

mafredri · 2026-02-20T10:48:13Z

registry/coder/modules/agentapi/scripts/agentapi-shutdown.sh

+
+# Source shared utilities (written by the coder_script wrapper).
+# shellcheck source=lib.sh
+source /tmp/agentapi-lib.sh


Review: Following existing pattern here but why don't we write all scripts to the provided module_dir?

@mafredri It honestly might be better to do this. As for why we don't, I am not entirely sure. I know that some modules do write logs and config files to the module dir.

It might be worth me going through and standardizing this across all of the modules.

mafredri · 2026-02-20T12:01:36Z

registry/coder/modules/agentapi/scripts/agentapi-shutdown.sh

+    agentapi_pid=$(cat "$PID_FILE_PATH" 2> /dev/null || echo "")
+  fi
+
+  # State persistence is only enabled when the binary supports it (>= v0.12.0).


Review: We'll have to confirm this ends up being the version (cc: @35C4n0r)

AgentAPI can save and restore conversation state across workspace restarts. The base module exports env vars (AGENTAPI_STATE_FILE, AGENTAPI_SAVE_STATE, AGENTAPI_LOAD_STATE, AGENTAPI_PID_FILE) that the binary reads directly. No consumer module start scripts need changes.

New variables:

- enable_state_persistence (bool, default true)
- state_file_path (string, defaults to $HOME/<module_dir_name>/state.json)
- pid_file_path (string, defaults to $HOME/<module_dir_name>/agentapi.pid)

State persistence requires agentapi >= v0.12.0. A shared version_at_least function in scripts/lib.sh gates both the env var exports in main.sh and SIGUSR1 in the shutdown script. The version is queried from the real binary (agentapi --version) rather than the Terraform variable, so it works correctly when install_agentapi is false.

Shutdown script now performs a three-phase shutdown:

1. SIGUSR1 to trigger state save (gated on version + persistence enabled)
2. Log snapshot capture (existing behavior, now fault-tolerant via subshell)
3. SIGTERM for graceful termination with wait loop

Also bumps agentapi module version to 2.2.0.

Refs: internal#1257, internal#1256, registry#696
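
Querying the installed binary rather than trusting a configured version could be sketched like this. The `agentapi --version` call is mentioned in the commit message, but its output format is not shown, so the sketch assumes the output contains a plain X.Y.Z somewhere; `extract_semver` and `detect_agentapi_version` are hypothetical helper names.

```bash
#!/usr/bin/env bash

# extract_semver TEXT
# Prints the first X.Y.Z sequence found in TEXT; fails if there is none.
extract_semver() {
  local text="$1" version
  version=$(printf '%s\n' "$text" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
  [ -n "$version" ] && printf '%s\n' "$version"
}

# detect_agentapi_version
# Asks whatever agentapi binary is on PATH, so the result is correct even
# when the module did not install the binary itself.
detect_agentapi_version() {
  command -v agentapi > /dev/null 2>&1 || return 1
  extract_semver "$(agentapi --version 2> /dev/null)"
}
```

If detection fails, the safe fallback is to treat the feature as unsupported and skip it.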

DevelopmentCats

LGTM aside from what has already been pointed out.

@35C4n0r You might have some better insight on this since agentapi.

35C4n0r

Code LGTM!
I'd test it once before approving.

35C4n0r · 2026-02-23T03:33:43Z

registry/coder/modules/agentapi/main.tf

+  default     = true
+}
+
+variable "state_file_path" {


For consistency, it'd be nice if we set the defaults for state_file_path and pid_file_path in the locals block below, rather than in main.sh.

I considered this, but we currently build the module path dynamically in the scripts (module_path="$HOME/${MODULE_DIR_NAME}"). To avoid inconsistencies or breakage in the future, we should have only one way to construct these paths: either in a script (maybe lib.sh?) or in Terraform. That might be a better fit for a refactor.

It's unfortunate that Terraform can't do runtime evaluation, and that we generally lack a "safe" way to expose these values to the shell script (quoting edge cases, etc.), which is something I'd like to improve one day.

No consumer modules ship agentapi >= v0.12.0 yet, so the feature was silently skipped at runtime while emitting a misleading warning on every workspace start. Modules can opt in explicitly when ready.

mafredri · 2026-02-24T12:29:38Z

FYI I flipped enable_state_persistence to default false. The feature would be silently skipped anyway while emitting a confusing warning on every start. I think it's better if modules opt in when they bump (as claude-code does in #749).

registry/coder/modules/agentapi/scripts/agentapi-shutdown.sh

registry/coder/modules/agentapi/scripts/lib.sh

registry/coder/modules/agentapi/README.md

johnstcn

LGTM, nice work!

The subshell EXIT trap references $tmpdir which is local to capture_task_log_snapshot. When the trap fires during subshell exit, the variable is out of scope, causing a nounset error after the snapshot posts successfully.
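
The failure mode described above can be avoided in a few ways; the sketch below shows one of them and is not necessarily the fix the commit applied. Here the trap reads a non-local variable, so it is still set when the trap fires after the function has returned. `capture_snapshot` and `snapshot_tmpdir` are hypothetical names.

```bash
#!/usr/bin/env bash
set -u

capture_snapshot() {
  # Deliberately not `local`: the EXIT trap runs after this function has
  # returned, when a local variable would already be out of scope and
  # `set -u` would report it as unbound.
  snapshot_tmpdir=$(mktemp -d)
  trap 'rm -rf -- "$snapshot_tmpdir"' EXIT

  echo '{"messages": []}' > "$snapshot_tmpdir/payload.json"
  # Print the directory so a caller can observe that cleanup happened.
  echo "$snapshot_tmpdir"
}

# Run in a subshell so the trap and the variable stay confined to it and a
# failure cannot abort the caller.
(capture_snapshot > /dev/null) || echo "Log snapshot capture failed, continuing shutdown"
```

An alternative is to expand the path when the trap is defined, so the trap command no longer refers to the variable at all.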

It seems that agentapi can block for longer than 5s sometimes:

```
Shutting down AgentAPI
Sending SIGUSR1 to AgentAPI (pid 44876) to save state
Fetching messages from AgentAPI on port 3284
curl: (28) Operation timed out after 5001 milliseconds with 0 bytes received
Error: Failed to fetch messages from AgentAPI (may not be running)
Error: Cannot capture log snapshot without messages
Log snapshot capture failed, continuing shutdown
Sending SIGTERM to AgentAPI (pid 44876)
Warning: AgentAPI (pid 44876) still running after 5s
Shutdown complete
```

johnstcn reviewed Feb 19, 2026

registry/coder/modules/agentapi/scripts/agentapi-shutdown.sh

mafredri force-pushed the agentapi-state-4nzp branch 3 times, most recently from 636bc8c to 955441d Compare February 20, 2026 11:50

mafredri changed the title ~~feat(registry/coder/modules/agentapi): add state persistence~~ feat(coder/modules/agentapi): add state persistence Feb 20, 2026

mafredri force-pushed the agentapi-state-4nzp branch 5 times, most recently from 001c89d to 8a32b14 Compare February 20, 2026 12:00

mafredri marked this pull request as ready for review February 20, 2026 12:00

mafredri commented Feb 20, 2026

mafredri force-pushed the agentapi-state-4nzp branch from 8a32b14 to 1435939 Compare February 20, 2026 12:16

mafredri requested review from 35C4n0r and DevelopmentCats February 20, 2026 12:27

mafredri mentioned this pull request Feb 20, 2026

Research: AI agent state persistence compatibility coder/internal#1329

Closed

6 tasks

DevelopmentCats approved these changes Feb 20, 2026

35C4n0r reviewed Feb 23, 2026

mafredri added 2 commits February 24, 2026 11:53

fix: change state file name

d0cbc01

agentapi: default enable_state_persistence to false

d871ce8

johnstcn reviewed Feb 24, 2026

registry/coder/modules/agentapi/scripts/agentapi-shutdown.sh

fix: change default in scripts too

389955d

johnstcn reviewed Feb 24, 2026

registry/coder/modules/agentapi/scripts/lib.sh

johnstcn reviewed Feb 24, 2026

registry/coder/modules/agentapi/README.md

johnstcn approved these changes Feb 24, 2026

mafredri and others added 5 commits February 24, 2026 12:40

agentapi: update README for default false state persistence

a817fa8

agentapi: clarify arithmetic exit status in version_at_least

e7233b3

agentapi: fix unbound tmpdir in shutdown EXIT trap

aa27b0b

Merge branch 'main' into agentapi-state-4nzp

ed51abe
