Skip to content

feat: collapse root action.yml into single node24 bundle (dist/main.js)#182

Merged
derekmisler merged 10 commits intodocker:mainfrom
docker-agent:feat/node24-main-bundle
May 7, 2026
Merged

feat: collapse root action.yml into single node24 bundle (dist/main.js)#182
derekmisler merged 10 commits intodocker:mainfrom
docker-agent:feat/node24-main-bundle

Conversation

@docker-agent
Copy link
Copy Markdown
Contributor

Summary

Replaces the 872-line composite action.yml (bash steps) with a using: node24 action backed by a new TypeScript bundle dist/main.js. All 24 inputs and 10 outputs are preserved verbatim — this is a zero-breaking-change refactor.

Motivation

The composite-action-with-bash-steps model makes the logic hard to test, hard to type-check, and impossible to cover with unit tests. Moving to a single Node.js bundle enables:

  • Full Vitest unit coverage of every auth/exec/output path
  • Static type-checking via tsc --noEmit
  • Simpler debugging (single JS bundle, real stack traces)
  • No external bash dependencies at runtime

Changes

action.yml

Rewritten from 872 lines of composite bash steps to 5 lines:

runs:
  using: node24
  main: dist/main.js

All 24 inputs and 10 outputs are unchanged (public contract preserved).

New modules in src/main/

File Description
index.ts Orchestrates all original steps in order
auth.ts 4-tier authorization waterfall (skip-auth → trusted-bot bypass → org-membership → author_association)
binary.ts Downloads docker-agent + mcp-gateway via @actions/tool-cache with caching
exec.ts Spawns docker-agent via child_process with retry loop, timeout (SIGTERM/SIGKILL), stdin prompt
outputs.ts TypeScript port of awk state-machine: strips tool-call noise, extracts docker-agent-output blocks
artifact.ts Uploads verbose log via @actions/artifact DefaultArtifactClient
summary.ts Writes GitHub Actions job summary via core.summary

Test coverage

  • src/main/__tests__/outputs.test.ts — 28 tests for the awk filter + docker-agent-output extractor
  • src/main/__tests__/auth.test.ts — 15 tests for the 4-tier auth waterfall (mocked Octokit/org-membership)

New deps

  • @actions/exec@3 — binary verification
  • @actions/tool-cache@4 — download + cache docker-agent binary
  • @actions/artifact@6 — upload verbose log artifact

Security properties preserved

  • All API keys masked with core.setSecret() before any exec
  • Keys in env, never argv (no eval anywhere)
  • sanitizeInput / sanitizeOutput / checkAuth imported from existing src/security/ modules (unmodified)
  • 4-tier auth waterfall is a faithful port of the original bash logic, including the trusted-bot PAT bypass

Test results

Test Files  12 passed (12)
Tests       143 passed (143)

pnpm lint (biome + tsc + actionlint) passes cleanly.

Caveats / deferred items

  • No post: step — deferred per plan. Cleanup runs inline at end of main step.
  • dist/ not committed — gitignored for PRs per AGENTS.md; release.yml commits it for tagged releases.
  • binary.ts uses tool-cache for download caching (replaces actions/cache@v4 binary caching).

Refs: migration plan artifact 15eb0c67, inventory artifact 8981dc56

@docker-agent docker-agent requested a review from a team as a code owner May 6, 2026 19:15
@docker-agent docker-agent requested a review from derekmisler May 6, 2026 19:15
@docker-agent
Copy link
Copy Markdown
Contributor

docker-agent Bot commented May 6, 2026

PR Review Failed — The review agent encountered an error and could not complete the review. View logs.

Copy link
Copy Markdown
Contributor Author

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🔴 CRITICAL

1 high-severity security regression + 2 notable + 2 minor issues found in the action.yml → node24 port. See inline comments for details.

Comment thread src/main/index.ts Outdated
Comment thread src/main/binary.ts Outdated
Comment thread src/main/auth.ts
core.error(`❌ Authorization failed: @${commentUserLogin} is not a ${authOrg} org member`);
return { authorized: false, outcome: 'denied' };
}
} catch (err: unknown) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] Tier 3 org membership check: any exception results in hard deny with no Tier 4 fallback

When orgMembershipToken and authOrg are configured, any exception (network timeout, invalid token, rate limit) in the checkOrgMembership call returns { authorized: false, outcome: 'denied' } and does not fall through to Tier 4 (author_association check):

} catch (err: unknown) {
  core.error(`Org membership check failed: ${(err as Error).message}`);
  return { authorized: false, outcome: 'denied' };  // hard deny
}

This means a valid org member can be silently denied due to a transient API failure, with no recovery path and no retry. The original bash had the same behavior (exit on API failure), but this PR is a good opportunity to improve availability by falling through to Tier 4 on non-fatal errors, or at minimum adding a configurable retry.

Specific concern: checkOrgMembership makes an HTTP call to https://api.github.com/orgs/${org}/members/${user}. A 502 or 503 from the GitHub API will throw, and a legitimate contributor's PR will fail auth. Today's GitHub Actions ecosystem sees periodic API degradations — this is a real-world risk.

Suggested fix: On transient HTTP errors (5xx, network timeout), fall through to Tier 4 rather than hard-denying, or throw so the caller can add the step-level continue-on-error behaviour:

} catch (err: unknown) {
  core.warning(`Org membership check failed, falling through to author_association: ${(err as Error).message}`);
  // fall through to Tier 4
}

Comment thread src/main/outputs.ts Outdated
Comment thread src/main/index.ts Outdated
@derekmisler
Copy link
Copy Markdown
Contributor

/review

@docker-agent
Copy link
Copy Markdown
Contributor

docker-agent Bot commented May 6, 2026

PR Review Failed — The review agent encountered an error and could not complete the review. View logs.

Copy link
Copy Markdown
Contributor Author

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟡 NEEDS ATTENTION

All 5 findings from the previous review are correctly fixed. The new test suite (203 tests) is well-structured — but 3 of the 5 fixes lack regression guards that would catch a future revert. See inline comments for specific gaps and suggested additions.


// ── Wiring — sanitizeOutput runs before block extraction ─────────────────────

describe('security pipeline ordering (FIX 1)', () => {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] FIX 1 integration test doesn't catch the security ordering regression it was written to guard

The test 'sanitizeOutput scans full filtered output before block extraction narrows it' writes verbose log content with no leaked secret:

fsSync.appendFileSync(
  capturedVerboseLog,
  // ← No secret pattern here — sanitizeOutput is a no-op regardless of call order
  '--- Agent: root ---\nConversational text.\n```docker-agent-output\n## Result\n\nClean output.\n```\n',
);

Because there is no secret to detect, sanitizeOutput is a no-op in both the correct order and a regressed order. The test's final assertion:

expect(outputCalls['secrets-detected']).toBe('false');  // trivially true either way

is satisfied whether sanitizeOutput runs before or after extractDockerAgentOutputBlock.

What a meaningful regression test looks like: include a known secret pattern in the conversational text before the block (e.g., sk-ant-api03-XXXXXXXX), and assert secrets-detected === 'true'. Under the old (broken) order, that content is stripped by block extraction before sanitizeOutput sees it → secrets-detected === 'false' → test fails. Under the new (correct) order, sanitizeOutput scans the full filtered text → detects the key → secrets-detected === 'true' → test passes.

Without this, the highest-severity fix from the previous review has no regression guard.

expect(result.outcome).toBe('denied');
});

it('denies when org membership check throws', async () => {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] Tier 3 fallthrough test is vacuous — passes under both old (hard deny) and new (fallthrough) code

The test 'denies when org membership check throws' uses author_association: 'NONE':

it('denies when org membership check throws', async () => {
  await writePayload({
    comment: { author_association: 'NONE', user: { login: 'outsider' } },
  });
  mockCheckOrgMembership.mockRejectedValue(new Error('Token invalid'));
  // ...
  expect(result.authorized).toBe(false);
  expect(result.outcome).toBe('denied');
});

With NONE association, Tier 4 (checkAuth('NONE', ['OWNER', 'MEMBER', 'COLLABORATOR'])) also returns falseoutcome: 'denied'. So the test outcome is identical whether the catch hard-denies (old behavior) or falls through to Tier 4 (new behavior). Reverting the fix to a hard deny would not break this test.

The missing case that actually validates the fallthrough:

it('falls through to author_association when org check throws', async () => {
  await writePayload({
    comment: { author_association: 'OWNER', user: { login: 'repo-owner' } },
  });
  mockGetAuthenticated.mockResolvedValue({ data: { login: 'bot' } });
  mockCheckOrgMembership.mockRejectedValue(new Error('Network timeout'));

  const result = await checkAuthorization({
    ...opts,
    orgMembershipToken: 'org-token',
    authOrg: 'my-org',
  });
  // Old code: hard deny → authorized=false. New code: Tier 4 OWNER → authorized=true.
  expect(result.authorized).toBe(true);
  expect(result.outcome).toBe('author-association');
});

This case distinguishes the two behaviors and confirms the Tier 3 catch actually falls through.

expect(result).toContain('Actual answer');
});

it('strips --- Tool: blocks', () => {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] No test for the FIX 4 behavior: blank line in inTool state is dropped (not emitted)

The existing 'strips --- Tool: blocks' test terminates inTool state via a --- Agent: header, not a blank line:

it('strips --- Tool: blocks', () => {
  const input = '--- Tool: read_file ---\nsome tool internals\n--- Agent: root ---\nClean output\n';
  //                                                           ↑ exits inTool via header, not blank line
  const result = filterAgentOutput(input);
  expect(result).toContain('Clean output');
});

The fix (commit b7a5bb4) changed the blank-line exit from a fall-through (which would emit the blank line) to a continue (which drops it, matching awk next):

if (line.trim() === '') {
  state = 'normal';
  continue; // ← the fix. Before: fell through and blank line was emitted
}

No test exercises this path. Reverting continue back to a fall-through would emit an extra blank line after tool blocks, but the existing tests would not catch it.

Suggested addition:

it('drops blank line that terminates an inTool block (matches awk `next`)', () => {
  const input = [
    '--- Tool: bash ---',
    'some tool output',
    '',              // ← blank line exits inTool, should be dropped
    'Clean output',  // ← back in normal state
  ].join('\n');

  const result = filterAgentOutput(input);
  // Blank line must NOT appear — it was consumed by the inTool exit
  const lines = result.split('\n');
  // 'Clean output' should not be preceded by a blank line
  const cleanIdx = lines.findIndex(l => l === 'Clean output');
  expect(cleanIdx).toBeGreaterThan(-1);
  expect(lines[cleanIdx - 1]).not.toBe('');
});

expect(writtenData.toString()).toContain('Raw prompt');
});

it('returns TIMEOUT_EXIT_CODE (124) without retrying on timeout', async () => {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[LOW] Timeout test bypasses the real timer → SIGTERM path; child.kill is never called in the test

The test achieves exit code 124 by making the mock child emit close(124) immediately — bypassing the real timeout mechanism:

it('returns TIMEOUT_EXIT_CODE (124) without retrying on timeout', async () => {
  mockSpawn.mockReturnValue(makeMockChild(TIMEOUT_EXIT_CODE)); // child immediately emits close(124)
  const result = await runAgent(baseOpts({ timeout: 5, maxRetries: 3 }));
  expect(result.exitCode).toBe(TIMEOUT_EXIT_CODE);
  expect(mockSpawn).toHaveBeenCalledOnce(); // ✓ no retry on 124
  // child.kill was never called — timer never fired
});

The real timeout path (spawnAgent):

const timer = setTimeout(() => {
  timedOut = true;
  try { child.kill('SIGTERM'); } catch {}
  setTimeout(() => { try { child.kill('SIGKILL'); } catch {} }, 5000);
  resolve(TIMEOUT_EXIT_CODE);
}, opts.timeoutSeconds * 1000);

The test correctly validates the no-retry-on-124 invariant, but does not verify that:

  • child.kill('SIGTERM') is called when the timer fires
  • The timedOut flag gates the resolve path correctly

Suggested complement (uses makeMockChild's delayMs parameter, which already exists):

it('kills process with SIGTERM when timeout fires', async () => {
  const child = makeMockChild(0, 5000); // child would exit in 5s naturally
  mockSpawn.mockReturnValue(child);

  const result = await runAgent(baseOpts({ timeout: 0.05, maxRetries: 0 })); // 50ms timeout
  
  expect(result.exitCode).toBe(TIMEOUT_EXIT_CODE);
  expect(child.kill).toHaveBeenCalledWith('SIGTERM');
}, 2000);

Comment thread src/main/binary.ts Outdated
// ── 2. Remote @actions/cache hit (cross-run persistence) ───────────────
const tmpBinDir = await fs.promises.mkdtemp(path.join(os.tmpdir(), 'docker-agent-'));
const cacheKey = `docker-agent-${toolName}-${version}-${platform}-${arch}`;
const restoredKey = await actionsCache.restoreCache([tmpBinDir], cacheKey);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[LOW] restoreCache is not wrapped in try/catch, creating asymmetric error handling vs saveCache

saveCache is wrapped and non-fatal (logs a warning and continues). restoreCache is not:

// restoreCache — any exception propagates and aborts binary setup:
const restoredKey = await actionsCache.restoreCache([tmpBinDir], cacheKey);

// saveCache — wrapped (non-fatal):
try {
  await actionsCache.saveCache([tmpBinDir], cacheKey);
} catch (err: unknown) {
  core.warning(`Failed to save docker-agent to remote cache: ${(err as Error).message}`);
}

@actions/cache typically returns undefined (cache miss) rather than throwing when the cache service is unavailable. However, in certain conditions — permission errors, ACTIONS_CACHE_URL misconfiguration, or future API changes — it may throw. If it does, the exception propagates out of ensureDockerAgent, skipping the download fallback and failing the entire action.

The same pattern applies to ensureMcpGateway (line 164).

Suggested fix (consistent with saveCache's pattern):

let restoredKey: string | undefined;
try {
  restoredKey = await actionsCache.restoreCache([tmpBinDir], cacheKey);
} catch (err: unknown) {
  core.warning(`Remote cache restore failed (${(err as Error).message}); falling back to download`);
}

This ensures that a cache service blip never blocks a binary download.

Copy link
Copy Markdown
Contributor Author

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟢 APPROVE

All 10 findings across two review rounds (5 runtime bugs, 5 test-quality gaps) are fully addressed. Every fix is correct, and each regression guard is load-bearing — verified to fail on the un-fixed code.

Round 1 runtime fixes (b7a5bb4) — ✅ all verified:

  • FIX 1 · sanitizeOutput before block extraction — security ordering preserved
  • FIX 2 · Two-level cache: remote @actions/cache + local tool-cache
  • FIX 3 · Tier 3 org-membership catch falls through to Tier 4
  • FIX 4 · inTool blank-line exit uses continue (matches awk next)
  • FIX 5 · skip-summary uses core.getBooleanInput

Round 2 regression-guard fixes (5054132) — ✅ all verified load-bearing:

  • FIX A · Integration test includes real sk-ant-api03- key; asserts secrets-detected=true — fails if ordering is reverted
  • FIX B · Auth test uses OWNER + org throws → authorized=true expected — fails if Tier 3 hard-deny is restored
  • FIX C · Outputs test checks blank line is absent after inTool block — fails if continue reverts to fall-through
  • FIX D · makeMockChild.kill emits close; timeout test asserts child.kill('SIGTERM') called — exercises real timer path
  • FIX E · Both restoreCache calls (docker-agent + mcp-gateway) wrapped in try/catch — matching saveCache pattern

Replace the 872-line composite action with a `using: node24` action backed by a new TypeScript bundle `dist/main.js`.

Key changes:
- action.yml: rewritten to `using: node24, main: dist/main.js` — all 24 inputs and 10 outputs preserved verbatim
- src/main/index.ts: orchestrates all original steps (validation, auth, input sanitization, binary setup, agent execution, output processing, output sanitization, artifact upload, job summary)
- src/main/auth.ts: 4-tier authorization waterfall (skip-auth → trusted-bot bypass → org-membership → author_association)
- src/main/binary.ts: downloads docker-agent (and optionally mcp-gateway) via @actions/tool-cache with caching
- src/main/exec.ts: spawns docker-agent via child_process with retry loop, timeout (SIGTERM/SIGKILL), stdin prompt, API keys via env only (never argv)
- src/main/outputs.ts: TypeScript port of the awk state-machine that strips tool-call noise from verbose agent logs; extracts docker-agent-output fenced blocks
- src/main/artifact.ts: uploads verbose log via @actions/artifact DefaultArtifactClient
- src/main/summary.ts: writes GitHub Actions job summary via core.summary
- New deps: @actions/exec, @actions/tool-cache, @actions/artifact
- tsup.config.ts: add 'main' entry point
- All 143 unit tests pass; pnpm lint clean

Security properties preserved:
- All API keys masked with core.setSecret() before any exec
- Keys in env only, never argv (no eval)
- sanitizeInput / sanitizeOutput / checkAuth imported from existing src/security/ modules
- Authorization 4-tier waterfall identical to original

Assisted-By: docker-agent
FIX 1 (security): correct sanitizeOutput ordering in index.ts
- Split processAgentOutput into two separate phases
- Step 8: filterAgentOutput writes FULL filtered text to outputFile
- Step 9: sanitizeOutput scans the full filtered text (before block extraction)
- Step 9b: extractDockerAgentOutputBlock runs AFTER sanitize; skipped on leak
- Leaked keys in conversational text are now caught even when a clean block follows

FIX 2 (performance): add @actions/cache remote persistence to binary.ts
- Add @actions/cache@6 dependency
- restoreCache() before download attempt; saveCache() after successful download
- Applies to both docker-agent and mcp-gateway binaries
- tc.find/cacheDir kept for local in-process hits; @actions/cache for cross-run
- saveCache failures are non-fatal (read-only in forked PRs)

FIX 3 (auth): Tier 3 exception falls through to Tier 4 instead of hard-denying
- On network error / 5xx in checkOrgMembership, emit core.warning and fall through
- Tier 4 (author_association) now runs when Tier 3 throws, not just when skipped
- Valid contributors with OWNER/MEMBER/COLLABORATOR association are no longer blocked

FIX 4 (outputs): inTool blank-line exit now continues (drops line)
- Split the combined if-condition into two separate branches
- Blank-line path: state=normal + continue (matches awk `next`, drops the line)
- Header line path: state=normal + fall-through (re-evaluates the header line)

FIX 5 (robustness): replace case-sensitive skip-summary check
- Remove unused _skipSummary variable from try block
- Use core.getBooleanInput('skip-summary') in finally block (handles 'True'/'TRUE'/'1')

Assisted-By: docker-agent
New test files:
- src/main/__tests__/artifact.test.ts  (8 tests)  — makeArtifactName + uploadVerboseLog
- src/main/__tests__/binary.test.ts    (9 tests)  — detectPlatform, local/remote-cache hits, full download, mcp-gateway
- src/main/__tests__/exec.test.ts      (17 tests) — buildArgs (pure) + runAgent with mocked spawn
- src/main/__tests__/summary.test.ts   (12 tests) — writeJobSummary status lines, table, output section
- src/main/__tests__/main.integration.test.ts (13 tests) — full run() pipeline via mocked externals

Also export 'run' from index.ts and guard auto-invocation with process.env.VITEST so
the integration test can import and call run() without triggering action side-effects.

Totals: 189 unit tests (16 files), 14 integration tests (1 file) — all passing.

Assisted-By: docker-agent
pnpm format + biome check --unsafe applied to:
- src/main/__tests__/artifact.test.ts
- src/main/__tests__/binary.test.ts
- src/main/__tests__/exec.test.ts
- src/main/__tests__/main.integration.test.ts

Changes: line wrapping (100-col), import sort order, useLiteralKeys,
removed unused import (fsSync in summary.test.ts was already clean).

Assisted-By: docker-agent
FIX A (integration test — security pipeline ordering):
  Replace vacuous conversational-only content with a real Anthropic API key
  (matching /sk-ant-[a-zA-Z0-9_-]{30,}/) in the verbose log before a clean
  docker-agent-output block. Test now asserts secrets-detected=true and
  security-blocked=true, which ONLY passes under the correct order
  (filter→sanitize→extract). Also asserts outputFile retains the leaked
  key (Step 9b is skipped on leak detection).

FIX B (auth.test.ts — Tier 3 fallthrough is distinguishable):
  Add 'falls through to author_association when org check throws (FIX 3)'.
  Uses OWNER association so old hard-deny code would return denied but
  new fallthrough code reaches Tier 4 and returns authorized=true,
  outcome='author-association'. This test fails if the FIX 3 catch block
  is reverted to return { authorized: false }.

FIX C (outputs.test.ts — blank-line inTool exit):
  Add 'drops blank line that terminates an inTool block (matches awk next)'.
  Checks that the line immediately before 'Clean output' is not blank,
  which fails if the continue is reverted to fall-through.

FIX D (exec.test.ts — SIGTERM kill path):
  Add 'kills process with SIGTERM when timeout fires'. Uses a 50ms action
  timeout with a child that would naturally exit in 5s. Updated makeMockChild
  so kill() emits close via setImmediate (simulates OS killing the process),
  allowing the timer path to complete within the 2s test timeout.
  Asserts child.kill was called with 'SIGTERM'.

FIX E (binary.ts — restoreCache non-fatal):
  Wrap both restoreCache calls (docker-agent and mcp-gateway) in try/catch
  with core.warning + fallthrough to download, matching the existing
  saveCache non-fatal pattern. A network error in restoreCache no longer
  fails the action.

Assisted-By: docker-agent
@docker-agent docker-agent force-pushed the feat/node24-main-bundle branch from d91e92a to fd4a5e2 Compare May 7, 2026 14:12
The test was hardcoding 'v1.54.0' in two places: the fake binary
content and the cagent-version output assertion. When upstream bumped
DOCKER_AGENT_VERSION to v1.56.0 (via the rebase), the assertion broke.

Fix: read DOCKER_AGENT_VERSION from the canonical file at the start of
the test module so both the fake binary echo and the assertion always
reflect the pinned version. This makes the test resilient to future
version bumps without requiring manual test updates.

Assisted-By: docker-agent
Copy link
Copy Markdown
Contributor Author

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟡 NEEDS ATTENTION

Round 3 review — post-rebase sanity check

This PR was reviewed after rebasing onto 6 upstream commits (v1.54.0→v1.56.0 bump, inline mention-reply fixes, self-review local action paths).

Rebase interaction analysis: Clean — the 6 upstream commits modified src/mention-reply/, .github/actions/mention-reply/, and review-pr/ (none of which this PR touches). No file conflicts, no behavioral interactions. The version bump is automatically picked up via readDockerAgentVersion().

Overall: The TypeScript port is sound. All security primitives are correct (secrets masked before exec, sanitizeOutput always runs in finally, prompt injection detection preserved). Two findings surfaced:

  1. 🔴 [HIGH] Auth catch swallows HTTP 401 → fails open on revoked tokensrc/main/auth.ts line 129
  2. 🟡 [MEDIUM] Retry output accumulationsrc/main/exec.ts line 296

See inline comments for details and suggested fixes.

Comment thread src/main/auth.ts
// rather than hard-denying a valid contributor.
core.warning(
`Org membership check failed (${(err as Error).message}); falling back to author_association`,
);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[HIGH] Auth Tier 3 catch falls through to Tier 4 on ALL exceptions — fails open when token is revoked

The catch block here silences all exceptions from checkOrgMembership (network errors, 5xx, and HTTP 401) and falls through to the weaker author_association Tier 4 check.

checkOrgMembership throws on HTTP 401 with message "Org membership token is missing or invalid". When the org-membership-token is revoked or compromised, the original bash hard-denied (exit 1). The new code instead falls through to Tier 4, allowing a user with author_association of OWNER/MEMBER/COLLABORATOR to be authorized via the weaker fallback — even though the primary membership gate has failed.

The comment says "Network error / 5xx" but the catch applies to 401 as well.

Suggested fix: rethrow on 401 (token invalid) — only fall through for transient errors:

} catch (err: unknown) {
  const status = (err as { status?: number }).status;
  if (status === 401) {
    // Token is invalid/revoked — hard deny, don't fall through
    core.error(`Org membership token is invalid (HTTP 401): ${(err as Error).message}`);
    return { authorized: false, outcome: 'denied' };
  }
  // Network / 5xx: warn and fall through to Tier 4
  core.warning(`Org membership check failed (${(err as Error).message}); falling back to author_association`);
}

Rebase note: This is not introduced by the rebase — it's present in the PR's own code and was intentionally labeled "FIX 3" in tests. Worth revisiting whether 401 should also fail-closed per the project's security-first design.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit 764558f. The catch block now inspects err.status:

  • HTTP 401 (revoked/invalid token) → core.error + hard-deny ({ authorized: false, outcome: 'denied' }). Does not fall through to the weaker Tier 4 author_association check.
  • Non-401 (network error, 5xx, timeout) → core.warning + fall through to Tier 4 as before.

Two new/updated tests in src/main/__tests__/auth.test.ts:

  1. hard-denies when org membership token returns HTTP 401 (does not fall through to Tier 4) — uses OWNER association, which Tier 4 would authorize; the denied outcome proves the hard-deny fired instead of falling through.
  2. falls through to author_association when org check throws a non-401 error — explicitly uses a plain Error with no .status + OWNER association, confirming the fallthrough path still works.

Comment thread src/main/exec.ts
'',
].join(os.EOL);
fs.appendFileSync(opts.verboseLogFile, separator, 'utf-8');
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] Retry loop appends to verbose log without resetting output state — all retry attempts' content ends up in the output file

The verbose log is intentionally cumulative (with separators), which is correct. But the clean output file in the original bash was truncated (> "$OUTPUT_FILE") before each retry attempt, ensuring downstream consumers only ever saw the last attempt's output.

In the new code, filterAgentOutput is called on the full verbose log after all retries complete in index.ts:

const rawVerbose = fs.readFileSync(verboseLogFile, 'utf-8');
const filteredOutput = filterAgentOutput(rawVerbose);  // ALL attempts
fs.writeFileSync(outputFile, filteredOutput, 'utf-8');

The extractDockerAgentOutputBlock extractor finds the first ```docker-agent-output fence and extracts until the first closing ```. If a failed attempt 1 emits a partial ```docker-agent-output block without a closing fence, attempt 2's closing ``` will terminate it — producing truncated/incorrect extracted output.

For most transient retry scenarios (API timeouts, transient errors) the risk is low since the failing agent is unlikely to emit a partial structured block. However the behavioral change from the original is real.

Suggested fix: After all retries in runAgent or before calling filterAgentOutput in index.ts, trim the verbose log content to only the final attempt's section (after the last RETRY ATTEMPT separator):

// In index.ts, before filterAgentOutput:
const rawVerbose = fs.readFileSync(verboseLogFile, 'utf-8');
const lastAttemptMarker = /^={10,} RETRY ATTEMPT \d+/m;
const parts = rawVerbose.split(lastAttemptMarker);
const lastAttemptContent = parts[parts.length - 1];
const filteredOutput = filterAgentOutput(lastAttemptContent);

Or pass outputFile path to runAgent so it can be reset per-retry (matching the original bash behavior).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit 764558f. Before calling filterAgentOutput, the verbose log is now split on the retry-attempt marker (/^={10,} RETRY ATTEMPT \d+/m) and only the last section is filtered:

const lastAttemptMarker = /^={10,} RETRY ATTEMPT \d+/m;
const parts = rawVerbose.split(lastAttemptMarker);
const lastAttemptContent = parts[parts.length - 1];
const filteredOutput = filterAgentOutput(lastAttemptContent);

When there are no retries parts.length === 1 and parts[0] is the full log — correct behavior preserved.

Two new tests added to src/main/__tests__/main.integration.test.ts (describe block retry output trimming (FIX 2)):

  1. Writes a verbose log with a partial first-attempt docker-agent-output block followed by a retry marker and the correct second-attempt block; asserts the output file contains only the second-attempt content.
  2. Writes a verbose log with no retry markers; asserts the full content passes through unchanged.

Copy link
Copy Markdown
Contributor Author

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🔴 CRITICAL — FIX 1 is incomplete; PR is not ready to merge

Round 4 review — verifying commit 764558f

FIX 2 (retry output accumulation) ✅ — Correct and complete. The lastAttemptMarker regex split before filterAgentOutput mirrors the original bash > "$OUTPUT_FILE" truncation. No-retry case (single part) and multi-retry case both handled correctly. Integration tests cover both paths.

FIX 1 (auth 401 hard-deny) ❌ — The guard fires in unit tests but will never fire in production. The .status property is stripped when checkOrgMembership converts the Octokit RequestError (which has .status = 401) into a plain new Error('...HTTP 401...') before rethrowing. By the time auth.ts catches it, (err as {status?:number}).status is undefined, not 401. The unit test passes only because it mocks checkOrgMembership directly with an Object.assign(new Error(), {status:401}) error, which has a shape the real implementation never produces.

The original vulnerability is still present: on a revoked/invalid org-membership-token, the action silently falls through to Tier 4 (author_association) rather than hard-denying.

See inline comments for the exact call chain and two concrete fix options.

Comment thread src/main/auth.ts
}
} catch (err: unknown) {
const status = (err as { status?: number }).status;
if (status === 401) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[HIGH] FIX 1 is incomplete — .status is stripped before this guard is reached

This guard will never trigger in production. Here's the full call chain:

  1. GitHub API → throws Octokit.RequestError with .status = 401
  2. checkOrgMembership (in src/check-org-membership/index.ts line 32–37) catches it, then rethrows as a plain new Error('...HTTP 401...') — losing .status entirely:
    if (status === 401) {
      throw new Error('Org membership token is missing or invalid (HTTP 401). ...');
      // ↑ plain Error — .status is undefined on this object
    }
  3. Here in auth.ts(err as { status?: number }).status evaluates to undefined, so status === 401 is always false. The code falls through to Tier 4 exactly as before.

The unit test at auth.test.ts line 270 passes only because it mocks checkOrgMembership directly with Object.assign(new Error(), { status: 401 }), bypassing the real conversion. That error shape is never produced by the actual checkOrgMembership implementation.

Two correct fixes — pick one:

Option A — message-based guard (no changes to other files):

} catch (err: unknown) {
  const status = (err as { status?: number }).status;
  const msg = (err as Error).message ?? '';
  if (status === 401 || msg.includes('HTTP 401')) {
    core.error(`Org membership token is invalid: ${msg}`);
    return { authorized: false, outcome: 'denied' };
  }
  // Network / 5xx: fall through to Tier 4
  core.warning(`Org membership check failed (${msg}); falling back to author_association`);
}

Option B — preserve .status in check-org-membership/index.ts:

if (status === 401) {
  throw Object.assign(
    new Error('Org membership token is missing or invalid (HTTP 401). ...'),
    { status: 401 }
  );
}

Then update the test mock to match: mockRejectedValue(Object.assign(new Error('...'), { status: 401 })) — which it already does, so the test remains valid.

Option B is cleaner (the error carries its own status), but Option A works too. Either way the test at auth.test.ts:270 is testing the right behavior — just with an error shape that doesn't match what the real code produces.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 95052ef. Used Object.assign(new Error(...), { status: 401 }) so the thrown error preserves .status — matching exactly the shape auth.ts Tier 3 guards on. Also updated the 401 test in src/check-org-membership/__tests__/check-org-membership.test.ts to assert .status === 401 on the rethrown error, not just the message. All 221 unit tests green.

});
mockGetAuthenticated.mockResolvedValue({ data: { login: 'bot' } });
const err401 = Object.assign(new Error('Unauthorized'), { status: 401 });
mockCheckOrgMembership.mockRejectedValue(err401);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[HIGH] Test passes but validates the wrong error shape

const err401 = Object.assign(new Error('Unauthorized'), { status: 401 });
mockCheckOrgMembership.mockRejectedValue(err401);

This mock constructs an error with .status = 401, which makes auth.ts's status === 401 guard fire. But the real checkOrgMembership on an HTTP 401 throws new Error('Org membership token is missing or invalid (HTTP 401)...') — a plain Error with no .status property.

The test correctly covers the intended behavior (401 → hard deny), but because the mock bypasses checkOrgMembership's internal conversion, it doesn't catch the gap between the expected and actual error shapes.

Fix: Once check-org-membership/index.ts is updated to preserve .status (Option B above), this mock is already correct and no test changes are needed. If using the message-based Option A instead, the mock can stay as-is — the behavior test remains valid either way.

Copy link
Copy Markdown
Contributor Author

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟢 APPROVE

Round 5 review — verifying commit 95052ef

FIX 1 is now complete and correct. All prior findings are resolved. The PR is ready to merge.


FIX 1 (auth 401 hard-deny) ✅ — Root cause addressed in the right place.

check-org-membership/index.ts now rethrows the 401 error with .status preserved:

throw Object.assign(
  new Error('Org membership token is missing or invalid (HTTP 401). ...'),
  { status: 401 }   // ← .status preserved through the rethrow
);

The full call chain now works as intended:

  • GitHub API → Octokit.RequestError with .status = 401
  • checkOrgMembership catches, rethrows with .status = 401 preserved ✅
  • auth.ts catch → status === 401 is true → hard deny, no fallthrough to Tier 4 ✅

The test in check-org-membership/__tests__/ now asserts .status === 401 on the rethrown error (in addition to the message check), closing the gap that let the previous attempt pass tests while failing in production. The shape of the mock in auth.test.ts (Object.assign(new Error(), { status: 401 })) now correctly matches what checkOrgMembership actually throws.

FIX 2 (retry output accumulation) ✅ — Verified correct in round 4, unchanged.

All CI checks pass (221 unit + 16 integration). No new issues introduced.

@derekmisler derekmisler enabled auto-merge (squash) May 7, 2026 15:15
@derekmisler derekmisler merged commit f7b4181 into docker:main May 7, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants