findingFromOutput signature is not stable across re-reviews: JSON.stringify(evidence) is key-order-dependent

## Summary

`findingFromOutput` (`src/findings.ts:49`) computes the dedup signature by passing the LLM-returned `evidence` array through `JSON.stringify` directly. JSON property order is not guaranteed across runs, so the same logical finding can produce different `signature` hashes and therefore different `findingId`s on re-review. The intended dedup-on-`mergeFinding` lookup (`src/app.ts:702` → `readFinding(paths, finding.findingId)`) silently misses, and a second copy of the finding is written to `.clawpatch/findings/`.

## Repro

```bash
node -e "
const crypto = require('crypto');
const stableId = (prefix, parts) => {
  const hash = crypto.createHash('sha256').update(parts.join('\0')).digest('hex').slice(0, 10);
  return prefix + '_feat_' + hash;
};

// Same evidence, three plausible orderings from an LLM:
const e1 = [{path:'a.ts',startLine:1,endLine:5,symbol:null,quote:null}];
const e2 = [{startLine:1,path:'a.ts',symbol:null,quote:null,endLine:5}];
const e3 = [{path:'a.ts',endLine:5,startLine:1,symbol:null,quote:null}];

for (const e of [e1, e2, e3]) {
  console.log(stableId('sig', ['feat','bug','title', JSON.stringify(e)]));
}
"
```

Output:
```
sig_feat_ffd1e3d0d8
sig_feat_1693cfae2c
sig_feat_160b4f4aaf
```

Three IDs for the same evidence.

## Why it matters

`clawpatch review` is meant to be re-runnable. Documented quick-start:

```
clawpatch review --limit 10
# (later)
clawpatch review --limit 10
```

If the LLM returns evidence with a slightly different key order on the second run (very common — LLMs are not deterministic about JSON key emission across separate calls, even at temperature 0), the second review writes a NEW finding file for an already-known issue. Over time, `.clawpatch/findings/` accumulates duplicates of the same logical finding, the report grows misleading, and `clawpatch fix --finding <id>` can target a stale duplicate.

Confirmed against `clawpatch@0.3.0`, current `main` HEAD `6fdd8fa`. `src/findings.ts` and `src/id.ts` have not been touched since initial commit.

## Suggested fix

Canonicalize the evidence stringification before hashing. Three viable approaches:

**(A) Sort keys at the leaf** — cheapest, narrowest scope:

```ts
// src/findings.ts
function canonicalEvidence(evidence: ReviewOutput["findings"][number]["evidence"]): string {
  return JSON.stringify(
    evidence.map((e) => ({
      path: e.path,
      startLine: e.startLine,
      endLine: e.endLine,
      symbol: e.symbol,
      quote: e.quote,
    }))
  );
}

// ...
const signature = stableId("sig", [
  featureId,
  finding.category,
  finding.title,
  canonicalEvidence(finding.evidence),
]);
```

**(B) Generic canonical-stringify** — slightly more general, useful if other stableId callers grow:

```ts
function canonicalStringify(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(canonicalStringify).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const entries = Object.entries(value as object).toSorted(([a], [b]) => a.localeCompare(b));
    return `{${entries.map(([k, v]) => `${JSON.stringify(k)}:${canonicalStringify(v)}`).join(",")}}`;
  }
  return JSON.stringify(value);
}
```

**(C) Switch signature input to a strictly typed shape** — derive a `path:start-end` joined key string from each evidence entry. Loses some fidelity but is the most stable across LLM whim.

Option A is what I'd ship. It matches the existing `evidenceRefSchema` field set explicitly so a Zod schema change ripples cleanly into the signature.

## Regression test

`src/findings.ts` has no test coverage today. Suggested addition (`src/findings.test.ts`):

```ts
import { describe, it, expect } from "vitest";
import { findingFromOutput } from "./findings.js";

describe("findingFromOutput signature stability", () => {
  it("produces the same signature for evidence with different key order", () => {
    const base = {
      title: "Bug X",
      category: "bug" as const,
      severity: "medium" as const,
      confidence: "high" as const,
      reasoning: "r",
      reproduction: null,
      recommendation: "fix it",
      whyTestsDoNotAlreadyCoverThis: "",
      suggestedRegressionTest: null,
      minimumFixScope: "",
    };
    const e1 = [{ path: "a.ts", startLine: 1, endLine: 5, symbol: null, quote: null }];
    const e2 = [{ endLine: 5, path: "a.ts", symbol: null, quote: null, startLine: 1 }];
    const f1 = findingFromOutput({ ...base, evidence: e1 }, "feat-1", "run-1");
    const f2 = findingFromOutput({ ...base, evidence: e2 }, "feat-1", "run-2");
    expect(f1.signature).toBe(f2.signature);
    expect(f1.findingId).toBe(f2.findingId);
  });
});
```

## Severity

Medium. Doesn't break correctness (each finding is still valid), but defeats the dedup design and quietly degrades the findings inventory over time. Triggered by every re-review against an unchanged feature, which is one of the documented happy-path workflows.

## Found while

Auditing clawpatch source after Derek (OpenClaw user) asked me to "check out current implementation for easy bug." Ran adversarially against `src/`, focused on JSON parsing, ID generation, shell quoting, and git ops. The signature instability is the cleanest small bug I found in this pass — small, exploitable in normal use, untested, fixable in <30 lines.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

findingFromOutput signature is not stable across re-reviews: JSON.stringify(evidence) is key-order-dependent #83

Summary

Repro

Why it matters

Suggested fix

Regression test

Severity

Found while

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

findingFromOutput signature is not stable across re-reviews: JSON.stringify(evidence) is key-order-dependent #83

Description

Summary

Repro

Why it matters

Suggested fix

Regression test

Severity

Found while

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions