
fix(generation): stabilize prompt hashes across re-runs #62

Open
dmikushin wants to merge 1 commit into repowise-dev:main from dmikushin:fix/stabilize-prompt-hashes

Conversation

@dmikushin

Problem

Every repowise init re-generates all wiki pages from scratch, even when the codebase hasn't changed. The root cause is non-deterministic source_hash values: the SHA-256 is computed over the rendered Jinja2 prompt, and two context variables were unstable across runs.

Source of non-determinism 1: graph edge ordering

Files are parsed in parallel via ProcessPoolExecutor + as_completed, so the order in which nodes and edges are inserted into the NetworkX graph is non-deterministic. graph.predecessors() / graph.successors() return nodes in insertion order, so the dependents and dependencies lists in FilePageContext were shuffled between runs → different rendered prompt → different source_hash.

Fix: sort predecessors/successors before building FilePageContext in ContextAssembler.assemble_file_page.
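A minimal sketch of this fix (the function name stable_neighbors is illustrative; the real change lives inside ContextAssembler.assemble_file_page):

```python
import networkx as nx

def stable_neighbors(graph: nx.DiGraph, node: str) -> tuple[list[str], list[str]]:
    """Return (dependents, dependencies) in a deterministic order.

    predecessors()/successors() yield nodes in insertion order, which
    varies when files are parsed in parallel; sorting removes the
    dependence on graph construction order.
    """
    dependents = sorted(graph.predecessors(node))
    dependencies = sorted(graph.successors(node))
    return dependents, dependencies

# Same edges, inserted in different orders:
g1 = nx.DiGraph([("a.py", "b.py"), ("c.py", "b.py")])
g2 = nx.DiGraph([("c.py", "b.py"), ("a.py", "b.py")])
assert stable_neighbors(g1, "b.py") == stable_neighbors(g2, "b.py")
```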

Source of non-determinism 2: Louvain community IDs

nx.community.louvain_communities already receives seed=42, but the adjacency traversal order inside Louvain still depends on node insertion order (same root cause). Additionally, the community list returned by louvain_communities has no guaranteed order, so enumerate() assigned different integer IDs to the same community across runs.

Fix: before calling louvain_communities, rebuild a sorted copy of the undirected graph (g_stable) with nodes and edges added in alphabetical order. After the call, sort the returned community list by each community's lexicographically smallest member before enumerate().
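Sketched out, the two steps look roughly like this (stable_communities is a hypothetical name; only the g_stable rebuild, seed=42, and the sort-by-smallest-member step come from the description above):

```python
import networkx as nx

def stable_communities(g: nx.Graph) -> dict[str, int]:
    """Assign integer community IDs that do not depend on the order
    in which nodes/edges were originally inserted into g."""
    # Rebuild an undirected copy with nodes and edges added in sorted
    # order so Louvain's adjacency traversal is reproducible.
    g_stable = nx.Graph()
    g_stable.add_nodes_from(sorted(g.nodes()))
    g_stable.add_edges_from(sorted(tuple(sorted(e)) for e in g.edges()))

    communities = nx.community.louvain_communities(g_stable, seed=42)
    # louvain_communities gives no ordering guarantee, so sort by each
    # community's lexicographically smallest member before numbering.
    communities = sorted(communities, key=min)
    return {node: i for i, comm in enumerate(communities) for node in comm}
```

With the same seed and an identical g_stable, Louvain returns the same partition regardless of how the original graph was built.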

Impact

These two fixes make source_hash stable across re-runs for unchanged files, enabling the DB content cache (_db_content_cache keyed by source_hash) to skip redundant LLM calls and save API costs.
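The caching path this enables can be sketched as follows (generate_page and call_llm are hypothetical illustrations, not the project's API; only the _db_content_cache keyed by source_hash is taken from the description above):

```python
import hashlib

# Hypothetical cache: source_hash -> generated page content.
_db_content_cache: dict[str, str] = {}

def generate_page(rendered_prompt: str, call_llm) -> str:
    """With a stable hash, an unchanged file renders the same prompt
    on every run, hits the cache, and skips the LLM call entirely."""
    h = hashlib.sha256(rendered_prompt.encode("utf-8")).hexdigest()
    if h in _db_content_cache:
        return _db_content_cache[h]  # cache hit: no API cost
    content = _db_content_cache[h] = call_llm(rendered_prompt)
    return content
```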

Testing

Added scripts/diagnose_hash_mismatch.py — a diagnostic script that:

  • Calls betweenness_centrality() and community_detection() twice and reports any differing values
  • For each cached file_page in wiki.db, renders the prompt fresh and compares SHA-256 with the stored source_hash
  • Reports whether a mismatch is caused by dep_summaries or another factor, with a unified diff

Run from the target repo directory:

python3 scripts/diagnose_hash_mismatch.py /path/to/repo --max-pages 20
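As a standalone illustration of the second check (not the script itself, which reads wiki.db), the hash comparison plus unified diff can be done with the stdlib alone; hash_report is a hypothetical name:

```python
import difflib
import hashlib

def hash_report(stored_prompt: str, fresh_prompt: str) -> str:
    """Compare the SHA-256 of a freshly rendered prompt against the
    stored one; on mismatch, return a unified diff of the two prompts
    so the unstable context variable is visible."""
    stored = hashlib.sha256(stored_prompt.encode("utf-8")).hexdigest()
    fresh = hashlib.sha256(fresh_prompt.encode("utf-8")).hexdigest()
    if stored == fresh:
        return f"match {fresh[:12]}"
    diff = difflib.unified_diff(
        stored_prompt.splitlines(), fresh_prompt.splitlines(),
        fromfile="stored", tofile="fresh", lineterm="")
    return "MISMATCH\n" + "\n".join(diff)
```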

Checklist

  • context_assembler.py: sort predecessors/successors
  • graph.py: sorted graph copy + sorted community list before enumerate
  • scripts/diagnose_hash_mismatch.py: diagnostic tool

Graph edge ordering and community IDs were non-deterministic because
files are parsed in parallel (ProcessPoolExecutor + as_completed),
causing NetworkX node insertion order to vary between runs.

Changes:
- context_assembler: sort predecessors/successors before including them
  in FilePageContext so dependents/dependencies lists are identical
  across runs regardless of graph construction order
- graph: rebuild a sorted copy of the undirected graph before passing it
  to louvain_communities so adjacency traversal order is reproducible;
  also sort the returned community list by each community's smallest
  member before assigning integer IDs via enumerate()

Adds scripts/diagnose_hash_mismatch.py to verify the fix and identify
any remaining sources of hash instability (dep_summaries, betweenness
sampling, etc.).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dmikushin force-pushed the fix/stabilize-prompt-hashes branch from da425be to 6fdf4ee on April 10, 2026 at 12:32
@RaghavChamadiya
Collaborator

Nice analysis and clean fix. The sorted predecessors/successors and stabilized Louvain ordering both make sense, and the PR description is really well written.

A few things before I merge:

  1. Betweenness centrality: for large repos (above the threshold), betweenness_centrality uses k=500 random samples without a seed. If betweenness values feed into the rendered prompt, that's a third source of non-determinism you haven't addressed. Can you check whether that's the case? If so, adding seed=42 there too would complete the fix.

  2. Unit tests: the core changes (sorted edges, sorted communities) don't have tests that verify stability across different insertion orders. Something like building the same graph in two different orders and asserting identical community IDs would really lock this in and catch regressions in CI.

  3. Diagnostic script: scripts/diagnose_hash_mismatch.py imports private internals like _run_ingestion which will break on refactors. Fine to keep it as an unsupported debugging tool in scripts/, but a proper unit test would be more valuable long term.
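A regression test along the lines of point (2) could look like this (community_ids is a hypothetical stand-in for the project's community detection, inlined here so the test is self-contained; the real module layout will differ):

```python
import networkx as nx

def community_ids(g: nx.Graph) -> dict[str, int]:
    """Stand-in for the fixed pipeline: sorted rebuild, seeded
    Louvain, communities sorted by smallest member before numbering."""
    gs = nx.Graph()
    gs.add_nodes_from(sorted(g))
    gs.add_edges_from(sorted(tuple(sorted(e)) for e in g.edges()))
    comms = sorted(nx.community.louvain_communities(gs, seed=42), key=min)
    return {n: i for i, c in enumerate(comms) for n in c}

def test_community_ids_insertion_order_invariant():
    edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
             ("d", "e"), ("e", "f"), ("d", "f")]
    # Build the same graph with edges inserted in opposite orders.
    assert community_ids(nx.Graph(edges)) == \
        community_ids(nx.Graph(list(reversed(edges))))
```

(For point (1): networkx's betweenness_centrality does accept a seed parameter, so nx.betweenness_centrality(g, k=500, seed=42) would pin the sampling.)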

Happy to merge once (1) and (2) are addressed.

