Skip to content

test(flatkv): Expands docker-level FlatKV coverage #3417

Open
blindchaser wants to merge 7 commits into
mainfrom
yiren/docker-coverage
Open

test(flatkv): Expands docker-level FlatKV coverage #3417
blindchaser wants to merge 7 commits into
mainfrom
yiren/docker-coverage

Conversation

@blindchaser
Copy link
Copy Markdown
Contributor

@blindchaser blindchaser commented May 11, 2026

Summary

Expands docker-level FlatKV coverage across lattice-enabled, post-import, crash, state-sync, and local data-loss recovery paths. Adds an offline seidb import-flatkv-from-memiavl path that rebuilds FlatKV EVM state from memiavl, hardens the FlatKV importer against partial finalization, and he import command is a DR / test-seeding primitive; it preserves the production MigrationManager rollout boundary instead of replacing it.

  • sei-db/tools/cmd/seidb/operations/import_flatkv_from_memiavl.go: introduces import-flatkv-from-memiavl. It accepts only evm, rejects stale or future heights relative to memiavl.GetLatestVersion, refuses to overwrite committed FlatKV without --force, passes source module names to AddModule, polls importer errors between exporter reads, and aborts on failure so partial imports never finalize or write a snapshot.
  • sei-db/tools/cmd/seidb/main.go: registers the import command under seidb.
  • sei-db/state_db/sc/flatkv/import_translator.go: adds ImportTranslator, which maps memiavl changesets into FlatKV physical rows. It routes storage/code/legacy/non-EVM keys, buffers nonce/codehash account fragments across batches, emits one merged account row per address on Finalize, and returns ErrImportTranslatorFinalized after finalization.
  • sei-db/state_db/sc/flatkv/importer.go: makes KVImporter.Close idempotent, adds Abort(reason) to drain without FinalizeImport / WriteSnapshot, exposes Err() for fail-fast callers, and stores the test flush hook through atomic.Pointer to avoid future race-detector failures.
  • sei-db/state_db/sc/flatkv/wal_torn_write_test.go: adds flatkv-level WAL torn-write recovery coverage. It proves corrupted trailing WAL entries are discarded, clean WAL tails replay to the last appended commit, and recovered LtHash matches the last durable state.
  • integration_test/contracts/deploy_flatkv_evm_fixture.sh: deploys an EVM fixture and persists expected balance, storage, code, and missing-key observations for pre/post import RPC checks.
  • integration_test/contracts/import_flatkv_evm_cluster.sh: stops the devnet, imports EVM state into FlatKV on every validator, restarts in post-import dual_write with evm-ss-split=false and sc-enable-lattice-hash=false, and gates subsequent checks on block progress plus EVM RPC readiness.
  • integration_test/contracts/verify_cross_validator_flatkv_digest.sh: asserts sc-enable-lattice-hash=true in the from-genesis GIGA path and compares all FlatKV buckets (account, code, storage, legacy) at a shared committed height using dump-flatkv --height WAL replay.
  • integration_test/contracts/verify_statesync_flatkv_digest.sh: asserts lattice hash is enabled and compares donor vs state-sync receiver full-bucket FlatKV digests at a shared committed height.
  • integration_test/contracts/verify_flatkv_crash_recovery.sh: SIGKILLs one validator, verifies survivor progress, restarts the process, waits for catch-up, and compares canonical EVM buckets (account, code, storage) in the post-import lattice-disabled path.
  • integration_test/contracts/verify_flatkv_statesync_crash_recovery.sh: wipes one validator, starts state-sync, injects a SIGKILL during restore, restarts, waits for catch-up, and verifies canonical EVM bucket parity.
  • integration_test/contracts/verify_flatkv_total_loss_recovery.sh: simulates total local state loss, restores the validator via state-sync while preserving validator signing state, and verifies canonical EVM bucket parity.
  • integration_test/contracts/verify_flatkv_partial_loss_fails_loudly.sh: deletes only the FlatKV directory and requires either a clear startup failure or a fully self-healed canonical EVM digest match.
  • integration_test/contracts/verify_flatkv_evm_store.sh: dumps the storage bucket, verifies the fixture contract row, and documents the expected 41-byte serialized storage value layout.
  • integration_test/seidb/flatkv_evm_test.yaml: verifies historical and latest EVM balance, storage, and code reads before and after import, checks missing account/storage behavior, and confirms non-EVM modules remain queryable.
  • integration_test/contracts/AGENTS.md: documents the shared FlatKV Integration CI row, step ordering, destructive-test self-healing requirements, and the distinction between lattice-enabled full-bucket checks and post-import canonical EVM checks.
  • .github/workflows/integration-test.yml: keeps FlatKV-specific docker coverage under one FlatKV Integration row and appends crash, state-sync crash, total-loss, and partial-loss scenarios after import verification. It also runs full-bucket lattice-enabled digest checks inside Chain Operation Test.
  • sei-db/state_db/sc/migration/OPERATIONS.md: documents the MigrateEVM operational model, non-atomic memiavl/FlatKV write risks, failure modes, detection signals, recovery paths, and future tooling roadmap.

Test plan

  • sei-db/state_db/sc/flatkv/importer_test.go: covers Close idempotency on success and error, first-error-wins semantics, non-blocking AddNode after done, multi-flush large imports, abort-without-finalize behavior, abort after close, and explicit backpressure through the atomic flush hook.
  • sei-db/state_db/sc/flatkv/import_translator_test.go: covers nil/empty changesets, storage/code/legacy/non-EVM routing, nonce and codehash buffering, same-call and cross-call account merges, delete dropping, malformed key/value rejection, mixed storage/account batches, Finalize idempotency, and the Translate after Finalize sentinel error.
  • sei-db/state_db/sc/flatkv/wal_torn_write_test.go: covers mid-record truncation, partial length-prefix truncation, and clean WAL tails, asserting recovery to the last good version or replay to the last clean WAL commit with LtHash consistency.
  • sei-db/tools/cmd/seidb/operations/import_flatkv_from_memiavl_test.go: covers end-to-end EVM encoding through the CLI path, --force overwrite safety, height range checks, stale-height and future-height rejection, untouched FlatKV on rejection, and a large dataset that exercises importer flushes, worker backpressure, and cross-batch account merging.
  • Docker integration: Chain Operation Test verifies sc-enable-lattice-hash=true, state-sync donor/receiver full-bucket digest equality, and cross-validator full-bucket digest equality at committed heights. FlatKV Integration verifies fixture RPC reads before and after offline import, physical storage layout, SIGKILL restart recovery, state-sync mid-flight crash recovery, total-loss state-sync recovery, and FlatKV-only partial-loss failure behavior with diagnostic log dumps on failure.

Note

Medium Risk
Medium risk because it changes FlatKV import finalization semantics (KVImporter.Close/Abort) and adds a new offline import path that rewrites on-disk FlatKV state; failures could affect recovery/migration workflows and CI reliability.

Overview
Expands FlatKV docker-level CI coverage by adding a dedicated FlatKV Integration workflow row plus additional lattice-hash digest checks after state-sync, and by running a new set of destructive scenarios (SIGKILL crash recovery, state-sync crash recovery, total-loss recovery, partial-loss behavior) with new helper scripts and an EVM fixture-driven RPC probe.

Introduces an offline recovery/migration primitive seidb import-flatkv-from-memiavl (and memiavl-latest-version) to rebuild FlatKV evm state from a stopped node’s memiavl at a specific height, with safeguards (--force for overwrites, reject stale/future heights) and batching via a new flatkv.ImportTranslator that converts memiavl changesets into FlatKV physical rows.

Hardens the FlatKV importer lifecycle by making KVImporter.Close() idempotent, exposing Err() for fail-fast detection, and adding Abort(reason) to drain workers without FinalizeImport/WriteSnapshot so partial imports don’t commit; adds extensive unit tests for importer concurrency/backpressure and WAL torn-write recovery, plus an operational MigrateEVM failure-mode/tooling roadmap document.

Reviewed by Cursor Bugbot for commit 52870bd. Bugbot is set up for automated code reviews on this repo. Configure here.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 11, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedMay 12, 2026, 10:51 PM

@codecov
Copy link
Copy Markdown

codecov Bot commented May 11, 2026

Codecov Report

❌ Patch coverage is 55.82822% with 144 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.24%. Comparing base (26636ba) to head (52870bd).

Files with missing lines Patch % Lines
...cmd/seidb/operations/import_flatkv_from_memiavl.go 41.74% 102 Missing and 18 partials ⚠️
sei-db/state_db/sc/flatkv/import_translator.go 77.77% 8 Missing and 8 partials ⚠️
sei-db/state_db/sc/flatkv/importer.go 86.95% 4 Missing and 2 partials ⚠️
sei-db/tools/cmd/seidb/main.go 0.00% 2 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3417      +/-   ##
==========================================
- Coverage   59.34%   59.24%   -0.11%     
==========================================
  Files        2112     2112              
  Lines      174772   174437     -335     
==========================================
- Hits       103724   103338     -386     
- Misses      62010    62146     +136     
+ Partials     9038     8953      -85     
Flag Coverage Δ
sei-chain-pr 55.85% <55.82%> (?)
sei-db 70.41% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
sei-db/tools/cmd/seidb/main.go 0.00% <0.00%> (ø)
sei-db/state_db/sc/flatkv/importer.go 90.12% <86.95%> (+2.28%) ⬆️
sei-db/state_db/sc/flatkv/import_translator.go 77.77% <77.77%> (ø)
...cmd/seidb/operations/import_flatkv_from_memiavl.go 41.74% <41.74%> (ø)

... and 52 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 00487eff98

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread sei-db/tools/cmd/seidb/operations/import_flatkv_from_memiavl.go
…verage

`seidb import-flatkv-from-memiavl` copies a memiavl module (evm only) into
FlatKV at a given height. Adds the CLI + translator, KVImporter
Abort/Close-idempotent for partial-commit safety, unit tests for the
concurrency pipeline + a 50K-pair E2E, and a docker CI job asserting
post-import RPC reads match pre-import values.

Offline DR / test-seeding tool only; does NOT exercise the MigrationManager
(V0 -> V1) production path, and the post-import smoke is RPC-sample-level
not a full set comparison.

Co-authored-by: Cursor <cursoragent@cursor.com>
Comment thread integration_test/contracts/verify_statesync_flatkv_digest.sh Outdated
@blindchaser blindchaser force-pushed the yiren/docker-coverage branch from 10c93b3 to 0414b90 Compare May 11, 2026 15:15
…ash recovery

- verify_statesync_flatkv_digest.sh: donor↔rpc-node dump-flatkv sha256 match
  at a chain height both nodes have committed, replacing the existing
  height-only state-sync probe.
- verify_cross_validator_flatkv_digest.sh: all 4 validators must agree on
  flatkv physical content at a common committed chain height, catching
  silent drift the AppHash path cannot.
- verify_flatkv_crash_recovery.sh: SIGKILL a validator mid block production,
  restart it, wait for catch-up, then 4-way digest match at a common
  committed chain height.
- Wire the first two into the GIGA "Chain Operation Test" job; add a new
  GIGA "FlatKV Crash Recovery" job for the SIGKILL test.

Compares via chain height + dump-flatkv WAL-replay (which works regardless
of flatkv SnapshotInterval) rather than intersecting flatkv snapshot dirs,
so the checks remain meaningful on CI devnets that never reach the default
SnapshotInterval=10000 block boundary.
@blindchaser blindchaser force-pushed the yiren/docker-coverage branch from 0414b90 to f46dc25 Compare May 11, 2026 15:46
Comment thread integration_test/contracts/verify_cross_validator_flatkv_digest.sh
@blindchaser blindchaser force-pushed the yiren/docker-coverage branch 2 times, most recently from ab3edcd to f89deef Compare May 12, 2026 05:29
@blindchaser blindchaser force-pushed the yiren/docker-coverage branch from f89deef to c59c4dc Compare May 12, 2026 05:39
@blindchaser blindchaser changed the title test(flatkv): increase docker ci coverage test(flatkv): Expands docker-level FlatKV coverage May 12, 2026
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really nice!

docker exec "$victim" bash -lc "
set -euo pipefail
peers=\$(grep -v '^$' /sei-protocol/sei-chain/build/generated/persistent_peers.txt | paste -sd ',' -)
sed -i.bak -e 's|^enable *=.*|enable = true|' /root/.sei/config/config.toml
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overly broad sed pattern enables all matching config keys

Low Severity

The sed pattern 's|^enable *=.*|enable = true|' applied to config.toml matches every line starting with enable, not just the [statesync] section's enable key. If config.toml has other sections with an enable key (e.g. [fastsync] or future sections), those would also be set to true unintentionally. A more targeted pattern anchored to the statesync section, or a TOML-aware edit, would avoid silently enabling unrelated features.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit c3f2e79. Configure here.

}
}
_ = importer.Close()
}()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deferred cleanup finalizes partial import on type assertion failure

Medium Severity

The deferred importer cleanup has a control flow gap: when the named return err is non-nil but the type assertion importer.(*flatkv.KVImporter) fails, execution falls through to importer.Close(). This would finalize a partial import instead of aborting it. The return on line 235 is inside the inner if ok block, so the outer if err != nil block exits normally and reaches the unconditional Close() call. A return (or moving Close into an else) is needed after the inner if block to prevent finalization on error.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 901f80b. Configure here.

echo "Importing evm module from memiavl into FlatKV on all validators..."
for i in $(seq 0 $((NODE_COUNT - 1))); do
docker exec "sei-node-$i" bash -lc "cd /sei-protocol/sei-chain && build/seidb import-flatkv-from-memiavl --modules=evm --data-dir /root/.sei/data"
done
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seidb binary built on one node, import runs on all

Medium Severity

The seidb binary is compiled only inside sei-node-0 (line 70), but the import loop on lines 95–97 executes build/seidb inside every validator container. All other verification scripts in this PR use an ensure_seidb helper that checks and builds the binary on each node independently. If the build output directory is not a shared Docker volume, sei-node-1 through sei-node-3 would fail with a missing binary.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 901f80b. Configure here.

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 5 total unresolved issues (including 3 from previous reviews).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 52870bd. Configure here.

sed -i.bak -e 's|^enable *=.*|enable = true|' /root/.sei/config/config.toml
sed -i.bak -e 's|^rpc-servers *=.*|rpc-servers = \"${DONOR_NODE}:26657,${SECOND_RPC_NODE}:26657\"|' /root/.sei/config/config.toml
sed -i.bak -e 's|^trust-height *=.*|trust-height = ${trust_height}|' /root/.sei/config/config.toml
sed -i.bak -e 's|^trust-hash *=.*|trust-hash = \"${trust_hash}\"|' /root/.sei/config/config.toml
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

State-sync sed patterns may use wrong key format

Medium Severity

The configure_statesync sed commands use hyphenated key names (rpc-servers, trust-height, trust-hash) but standard CometBFT/Tendermint config.toml templates generate these keys with underscores (rpc_servers, trust_height, trust_hash). If sei-tendermint follows the upstream convention, the sed replacements would silently fail to match, leaving state-sync unconfigured with empty rpc_servers, zero trust_height, and empty trust_hash. The victim node would then fail to state-sync, causing the test to time out rather than exercise the intended crash-recovery path.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 52870bd. Configure here.

}
imp.setErr(reason)
return imp.Close()
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Abort after successful Close corrupts Err state

Low Severity

Abort unconditionally calls setErr(reason) before delegating to Close. If Close already completed successfully (with firstErr still nil), the CompareAndSwap in setErr succeeds, poisoning firstErr with the abort reason. After that, Err() returns the abort reason while Close() returns the cached nil — the two disagree. While the current CLI tool never hits this path because the defer only calls Abort when err != nil, any future caller querying Err() after a successful Close + stale Abort would see a spurious error.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 52870bd. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants