
feat: Deferred block proving #1725

Open
sergerad wants to merge 36 commits into next from sergerad-deferred-block-proving

Conversation

@sergerad (Collaborator) commented Mar 2, 2026

Context

We are adding deferred (asynchronous) block proving for the node, as described in #1592. Currently, block proving happens synchronously during apply_block, which means block commitment is blocked until the proof is generated.

Blocks will now exhibit committed (not yet proven) and proven states. A committed block is already part of the canonical chain and fully usable. Clients that require proof-level finality can opt into it via the new finality parameter on SyncChainMmr.

Changes

  • Schema: Added block_proof BLOB and proving_inputs BLOB columns to the block_headers table.
  • DB queries: Added query functions for insert_block_proof, select_block_proof, select_unproven_blocks (excludes genesis block 0), select_block_proving_inputs, and select_latest_proven_block_num.
  • Decoupled proving from apply_block: The BlockProofRequest is now serialized and persisted alongside the block during apply_block.
  • Proof scheduler: Added a background task (proof_scheduler.rs) that drives deferred proving. It queries unproven blocks on startup (restart recovery), listens for new block commits via Notify, and proves blocks concurrently using FuturesOrdered for FIFO completion ordering.
  • Finality on SyncChainMmr: Added a Finality enum (COMMITTED, PROVEN) to the protobuf and a finality field on SyncChainMmrRequest.
  • Refactored apply_block query: Introduced ApplyBlockData struct to replace the 7-parameter function signature.

@sergerad sergerad changed the title Sergerad deferred block proving feat: Deferred block proving Mar 2, 2026
@sergerad sergerad marked this pull request as ready for review March 2, 2026 23:42
@sergerad sergerad requested review from Mirko-von-Leipzig, bobbinth and drahnr and removed request for Mirko-von-Leipzig March 2, 2026 23:43
signature BLOB NOT NULL,
commitment BLOB NOT NULL,
proving_inputs BLOB, -- Serialized BlockProofRequest needed for deferred proving. NULL for genesis block.
block_proof BLOB, -- NULL means the block has not yet been proven
Contributor

Maybe fine for now, but eventually, we probably shouldn't store block proofs in the database as each of them could be close to 200KB.

Collaborator Author

Do we want to store each proof as a single file?

block_header BLOB NOT NULL,
signature BLOB NOT NULL,
commitment BLOB NOT NULL,
proving_inputs BLOB, -- Serialized BlockProofRequest needed for deferred proving. NULL for genesis block.
Contributor

What's the reason for storing proving inputs in the database?

Contributor

How big are these?

Contributor

IIUC, conceptually we retrieve all unproven blocks from the DB, and to request a proof we need the inputs. The column becomes the uniform source of inputs for proving, both for in-flight blocks and for proving on startup. It also ensures we persist the inputs and don't lose any information on a crash, host reboot, etc.

If retrieving/creating a proof fails repeatedly, we might want a mitigation so we don't get stuck trying to prove a failing block. (I'll get back to this.)

Collaborator Author

Yes, the proof scheduler needs to gather proving inputs on startup and in its loop when proving committed blocks.

Do we want to factor this differently so that the proof scheduler has to construct the proving inputs itself? @bobbinth

Comment on lines 501 to +507
BlockRange block_range = 1;

// The finality level to use when clamping the upper bound of the block range.
//
// When set to FINALITY_COMMITTED (default), the upper bound is clamped to the chain tip.
// When set to FINALITY_PROVEN, the upper bound is clamped to the latest proven block.
Finality finality = 2;
Contributor

block_range and finality could be set inconsistently, right? For example, if the last committed block is 100, and last proven block is 105, the request could be from = 90, to = 103, finality = committed and this could create some ambiguity (e.g., what takes precedence - block_to or finality?).

A different approach could be to make it so that we can either specify block_to or the finality value - but not both.
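The mutually exclusive alternative suggested here could be expressed with a protobuf oneof. This is a hypothetical sketch, not the PR's schema; field names and numbers are assumptions for illustration.

```proto
message SyncChainMmrRequest {
  fixed32 block_from = 1;
  // Specify either an explicit upper bound or a finality level, not both;
  // proto3 oneof enforces at most one of these being set.
  oneof upper_bound {
    fixed32 block_to = 2;
    Finality finality = 3;
  }
}
```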

Collaborator Author (@sergerad, Mar 3, 2026)

if the last committed block is 100, and last proven block is 105

A proven block could never be higher than the latest committed block.

A different approach could be to make it so that we can either specify block_to or the finality value - but not both.

I think the main use case is precisely to be able to get a range of blocks with a specified finality.

I think the endpoint should either error out or provide best-effort results for a range that can't be satisfied, whether it's a committed or proven finality request.
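Per the field comments in the SyncChainMmrRequest snippet above, the Finality enum might be defined along these lines (a sketch inferred from the description; exact names and values are assumptions):

```proto
enum Finality {
  // Default: clamp the range's upper bound to the chain tip.
  FINALITY_COMMITTED = 0;
  // Clamp the range's upper bound to the latest proven block.
  FINALITY_PROVEN = 1;
}
```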

store_client: &StoreClient,
data_directory: DataDirectory,
accounts_filepath: PathBuf,
signer: &EcdsaSecretKey,
Contributor

is this always an Ecdsa key type?

Collaborator Author

Yes. As opposed to what?

let block_nums: Vec<i64> =
SelectDsl::select(schema::block_headers::table, schema::block_headers::block_num)
.filter(schema::block_headers::block_proof.is_null())
.filter(schema::block_headers::block_num.gt(0i64))
Contributor

Q: are there proven blocks at genesis?

Collaborator Author

Genesis block is never proven.

// ================================================================================================

/// Overall timeout for proving a single block.
const BLOCK_PROVE_TIMEOUT: Duration = Duration::from_mins(2);
Contributor

We should probably make this a bit more relaxed, particularly for CI runs 🫠

Collaborator Author

Bumped to 4 minutes. Leaving this comment here for others to weigh in.

continue;
},
};

Contributor

I think this is racy and we need atomicity in general. The current calling pattern works, but I'd appreciate a comment on why it works (we have the outer Mutex<()> to protect both).

Collaborator Author

I have updated the logic to construct the notify before the db query. See the comment above that line. LMK if my understanding is incorrect.

};
futures.push_back(fut);
}
futures
Contributor

FuturesOrdered doesn't have a concurrent-execution limit; it only guarantees the results are in the input order (FIFO). So https://docs.rs/futures/latest/futures/stream/trait.StreamExt.html#method.buffered might be a strictly better choice, to limit the optimistic proving.

Collaborator Author

I went with adding a batch limit on the DB query instead.

Contributor @drahnr left a comment

Generally like the approach/LGTM, a few edges to be smoothed

@sergerad sergerad requested a review from drahnr March 3, 2026 21:07