perf: add lru cache for load_executed_tipset to speed up hot queries#6761
perf: add lru cache for load_executed_tipset to speed up hot queries#6761hanabi1224 wants to merge 7 commits intomainfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAlways-load executed tipsets with events, add a LazyLock-backed SizeTrackingLruCache for ExecutedTipset, introduce ExecutedMessage/ExecutedTipset with heap-size accounting, remove the no-events load API, update VM event handling and RPC call sites to use the unified load_executed_tipset, and add a vec heap-size helper. Changes
Sequence DiagramsequenceDiagram
participant RPC as RPC Method
participant SM as State Manager
participant Cache as LRU Cache
participant DB as Database
participant VM as FVM / VM
RPC->>SM: load_executed_tipset(tipset_key)
SM->>Cache: lookup(tipset_key)
alt cache hit
Cache-->>SM: ExecutedTipset
else cache miss
SM->>DB: fetch messages & receipts
SM->>VM: execute/collect receipts & events (when events_root present)
VM-->>SM: receipts, events (or None), state_root, receipt_root
SM->>SM: build ExecutedMessage[] and ExecutedTipset
SM->>Cache: insert(tipset_key, ExecutedTipset)
end
SM-->>RPC: ExecutedTipset
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
✨ Simplify code
📝 Coding Plan
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/state_manager/mod.rs (1)
553-572: Consider restoring a no-events fast path for callers that don't consume events.
load_executed_tipset_innernow always attempts event AMT loading whenevents_rootexists. Callers like block construction and fee history only need message/receipt data, so this adds avoidable I/O and heap pressure.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/state_manager/mod.rs` around lines 553 - 572, The current code in load_executed_tipset_inner always loads the events AMT when receipt.events_root() is Some, causing unnecessary I/O/alloc for callers that don't need events; modify load_executed_tipset_inner (and its callers) to accept a boolean flag (e.g., load_events or need_events) and short-circuit the events branch when false: if load_events is false, set events = None without calling StampedEvent::get_events or triggering compute_tipset_state; otherwise keep the existing logic using receipt.events_root(), StampedEvent::get_events, recomputed, and compute_tipset_state with msg_ts/NO_CALLBACK/VMTrace::NotTraced.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/shim/executor.rs`:
- Around line 82-84: The current #[derive(Deserialize)] on the enum Receipt
makes untagged deserialization ambiguous and allows direct deserialization that
can silently drop fields (e.g., V3 with events_root: null matching V2); remove
Deserialize from the derive list on Receipt (leave Clone, Debug, Serialize) and
add a short comment above Receipt pointing users to use get_parent_receipt() for
constructing Receipt variants to avoid ambiguous direct deserialization.
In `@src/state_manager/mod.rs`:
- Around line 496-511: The function-level static CACHE (SizeTrackingLruCache) in
load_executed_tipset creates a process-global cache shared across all
StateManager instances and ChainStore backends; replace it with a
per-StateManager cache or include ChainStore identity in the cache key.
Concretely: remove the static CACHE, add a SizeTrackingLruCache<TipsetKey,
ExecutedTipset> field (e.g., cache) to the StateManager struct and initialize it
when StateManager is constructed, then change load_executed_tipset to call
self.cache.get_cloned(...) and self.cache.push(...); alternatively, if you must
keep a global cache, incorporate a stable ChainStore identifier (or pointer
fingerprint) into the cache key used by CACHE.get_cloned and CACHE.push so
entries are segregated by ChainStore. Ensure all references to CACHE in
load_executed_tipset and related helpers (e.g., load_executed_tipset_inner) are
updated to use the instance field or the augmented key.
---
Nitpick comments:
In `@src/state_manager/mod.rs`:
- Around line 553-572: The current code in load_executed_tipset_inner always
loads the events AMT when receipt.events_root() is Some, causing unnecessary
I/O/alloc for callers that don't need events; modify load_executed_tipset_inner
(and its callers) to accept a boolean flag (e.g., load_events or need_events)
and short-circuit the events branch when false: if load_events is false, set
events = None without calling StampedEvent::get_events or triggering
compute_tipset_state; otherwise keep the existing logic using
receipt.events_root(), StampedEvent::get_events, recomputed, and
compute_tipset_state with msg_ts/NO_CALLBACK/VMTrace::NotTraced.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 4300eb72-d86a-49a8-ba79-044ab82d051c
📒 Files selected for processing (4)
src/rpc/methods/eth.rssrc/shim/executor.rssrc/state_manager/mod.rssrc/utils/get_size/mod.rs
Codecov Report❌ Patch coverage is Additional details and impacted files
... and 13 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
|
…forest into hm/cache-load_executed_tipset
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/state_manager/mod.rs (1)
557-571:⚠️ Potential issue | 🟡 MinorAdd context to event-load failures for diagnosability.
On Line 561 and Line 570, the propagated errors from
StampedEvent::get_eventslose tipset/root context, which makes storage/debug failures harder to triage.Proposed patch
- match StampedEvent::get_events(self.cs.blockstore(), &events_root) { + match StampedEvent::get_events(self.cs.blockstore(), &events_root) + .with_context(|| { + format!( + "failed to load events (tipset={}, events_root={events_root})", + msg_ts.key() + ) + }) { Ok(events) => events, Err(e) if recomputed => return Err(e), Err(_) => { self.compute_tipset_state( msg_ts.clone(), NO_CALLBACK, VMTrace::NotTraced, ) .await?; recomputed = true; - StampedEvent::get_events(self.cs.blockstore(), &events_root)? + StampedEvent::get_events(self.cs.blockstore(), &events_root) + .with_context(|| { + format!( + "failed to load events after recompute (tipset={}, events_root={events_root})", + msg_ts.key() + ) + })? } },As per coding guidelines: “Use
anyhow::Result<T>for most operations and add context with.context()when errors occur”.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/state_manager/mod.rs` around lines 557 - 571, The calls to StampedEvent::get_events are propagating raw errors without context; update both call sites (the initial match arm and the fallback after compute_tipset_state) to attach context via anyhow::Context (e.g., .context(...)) that includes the tipset identifier (msg_ts or its tipset key) and the events_root value so failures indicate which tipset/root failed; keep the recomputed logic intact, but return errors with .context(...) instead of plain propagation for StampedEvent::get_events in this function.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/state_manager/mod.rs`:
- Around line 557-571: The calls to StampedEvent::get_events are propagating raw
errors without context; update both call sites (the initial match arm and the
fallback after compute_tipset_state) to attach context via anyhow::Context
(e.g., .context(...)) that includes the tipset identifier (msg_ts or its tipset
key) and the events_root value so failures indicate which tipset/root failed;
keep the recomputed logic intact, but return errors with .context(...) instead
of plain propagation for StampedEvent::get_events in this function.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: a9cb5363-c499-43a7-9848-80e5bd8d12bd
📒 Files selected for processing (4)
src/chain/store/index.rssrc/interpreter/vm.rssrc/rpc/methods/eth.rssrc/state_manager/mod.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- src/rpc/methods/eth.rs
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/state_manager/mod.rs (1)
137-142: Minor documentation improvement.The comment mentions
state_roothaving no heap allocation but omitsreceipt_root, which is also aCidwith no heap allocation.📝 Suggested comment improvement
impl GetSize for ExecutedTipset { fn get_heap_size(&self) -> usize { - // state_root(Cid) has no heap allocation, so we only calculate the heap size of executed_messages + // state_root and receipt_root (both Cid) have no heap allocation, so we only calculate the heap size of executed_messages vec_heap_size_helper(&self.executed_messages) } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/state_manager/mod.rs` around lines 137 - 142, Update the comment in the GetSize impl for ExecutedTipset (fn get_heap_size) to mention that both state_root and receipt_root are Cid values with no heap allocation; keep the rest of the behavior the same and continue to compute heap size only for executed_messages (vec_heap_size_helper(&self.executed_messages)).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/state_manager/mod.rs`:
- Around line 137-142: Update the comment in the GetSize impl for ExecutedTipset
(fn get_heap_size) to mention that both state_root and receipt_root are Cid
values with no heap allocation; keep the rest of the behavior the same and
continue to compute heap size only for executed_messages
(vec_heap_size_helper(&self.executed_messages)).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 5e006ea4-1917-491d-8afe-1c946e3196f6
📒 Files selected for processing (1)
src/state_manager/mod.rs
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/state_manager/mod.rs (1)
133-134: Consider removing#[allow(dead_code)]when the field is used.The
receipt_rootfield is currently unused (#[allow(dead_code)]). If this is added for future use or API completeness, consider adding a brief doc comment explaining the intent, or remove the attribute once consumers start using it.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/state_manager/mod.rs` around lines 133 - 134, The field `receipt_root: Cid` in this module is annotated with #[allow(dead_code] — either remove that attribute if the field is now used downstream, or add a short doc comment on the `receipt_root` field explaining its intended future use/API completeness (so the attribute is justified); locate the declaration of `pub receipt_root: Cid` in mod.rs and either delete the #[allow(dead_code)] attribute or replace it with a doc comment like "/// Receipt Merkle root, kept for API completeness / future use" to clarify intent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/state_manager/mod.rs`:
- Around line 133-134: The field `receipt_root: Cid` in this module is annotated
with #[allow(dead_code] — either remove that attribute if the field is now used
downstream, or add a short doc comment on the `receipt_root` field explaining
its intended future use/API completeness (so the attribute is justified); locate
the declaration of `pub receipt_root: Cid` in mod.rs and either delete the
#[allow(dead_code)] attribute or replace it with a doc comment like "/// Receipt
Merkle root, kept for API completeness / future use" to clarify intent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: bc9e17a9-8758-4045-a5f9-7c6305f86a78
📒 Files selected for processing (1)
src/state_manager/mod.rs
Summary of changes
This PR adds lru cache for load_executed_tipset to speed up hot queries
k6 benchmark results:
Changes introduced in this pull request:
Reference issue to close (if applicable)
Closes #5632
Other information and links
Change checklist
Outside contributions
Summary by CodeRabbit
Refactor
Performance
Behavioral
Public API
Utilities