
perf: add lru cache for load_executed_tipset to speed up hot queries #6761

Open
hanabi1224 wants to merge 7 commits into main from hm/cache-load_executed_tipset

@hanabi1224 hanabi1224 commented Mar 18, 2026

Summary of changes

This PR adds an LRU cache for load_executed_tipset to speed up hot queries.

k6 benchmark results:

# Forest
❯ k6 run -e K6_RPC_URL=http://localhost:2345/rpc/v1 -e K6_METHOD=eth_getBlockReceipts tests/single_method.js --duration 30s --vus 20

  █ TOTAL RESULTS

    checks_total.......: 31352   1043.529163/s
    checks_succeeded...: 100.00% 31352 out of 31352
    checks_failed......: 0.00%   0 out of 31352

    ✓ is status 200
    ✓ is JSON-RPC without error

    HTTP
    http_req_duration..............: avg=35.78ms min=17.74ms med=33.08ms max=142.77ms p(90)=51.77ms p(95)=58.63ms
      { expected_response:true }...: avg=35.78ms min=17.74ms med=33.08ms max=142.77ms p(90)=51.77ms p(95)=58.63ms
    http_req_failed................: 0.00%  0 out of 15676
    http_reqs......................: 15676  521.764581/s

    EXECUTION
    iteration_duration.............: avg=38.26ms min=18.21ms med=35.78ms max=156.88ms p(90)=54.76ms p(95)=61.74ms
    iterations.....................: 15676  521.764581/s
    vus............................: 20     min=20         max=20
    vus_max........................: 20     min=20         max=20

    NETWORK
    data_received..................: 515 MB 17 MB/s
    data_sent......................: 3.2 MB 108 kB/s




running (0m30.0s), 00/20 VUs, 15676 complete and 0 interrupted iterations
default ✓ [======================================] 20 VUs  30s

# Lotus
❯ k6 run -e K6_RPC_URL=http://localhost:1234/rpc/v1 -e K6_METHOD=eth_getBlockReceipts tests/single_method.js --duration 30s --vus 20

  █ TOTAL RESULTS

    checks_total.......: 28388   945.546539/s
    checks_succeeded...: 100.00% 28388 out of 28388
    checks_failed......: 0.00%   0 out of 28388

    ✓ is status 200
    ✓ is JSON-RPC without error

    HTTP
    http_req_duration..............: avg=41.27ms min=19.35ms med=39.48ms max=93.43ms p(90)=52.73ms p(95)=55.69ms
      { expected_response:true }...: avg=41.27ms min=19.35ms med=39.48ms max=93.43ms p(90)=52.73ms p(95)=55.69ms
    http_req_failed................: 0.00%  0 out of 14194
    http_reqs......................: 14194  472.773269/s

    EXECUTION
    iteration_duration.............: avg=42.26ms min=20.75ms med=40.49ms max=93.95ms p(90)=53.77ms p(95)=56.72ms
    iterations.....................: 14194  472.773269/s
    vus............................: 20     min=20         max=20
    vus_max........................: 20     min=20         max=20

    NETWORK
    data_received..................: 466 MB 16 MB/s
    data_sent......................: 2.9 MB 98 kB/s




running (0m30.0s), 00/20 VUs, 14194 complete and 0 interrupted iterations
default ✓ [======================================] 20 VUs  30s

Changes introduced in this pull request:

Reference issue to close (if applicable)

Closes #5632

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Outside contributions

  • I have read and agree to the CONTRIBUTING document.
  • I have read and agree to the AI Policy document. I understand that failure to comply with the guidelines will lead to rejection of the pull request.

Summary by CodeRabbit

  • Refactor

    • Unified executed-tipset loading into a single path and removed the separate "without events" route.
  • Performance

    • Added a global, size-tracking in-memory cache for executed-tipset data to speed repeated lookups.
  • Behavioral

    • Execution now always provides receipt roots; per-message events are included when present and per-block events are optional.
  • Public API

    • Exposed executed-tipset and executed-message types with heap-size accounting to support caching.
  • Utilities

    • Added a helper to compute heap size for vectors of sized items.
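The "Behavioral" item above says per-block events are optional. A minimal sketch of that shape follows, under stated assumptions: `Event`, `BlockOutcome`, and `collect_block_events` are hypothetical simplified stand-ins, not Forest's actual `StampedEvent` or `apply_block_messages`, and a `u64` stands in for a Cid.

```rust
/// Hypothetical stand-in for Forest's `StampedEvent`.
#[derive(Debug, Clone, PartialEq)]
pub struct Event(pub String);

/// Hypothetical per-block outcome; `events_root` stands in for an optional Cid.
pub struct BlockOutcome {
    pub events_root: Option<u64>,
    pub events: Vec<Event>,
}

/// Keep `Some(events)` only for blocks that have an events root and record
/// `None` otherwise, so "no events root" and "empty events" stay distinct.
pub fn collect_block_events(blocks: Vec<BlockOutcome>) -> Vec<Option<Vec<Event>>> {
    blocks
        .into_iter()
        .map(|b| {
            if b.events_root.is_some() {
                Some(b.events)
            } else {
                None
            }
        })
        .collect()
}
```

Keeping one entry per block, including the `None` entries, preserves positional alignment between blocks and their events.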

@hanabi1224 hanabi1224 added the RPC requires calibnet RPC checks to run on CI label Mar 18, 2026
coderabbitai bot commented Mar 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Always-load executed tipsets with events, add a LazyLock-backed SizeTrackingLruCache for ExecutedTipset, introduce ExecutedMessage/ExecutedTipset with heap-size accounting, remove the no-events load API, update VM event handling and RPC call sites to use the unified load_executed_tipset, and add a vec heap-size helper.

Changes

| Cohort / File(s) | Summary |
|---|---|
| State manager & caching: src/state_manager/mod.rs | Add ExecutedMessage and ExecutedTipset (pub, Debug, Clone, GetSize); introduce executed_tipset_cache() as a LazyLock SizeTrackingLruCache; remove LoadExecutedTipsetOptions and load_executed_tipset_without_events; make load_executed_tipset always load events and consult/populate cache; push ExecutedTipset into cache after apply. |
| RPC callers: src/rpc/methods/eth.rs | Replace calls to load_executed_tipset_without_events(...) with load_executed_tipset(...).await? in Block::from_filecoin_tipset, get_block_receipts, and eth_fee_history. |
| VM apply & types: src/interpreter/vm.rs | Change ApplyBlockResult events type to Vec<Option<Vec<StampedEvent>>>; update apply_block_messages to push Some(events) only when the block's events root is present (otherwise push None). |
| Utilities & small fixes: src/utils/get_size/mod.rs, src/chain/store/index.rs | Add vec_heap_size_helper<T: GetSize>(&Vec<T>) -> usize; change cache size init to nonzero!(20480_usize) for tipset_by_height. |
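To illustrate the `vec_heap_size_helper<T: GetSize>(&Vec<T>) -> usize` signature listed above, here is a minimal sketch. It assumes a local, reduced `GetSize` trait (the real trait comes from an external crate and has more methods) and assumes the helper counts the backing buffer plus each element's own heap usage; the PR's actual body may differ.

```rust
use std::mem::size_of;

/// Minimal local stand-in for the `GetSize` trait (an assumption for this sketch).
pub trait GetSize {
    /// Bytes this value owns on the heap.
    fn get_heap_size(&self) -> usize;
}

impl GetSize for u64 {
    fn get_heap_size(&self) -> usize {
        0 // plain integers own no heap memory
    }
}

impl GetSize for String {
    fn get_heap_size(&self) -> usize {
        self.capacity() // the string's byte buffer
    }
}

/// Heap bytes owned by a vector: its backing buffer plus whatever each
/// element owns on the heap itself.
pub fn vec_heap_size_helper<T: GetSize>(v: &Vec<T>) -> usize {
    let buffer = v.capacity() * size_of::<T>();
    let nested: usize = v.iter().map(|item| item.get_heap_size()).sum();
    buffer + nested
}
```

A size-tracking cache can sum values like this per entry to bound memory use rather than entry count.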

Sequence Diagram

sequenceDiagram
    participant RPC as RPC Method
    participant SM as State Manager
    participant Cache as LRU Cache
    participant DB as Database
    participant VM as FVM / VM

    RPC->>SM: load_executed_tipset(tipset_key)
    SM->>Cache: lookup(tipset_key)
    alt cache hit
        Cache-->>SM: ExecutedTipset
    else cache miss
        SM->>DB: fetch messages & receipts
        SM->>VM: execute/collect receipts & events (when events_root present)
        VM-->>SM: receipts, events (or None), state_root, receipt_root
        SM->>SM: build ExecutedMessage[] and ExecutedTipset
        SM->>Cache: insert(tipset_key, ExecutedTipset)
    end
    SM-->>RPC: ExecutedTipset
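
The hit/miss flow in the diagram can be sketched as a cache-aside lookup. This is an illustration under stated assumptions, not the PR's code: `TipsetKey` and `ExecutedTipset` are simplified stand-ins, the `LruCache` here is bounded by entry count (the PR's `SizeTrackingLruCache` tracks heap size), and `slow_load` is a hypothetical closure standing in for the database and VM steps.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical, simplified stand-ins for the real key and value types.
pub type TipsetKey = u64;

#[derive(Debug, Clone, PartialEq)]
pub struct ExecutedTipset {
    pub state_root: u64,
    pub receipt_root: u64,
}

/// Tiny LRU bounded by entry count (an assumption; not size-tracking).
pub struct LruCache {
    capacity: usize,
    map: HashMap<TipsetKey, ExecutedTipset>,
    order: VecDeque<TipsetKey>, // front = least recently used
}

impl LruCache {
    pub fn new(capacity: usize) -> Self {
        LruCache { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    /// Mark `key` as most recently used.
    fn touch(&mut self, key: TipsetKey) {
        self.order.retain(|k| *k != key);
        self.order.push_back(key);
    }

    pub fn get_cloned(&mut self, key: TipsetKey) -> Option<ExecutedTipset> {
        let hit = self.map.get(&key).cloned();
        if hit.is_some() {
            self.touch(key);
        }
        hit
    }

    pub fn push(&mut self, key: TipsetKey, value: ExecutedTipset) {
        if self.capacity == 0 {
            return;
        }
        self.map.insert(key, value);
        self.touch(key);
        // Evict least recently used entries until within capacity.
        while self.map.len() > self.capacity {
            match self.order.pop_front() {
                Some(oldest) => {
                    self.map.remove(&oldest);
                }
                None => break,
            }
        }
    }
}

/// Cache-aside load: return a hit, otherwise run `slow_load` and remember the result.
pub fn load_executed_tipset<F>(cache: &mut LruCache, key: TipsetKey, slow_load: F) -> ExecutedTipset
where
    F: FnOnce(TipsetKey) -> ExecutedTipset,
{
    if let Some(hit) = cache.get_cloned(key) {
        return hit;
    }
    let loaded = slow_load(key);
    cache.push(key, loaded.clone());
    loaded
}
```

Calling `load_executed_tipset` twice with the same key runs `slow_load` only on the first call, which is the effect the benchmark above targets for hot queries.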

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • akaladarshi
  • sudo-shashank
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding an LRU cache for load_executed_tipset to improve performance of hot queries. |
| Linked Issues check | ✅ Passed | The PR successfully implements caching for load_executed_tipset to address the performance bottleneck in eth_getBlockReceipts reported in #5632, with benchmark improvements demonstrated. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing the LRU cache solution: cache infrastructure, ExecutedTipset/ExecutedMessage public types, event handling in apply_block_messages, and supporting utilities. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@hanabi1224 hanabi1224 marked this pull request as ready for review March 18, 2026 14:02
@hanabi1224 hanabi1224 requested a review from a team as a code owner March 18, 2026 14:02
@hanabi1224 hanabi1224 requested review from akaladarshi and sudo-shashank and removed request for a team March 18, 2026 14:02
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/state_manager/mod.rs (1)

553-572: Consider restoring a no-events fast path for callers that don't consume events.

load_executed_tipset_inner now always attempts event AMT loading when events_root exists. Callers like block construction and fee history only need message/receipt data, so this adds avoidable I/O and heap pressure.
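
The fast path suggested here could look roughly like the following sketch. All names are hypothetical simplifications: `Receipt`, `Event`, `get_events`, and `events_for` stand in for the real receipt type, `StampedEvent`, `StampedEvent::get_events`, and the events branch of `load_executed_tipset_inner`; the recompute fallback is omitted.

```rust
/// Hypothetical simplified receipt; `events_root` stands in for an optional Cid.
pub struct Receipt {
    pub events_root: Option<u64>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Event(pub u64);

/// Hypothetical events lookup, standing in for `StampedEvent::get_events`.
fn get_events(events_root: u64) -> Result<Vec<Event>, String> {
    Ok(vec![Event(events_root)])
}

/// Load events for a receipt only when the caller asks for them.
pub fn events_for(receipt: &Receipt, load_events: bool) -> Result<Option<Vec<Event>>, String> {
    if !load_events {
        // Fast path: skip the events lookup entirely.
        return Ok(None);
    }
    match receipt.events_root {
        Some(root) => get_events(root).map(Some),
        None => Ok(None),
    }
}
```

One design consequence if such a flag is combined with a cache: an entry loaded without events cannot serve a caller that needs them unless the flag is part of the cache key or the entry is reloaded.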

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/state_manager/mod.rs` around lines 553 - 572, The current code in
load_executed_tipset_inner always loads the events AMT when
receipt.events_root() is Some, causing unnecessary I/O/alloc for callers that
don't need events; modify load_executed_tipset_inner (and its callers) to accept
a boolean flag (e.g., load_events or need_events) and short-circuit the events
branch when false: if load_events is false, set events = None without calling
StampedEvent::get_events or triggering compute_tipset_state; otherwise keep the
existing logic using receipt.events_root(), StampedEvent::get_events,
recomputed, and compute_tipset_state with msg_ts/NO_CALLBACK/VMTrace::NotTraced.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/shim/executor.rs`:
- Around line 82-84: The current #[derive(Deserialize)] on the enum Receipt
makes untagged deserialization ambiguous and allows direct deserialization that
can silently drop fields (e.g., V3 with events_root: null matching V2); remove
Deserialize from the derive list on Receipt (leave Clone, Debug, Serialize) and
add a short comment above Receipt pointing users to use get_parent_receipt() for
constructing Receipt variants to avoid ambiguous direct deserialization.

In `@src/state_manager/mod.rs`:
- Around line 496-511: The function-level static CACHE (SizeTrackingLruCache) in
load_executed_tipset creates a process-global cache shared across all
StateManager instances and ChainStore backends; replace it with a
per-StateManager cache or include ChainStore identity in the cache key.
Concretely: remove the static CACHE, add a SizeTrackingLruCache<TipsetKey,
ExecutedTipset> field (e.g., cache) to the StateManager struct and initialize it
when StateManager is constructed, then change load_executed_tipset to call
self.cache.get_cloned(...) and self.cache.push(...); alternatively, if you must
keep a global cache, incorporate a stable ChainStore identifier (or pointer
fingerprint) into the cache key used by CACHE.get_cloned and CACHE.push so
entries are segregated by ChainStore. Ensure all references to CACHE in
load_executed_tipset and related helpers (e.g., load_executed_tipset_inner) are
updated to use the instance field or the augmented key.
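
The two alternatives in this comment can be sketched as follows. This is a simplified illustration with assumed names: `TipsetKey` and `ExecutedTipset` are stand-in type aliases, the cache is an unbounded `HashMap` behind a `Mutex` rather than a `SizeTrackingLruCache`, and `scoped_key` with its `store_id` parameter is hypothetical.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

pub type TipsetKey = u64; // hypothetical stand-in
pub type ExecutedTipset = String; // hypothetical stand-in

/// Option 1: each manager owns its cache, so instances never share entries.
pub struct StateManager {
    cache: Mutex<HashMap<TipsetKey, ExecutedTipset>>,
}

impl StateManager {
    pub fn new() -> Self {
        StateManager { cache: Mutex::new(HashMap::new()) }
    }

    pub fn load_executed_tipset<F>(&self, key: TipsetKey, load: F) -> ExecutedTipset
    where
        F: FnOnce() -> ExecutedTipset,
    {
        // The lock guard is dropped at the end of this statement.
        let hit = self.cache.lock().unwrap().get(&key).cloned();
        if let Some(value) = hit {
            return value;
        }
        // Load without holding the lock, then remember the result.
        let value = load();
        self.cache.lock().unwrap().insert(key, value.clone());
        value
    }
}

/// Option 2: if a process-wide cache is kept, scope keys by a stable store id.
pub type ScopedKey = (u64, TipsetKey);

pub fn scoped_key(store_id: u64, key: TipsetKey) -> ScopedKey {
    (store_id, key)
}
```

With option 1, two managers backed by different stores can load the same tipset key without seeing each other's entries; option 2 achieves the same separation inside a single shared map.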

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 4300eb72-d86a-49a8-ba79-044ab82d051c

📥 Commits

Reviewing files that changed from the base of the PR and between 37418a6 and 6296d16.

📒 Files selected for processing (4)
  • src/rpc/methods/eth.rs
  • src/shim/executor.rs
  • src/state_manager/mod.rs
  • src/utils/get_size/mod.rs

codecov bot commented Mar 18, 2026

Codecov Report

❌ Patch coverage is 71.62162% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.62%. Comparing base (cc2c2cc) to head (a12b2b2).
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/state_manager/mod.rs | 75.80% | 12 Missing and 3 partials ⚠️ |
| src/rpc/methods/eth.rs | 0.00% | 0 Missing and 3 partials ⚠️ |
| src/utils/get_size/mod.rs | 0.00% | 3 Missing ⚠️ |
Additional details and impacted files
| Files with missing lines | Coverage Δ |
|---|---|
| src/chain/store/index.rs | 87.11% <100.00%> (ø) |
| src/interpreter/vm.rs | 81.95% <100.00%> (+0.21%) ⬆️ |
| src/rpc/methods/eth.rs | 68.20% <0.00%> (-0.11%) ⬇️ |
| src/utils/get_size/mod.rs | 62.50% <0.00%> (-8.93%) ⬇️ |
| src/state_manager/mod.rs | 69.75% <75.80%> (-0.47%) ⬇️ |

... and 13 files with indirect coverage changes


Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/state_manager/mod.rs (1)

557-571: ⚠️ Potential issue | 🟡 Minor

Add context to event-load failures for diagnosability.

On Line 561 and Line 570, the propagated errors from StampedEvent::get_events lose tipset/root context, which makes storage/debug failures harder to triage.

Proposed patch
-                    match StampedEvent::get_events(self.cs.blockstore(), &events_root) {
+                    match StampedEvent::get_events(self.cs.blockstore(), &events_root)
+                        .with_context(|| {
+                            format!(
+                                "failed to load events (tipset={}, events_root={events_root})",
+                                msg_ts.key()
+                            )
+                        }) {
                         Ok(events) => events,
                         Err(e) if recomputed => return Err(e),
                         Err(_) => {
                             self.compute_tipset_state(
                                 msg_ts.clone(),
                                 NO_CALLBACK,
                                 VMTrace::NotTraced,
                             )
                             .await?;
                             recomputed = true;
-                            StampedEvent::get_events(self.cs.blockstore(), &events_root)?
+                            StampedEvent::get_events(self.cs.blockstore(), &events_root)
+                                .with_context(|| {
+                                    format!(
+                                        "failed to load events after recompute (tipset={}, events_root={events_root})",
+                                        msg_ts.key()
+                                    )
+                                })?
                         }
                     },

As per coding guidelines: “Use anyhow::Result<T> for most operations and add context with .context() when errors occur”.
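
The proposed patch relies on `with_context` from the anyhow crate. The same idea can be illustrated with the standard library alone; the sketch below swaps in `map_err` with a formatted `String` error, and `get_events` / `get_events_with_context` are hypothetical stand-ins rather than Forest functions.

```rust
/// Hypothetical stand-in for an events lookup that can fail.
fn get_events(events_root: u64) -> Result<Vec<u64>, String> {
    if events_root == 0 {
        Err("block not found".to_string())
    } else {
        Ok(vec![events_root])
    }
}

/// Wrap the low-level error with the tipset and root that were being loaded,
/// so a failure message says which lookup failed.
pub fn get_events_with_context(tipset: &str, events_root: u64) -> Result<Vec<u64>, String> {
    get_events(events_root).map_err(|e| {
        format!(
            "failed to load events (tipset={}, events_root={}): {}",
            tipset, events_root, e
        )
    })
}
```

The context message is built inside the closure, so the formatting cost is paid only on the error path, which is the same reason the patch uses `with_context` rather than an eagerly formatted `context`.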

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/state_manager/mod.rs` around lines 557 - 571, The calls to
StampedEvent::get_events are propagating raw errors without context; update both
call sites (the initial match arm and the fallback after compute_tipset_state)
to attach context via anyhow::Context (e.g., .context(...)) that includes the
tipset identifier (msg_ts or its tipset key) and the events_root value so
failures indicate which tipset/root failed; keep the recomputed logic intact,
but return errors with .context(...) instead of plain propagation for
StampedEvent::get_events in this function.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a9cb5363-c499-43a7-9848-80e5bd8d12bd

📥 Commits

Reviewing files that changed from the base of the PR and between 5fdd7b1 and eec3715.

📒 Files selected for processing (4)
  • src/chain/store/index.rs
  • src/interpreter/vm.rs
  • src/rpc/methods/eth.rs
  • src/state_manager/mod.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/rpc/methods/eth.rs

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
src/state_manager/mod.rs (1)

137-142: Minor documentation improvement.

The comment mentions state_root having no heap allocation but omits receipt_root, which is also a Cid with no heap allocation.

📝 Suggested comment improvement
 impl GetSize for ExecutedTipset {
     fn get_heap_size(&self) -> usize {
-        // state_root(Cid) has no heap allocation, so we only calculate the heap size of executed_messages
+        // state_root and receipt_root (both Cid) have no heap allocation, so we only calculate the heap size of executed_messages
         vec_heap_size_helper(&self.executed_messages)
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/state_manager/mod.rs` around lines 137 - 142, Update the comment in the
GetSize impl for ExecutedTipset (fn get_heap_size) to mention that both
state_root and receipt_root are Cid values with no heap allocation; keep the
rest of the behavior the same and continue to compute heap size only for
executed_messages (vec_heap_size_helper(&self.executed_messages)).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 5e006ea4-1917-491d-8afe-1c946e3196f6

📥 Commits

Reviewing files that changed from the base of the PR and between eec3715 and a1fb72e.

📒 Files selected for processing (1)
  • src/state_manager/mod.rs

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
src/state_manager/mod.rs (1)

133-134: Consider removing #[allow(dead_code)] when the field is used.

The receipt_root field is currently unused (#[allow(dead_code)]). If this is added for future use or API completeness, consider adding a brief doc comment explaining the intent, or remove the attribute once consumers start using it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/state_manager/mod.rs` around lines 133 - 134, The field `receipt_root:
Cid` in this module is annotated with #[allow(dead_code)]; either remove that
attribute if the field is now used downstream, or add a short doc comment on the
`receipt_root` field explaining its intended future use/API completeness (so the
attribute is justified); locate the declaration of `pub receipt_root: Cid` in
mod.rs and either delete the #[allow(dead_code)] attribute or replace it with a
doc comment like "/// Receipt Merkle root, kept for API completeness / future
use" to clarify intent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: bc9e17a9-8758-4045-a5f9-7c6305f86a78

📥 Commits

Reviewing files that changed from the base of the PR and between a1fb72e and a12b2b2.

📒 Files selected for processing (1)
  • src/state_manager/mod.rs

Labels

RPC requires calibnet RPC checks to run on CI

Development

Successfully merging this pull request may close these issues.

eth_getBlockReceipts: performance issue
