Skip to content

leveled_bookie:status/1#17

Merged
martinsumner merged 34 commits intoOpenRiak:openriak-3.4from
TI-Tokyo:tiot/openriak-3.4/leveled_bookie_status
Feb 12, 2026
Merged

leveled_bookie:status/1#17
martinsumner merged 34 commits intoOpenRiak:openriak-3.4from
TI-Tokyo:tiot/openriak-3.4/leveled_bookie_status

Conversation

@hmmr
Copy link

@hmmr hmmr commented Oct 28, 2025

This implements martinsumner#478.

When called here, it returns a map containing reportable items described in the issue above.

(dev2@127.0.0.1)12> riak_kv_vnode:vnode_status(riak_core_vnode_manager:all_index_pid(riak_kv_vnode)).
[{1347321821914426127719021955160323408745312813056,
  [{backend_status,riak_kv_leveled_backend,
                   #{ledger_cache_size => 4,
                     journal_last_compaction_result => {0,0.0},
                     get_sample_count => 0,get_body_time => 0,
                     head_sample_count => 0,head_rsp_time => 0,
                     put_sample_count => 0,put_prep_time => 0,put_ink_time => 0,
                     put_mem_time => 0}},
   {vnodeid,<<"}ÍyرYdå">>},
   {counter,140003},
   {counter_lease,150000},
   {counter_lease_size,10000},
   {counter_leasing,false}]},
...

Not all items can be available at the time of calling, in which case the backend_status will not include the corresponding keys.

In riak_control, the results will show as follows:
Screenshot_2025-10-29_01-11-07
Screenshot_2025-10-29_01-11-33

@hmmr
Copy link
Author

hmmr commented Oct 28, 2025

Marking as WIP pending work on the last item ("Mean level for recent fetches").

@hmmr hmmr changed the title WIP leveled_bookie:status/1 leveled_bookie:status/1 Oct 29, 2025
@martinsumner martinsumner moved this to Todo in OpenRiak 3.4.1 Jan 7, 2026
@martinsumner martinsumner moved this from Todo to Ready For Review in OpenRiak 3.4.1 Jan 7, 2026
@martinsumner martinsumner self-requested a review January 7, 2026 13:25
@martinsumner
Copy link

Can we add a test to the ct tests, or perhaps just extend one of the tests in basic_SUITE.

It would be good to verify that the results returned are as expected. Some may just need to be range checked, but you can get the PIDs of the Inker/Penciller via leveled_bookie:book_returnactors/1 to absolutely verify things like the penciller cache size.

Obviously fails the coverage check at the moment as book_status/1 is never called:

|------------------------|------------|
  |                module  |  coverage  |
  |------------------------|------------|
  |        leveled_bookie  |       99%  |
  |           leveled_cdb  |      100%  |
  |         leveled_codec  |      100%  |
  |        leveled_ebloom  |      100%  |
  |          leveled_eval  |      100%  |
  |        leveled_filter  |      100%  |
  |          leveled_head  |      100%  |
  |        leveled_iclerk  |      100%  |
  |     leveled_imanifest  |      100%  |
  |         leveled_inker  |      100%  |
  |           leveled_log  |      100%  |
  |       leveled_monitor  |       97%  |
  |        leveled_pclerk  |      100%  |
  |     leveled_penciller  |      100%  |
  |     leveled_pmanifest  |      100%  |
  |          leveled_pmem  |      100%  |
  |        leveled_runner  |      100%  |
  |         leveled_setop  |      100%  |
  |           leveled_sst  |      100%  |
  |      leveled_sstblock  |      100%  |
  |        leveled_tictac  |      100%  |
  |          leveled_tree  |      100%  |
  |          leveled_util  |      100%  |
  |------------------------|------------|
  |                 total  |       99%  |
  |------------------------|------------|

There are tests in basic_SUITE that prompt journal compaction, so you cna follow the pattern to check results are expected after a compaction.

@hmmr hmmr changed the title leveled_bookie:status/1 WIP leveled_bookie:status/1 Jan 15, 2026
@martinsumner
Copy link

The PUT will call leveled_monitor:maybe_time/2 - so sometimes it will be timed, and sometimes not. So I'm not sure that the test will pass reliably.

I do think it should try a number of operations, so that we can check all the values change to the right order of magnitude. There are a lot of load functions, or you can add a fetch and validation of the book status into one of the other tests.

Also, what happened to reporting the journal compaction scores and results?

@martinsumner
Copy link

One other thing to note, which came up in a thread about stats - is the riak admin vnode-status command.

I think it would be worth looking at the formatting of the output of this CLI command as part of this. There's an open issue from the basho days: basho/riak_kv#343

As we have the OTP json module available, perhaps we could fix this by just pretty-printing json, so that we don't end up coupling some presentation logic in the CLI to the format of the status output.

@martinsumner
Copy link

I think you still need to increment on open_reader. I expected you needed an additional increment on cdb_roll, not to remove the open_reader counts.

Otherwise, when a store is restarted, it will simply not count all the old cdb files it starts as read-only. Also, it will not count new files created by compaction.

@martinsumner martinsumner moved this from Ready For Review to Done in OpenRiak 3.4.1 Feb 4, 2026
@martinsumner martinsumner moved this from Done to Review In Progress in OpenRiak 3.4.1 Feb 4, 2026
@hmmr hmmr force-pushed the tiot/openriak-3.4/leveled_bookie_status branch from dd7eeff to 0bf6905 Compare February 5, 2026 01:27
@martinsumner
Copy link

martinsumner commented Feb 10, 2026

Some comments on some of the stats:

  • recent_putgethead_counts; doesn't seem to be connected (i.e. can't be updated).
  • avg_compaction_score; doesn't seem to be connected (i.e. can't be updated).
    • Would be preferable I think to avg_compaction_score_sample, as avg_compaction_score_sample is not a sample in that it is not a random slice of the journal scores (the scores are in-order so it will always show one end of the Journal only).
    • Would be better to have avg_compaction_score and max_compaction_score, drop the sample scores.
  • penciller_work_backlog_status - I think this probably needs to be updated in all three case statement in leveled_penciller not just the final one. The first case would be an update of {0, false, false}, the second {WC, false, true} and the third case is OK as-is.
  • journal_last_compaction_time/penciller_last_merge_time - would it be better to present this in a human readable format? Especially if we're to sue this more generally other than for riak_control.

@hmmr hmmr force-pushed the tiot/openriak-3.4/leveled_bookie_status branch from 9aed1a6 to b2915f5 Compare February 11, 2026 02:44
@hmmr
Copy link
Author

hmmr commented Feb 11, 2026

I have

  • removed recent_putgethead_counts, which was a resurfaced leftover from early work (those counts exist as individual items).
  • did the suggested to penciller_work_backlog_status, which will provide it with a more meaningful value than undefined.
  • on compaction score, I am now reporting min_ and max_compaction_score, calculated on every avg_compaction_score_update; the sample is no longer included in the report.
  • I would still keep the various times in the report as unixtime milliseconds, and only convert at the sites where they are to be presented, i.e., in Elm code in riak_control, and in the json printed by riak admin vnode-status (in the works).

Andriy Zavada and others added 2 commits February 11, 2026 11:21
Gather journal last compaction stats and update in one call.  Allows for scoring of all files (not just a sample).

Ensure coverage of a status check on a snapshot bookie.

Resolve eqwalizer issue with merge of a map with undefined (the level_files_count defaults to an empty map and so will be present and cannot be undefined)
cast, cdb_roll, State = #state{last_position = LP}
) when
?IS_DEF(LP)
->

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a CDB file is rolled - it should also +1 to the n_active_journal_files_update

@martinsumner martinsumner merged commit 37db2f2 into OpenRiak:openriak-3.4 Feb 12, 2026
2 checks passed
@github-project-automation github-project-automation bot moved this from Review In Progress to Done in OpenRiak 3.4.1 Feb 12, 2026
@hmmr hmmr deleted the tiot/openriak-3.4/leveled_bookie_status branch February 12, 2026 12:36
martinsumner added a commit that referenced this pull request Feb 12, 2026
Return a map of status information about the bookie (when the bookie is not a snapshot).

The content of the status may change in future releases.

---------

Co-authored-by: Andriy Zavada <andriy.zavada@tiot.jp>
Co-authored-by: Martin Sumner <martin.sumner@adaptip.co.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants