MDEV-39600 Reduce contention in buf_flush_ahead()#5078
Conversation
|
|
There was a problem hiding this comment.
Pull request overview
This PR reduces contention in buf_flush_ahead() by making the non-furious async flush target update lock-free while preserving the mutex-protected furious checkpoint path.
Changes:
- Splits page cleaner idle state into a separate relaxed atomic flag.
- Changes non-furious
buf_flush_ahead()to updatebuf_flush_async_lsnvia CAS. - Adds
compare_exchange_weak()toAtomic_relaxed.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
storage/innobase/include/buf0buf.h |
Separates page cleaner idle state from page_cleaner_status. |
storage/innobase/buf/buf0flu.cc |
Implements lock-free async LSN bumping and updates page-cleaner wakeup handling. |
include/my_atomic_wrapper.h |
Adds weak compare-exchange support to the atomic wrapper. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Code Review
This pull request refactors the page cleaner's idle state management by replacing a bitmask flag with an atomic boolean and optimizes the buf_flush_ahead function using a lock-free CAS loop for non-furious LSN updates. These changes aim to reduce mutex contention by allowing lock-free idle status checks. However, the refactoring introduced a critical compilation error because the wake label was removed while goto wake statements remain in the code. The feedback highlights the need to restore this label to maintain the shared signaling logic.
When calling buf_flush_ahead() with furious=false, the function updates the async target buf_flush_async_lsn and, if necessary notifies the page cleaner thread to start flushing more eagerly, to avoid a long stall later when checkpoint will be required. When many threads call buf_flush_ahead() with furious=false, if the update of buf_flush_async_lsn is gated inside buf_pool.flush_list_mutex, this can cause significant contention, just to deliver the new buf_flush_async_lsn value and a potential notification to the page cleaner thread. This commit enables, for the non furious case, lock-free update of buf_flush_async_lsn and lock-free check of the page cleaner idle flag, to avoid the contention on buf_pool.flush_list_mutex. The lock is acquired by the CAS-loop winner thread and only after the page cleaner is observed idle. The cleaner clears buf_flush_async_lsn via CAS, so a concurrent lock-free bump is preserved across the clear. Since the reads are not protected by the mutex, there is a chance that the decision to not signal the page cleaner thread is wrong. In the worst case though, under sustained redo-log pressure, eventually a wakeup signal will be delivered to the page cleaner thread anyway, so the risk of a long stall is limited. Similar reasoning applies to the possibly wrong decision taken by the page cleaner thread to go to idle: eventually it will be woken up.
Uh oh!
There was an error while loading. Please reload this page.