Don't persist inbound committed onions in prod#4599
Don't persist inbound committed onions in prod#4599valentinewallace wants to merge 1 commit intolightningdevkit:mainfrom
Conversation
|
👋 Thanks for assigning @joostjager as a reviewer! |
|
After thorough re-examination, I now realize my prior critical bug report was incorrect. Here's why: The variable Review SummaryCorrection to prior review: The critical deserialization bug I flagged previously is a false positive. The No new issues found. The PR correctly:
|
A few PRs ago, we started persisting inbound committed HTLC onions in Channels. These onions were persisted to lay groundwork for reconstructing the ChannelManager's pending HTLC maps from them in a future version. However, we've since discovered a different direction where we can instead reconstruct these same maps using persistent monitor events, which may mean that we don't need to persist these onions. Since persisting a bunch of onions on every manager write is a big commitment that we're not fully confident in yet, switch it off for now until we either confirm the monitor events direction and can delete all this onion persisting code OR realize that we definitely do need it.
c50518a to
d68dbf8
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4599 +/- ##
==========================================
+ Coverage 86.22% 86.28% +0.05%
==========================================
Files 159 159
Lines 109170 109269 +99
Branches 109170 109269 +99
==========================================
+ Hits 94136 94280 +144
+ Misses 12424 12379 -45
Partials 2610 2610
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
I think we should consider rolling back all the behavioral parts of #4227 rather than only gating writing in prod. With this PR, prod stops writing the inbound onion data, but the version/reconstruction machinery remains. That leaves SERIALIZATION_VERSION == 2 effectively reserved for “reconstruct HTLC maps from channel data.” If any released version already interprets version 2 that way, we cannot safely reuse version 2 for an unrelated serialization bump later. Older readers would start reconstructing.
The standalone cleanups/refactors from the original PR that are useful on their own can be kept ofc. If we decide to pick up the inbound-onion reconstruction path later, reverting the revert should be straightforward, and in the meantime we avoid carrying extra confusing dormant code in an already subtle area.
As mentioned in the PR description and offline, the monitor events approach is still in early enough stages that I'm not really confident at all that we don't need the dormant code. As long as we don't persist anything in prod, we always retain the option to cleanly remove the it later. So my preference would be to make at least a bit more progress on monitor events first, and I'm happy to prioritize removing the dormant code if/as soon as it makes more sense. Also, even if we do decide to go in that direction ~immediately, I still would prefer to land this PR first to minimize 0.3 blockage. W/r/t reserving the serialization version, similar to previous discussions I don't see it as a huge issue since we've never bumped it in the past, but I'm also fine with hardcoding it to u8::max for now to sidestep those concerns. Or maybe go with the TLV flag approach, I forget if there was some reason not to do that. |
joostjager
left a comment
There was a problem hiding this comment.
I understood that it is not clear yet that the dormant code is definitely unnecessary. My point was more that revert now, then revert-the-revert later if we decide to pick this path back up, seems like a reasonable option too. But I do not feel strongly enough to block on that if you would rather keep the code around for now and remove it as soon as the monitor-events direction is clearer.
In that case, I do think we should fully disable reconstruction for now. I know we do not currently expect unrelated manager serialization version bumps, but while prod intentionally is not writing the inbound onion data, reserving version 2 for reconstruction feels like a needless constraint.
A few PRs ago, we started persisting inbound committed HTLC onions in Channels. These onions were persisted to lay groundwork for reconstructing the
ChannelManager's pending HTLC maps from them in a future version. However, we've since discovered a different direction where we can instead reconstruct these same maps using persistent monitor events, which may mean that we don't need to persist these onions.Since persisting a bunch of onions on every manager write is a big commitment that we're not fully confident in yet, switch it off for now until we either confirm the monitor events direction and can delete all this onion persisting code OR realize that we definitely do need it.