fix: pre-populate rule cache on first NodeReconciler pass to prevent bootstrap-only taint bypass#254
Conversation
✅ Deploy Preview for node-readiness-controller canceled.
|
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Karman580 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @Karman580. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
|
…bootstrap-only taint bypass RuleReadinessController maintains an in-memory ruleCache populated lazily as RuleReconciler processes each NodeReadinessRule. NodeReconciler derives applicable rules exclusively from this cache. Both controllers start consuming their work queues concurrently after informer sync, creating a race where node events are processed against an empty or partially-warm cache. For continuous enforcement this window is self-healing. For bootstrap-only rules it is permanent: if a node is evaluated before its applicable rule reaches the cache the taint is never applied, the bootstrap completion annotation is never written, and the admission gate is silently bypassed. Add ensureCacheWarmed to RuleReadinessController. On the first call it lists all NodeReadinessRules from the already-synced informer cache and adds any rule not yet present in ruleCache. Subsequent calls are no-ops guarded by an atomic.Bool. Existing cache entries (including rules already marked for deletion by RuleReconciler) are intentionally preserved so that in-flight state is not overwritten. Call ensureCacheWarmed at the top of NodeReconciler.Reconcile before any rule evaluation takes place. Add two integration tests covering the cold-start scenario: - verifies that a bootstrap-only taint is applied even when ruleCache is empty at reconcile time (rule discovered via warm-up) - verifies that the warm-up is not repeated on subsequent reconciles
f9aba8b to
5c83d8f
Compare
|
recheck |
|
Thanks again for writing the gist, @Karman580. I tried to reproduce it locally. You' are using what you are actually seeing with your repro steps is a different scenario (the default case (case 4) in the switch block) - IMO, a better reproduction setup would be to use a custom node status condition that kind does not satisfy by default, so, we can actually observe the stated race condition. I did try do this ^ locally. While the PR change address the one scenario you described (ie. on a controller restart, (Node Controller) Reconciler will first seed the in-memory rule cache from the informer, before it proceeds with further evaluation on the node, but we still need to address the following gap (which remains regardless of the cache is warm or not) bootstrap annotation is only added in case 1 of the switch block In case of a node which is already healthy on first evaluation (the case that your gist is actually testing with And what happens then is if that node later has a transient condition failure (e.g. a kubelet restart or network interruption that flips the condition to unknown/False), the controller has no record of its bootstrap stage being completed and it will re-apply the taint and block any workload. I think we should discuss it in the upcoming project call about whether we should design the default case ( cc: @ajaysundark |
|
Thanks for the detailed review and for reproducing it locally. You're right the current repro using Ready=True overlaps with the default healthy-node path rather than isolating the cache timing race itself. I also see the larger bootstrap annotation gap in the shouldRemoveTaint=true + currentlyHasTaint=false case that you pointed out. I’ll revisit the reproduction using a custom condition to isolate the cache behavior more clearly, and I agree the bootstrap annotation semantics are probably better discussed separately before I iterate further on the PR. |
Description
RuleReadinessControllermaintains an in-memoryruleCachepopulated lazily asRuleReconcilerprocesses eachNodeReadinessRule.NodeReconcilerderives applicable rules exclusively from this cache. Both controllers start consuming their work queues concurrently after informer sync, creating a race where node events are processed against an empty or partially-warm cache.For
continuousenforcement this window is self-healing — the next condition, taint, or label change re-triggers evaluation and the controller converges. Forbootstrap-onlyrules it is a permanent correctness gap: if a node is evaluated before its applicable rule reaches the cache, the taint is never applied,markBootstrapCompletedis never reached, the bootstrap completion annotation is never written, and the intended admission gate is silently bypassed with no error surfaced.Root Cause
ensureCacheWarmeddid not exist. TheruleCachewas built entirely as a side effect ofRuleReconciler.Reconcile(). Although the controller-runtime manager guarantees informers are synced before controllers start, there was no mechanism to seedruleCachefrom the already-available informer data beforeNodeReconcilerbegan processing events.Fix
Add
ensureCacheWarmedtoRuleReadinessController. On the first call it lists allNodeReadinessRulesfrom the already-synced informer cache (no live API round-trip) and adds any rule not yet present inruleCache. Subsequent calls are no-ops guarded by anatomic.Bool. Existing cache entries — including rules already marked for deletion byRuleReconciler— are intentionally preserved so that in-flight state is not overwritten.ensureCacheWarmedis called at the top ofNodeReconciler.Reconcilebefore any rule evaluation takes place.Related Issue
Fixes #253
Type of Change
/kind bug
Testing
cold-start rule cache warm-upintegration test context innode_controller_test.gowith two cases:ruleCacheis empty at reconcile time (the rule is discovered via warm-up).cacheWarmedremains true).internal/controller73.5% → 73.9%.Checklist
make testpassesmake lintpassesDoes this PR introduce a user-facing change?