📝 Walkthrough

This PR adds a complete quota management system to Cortex: a new ProjectQuota Kubernetes CRD, an HTTP quota API endpoint, and a quota controller that tracks per-project VM resource usage and computes pay-as-you-go quotas by deducting committed resources. The implementation includes periodic full reconciliation, watch-based reconciliation on spec changes and CR usage updates, and incremental hypervisor-diff reconciliation for VM add/remove events.

Changes: Project Quota Management System
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

✅ Passed checks (2 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
follow up of #783
Actionable comments posted: 8
🧹 Nitpick comments (3)

internal/scheduling/reservations/commitments/api/quota_monitor.go (1)

29-33: ⚡ Quick win: Pre-initialize the 503 series too.

`HandleQuota` has an explicit service-disabled path, so if we're pre-creating the common status-code series here, `503` should be included as well. Otherwise dashboards/alerts won't see that label until the first disabled request happens.

Suggested diff:

```diff
-	for _, statusCode := range []string{"204", "400", "405", "500"} {
+	for _, statusCode := range []string{"204", "400", "405", "500", "503"} {
 		m.requestCounter.WithLabelValues(statusCode)
 		m.requestDuration.WithLabelValues(statusCode)
 	}
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/scheduling/reservations/commitments/api/quota_monitor.go` around lines 29-33: pre-create the 503 status series alongside the others so dashboards/alerts see service-disabled metrics before the first disabled request; update the loop that pre-initializes status codes (the array passed to the for loop that calls m.requestCounter.WithLabelValues and m.requestDuration.WithLabelValues) to include "503" as well. This affects the same block used by HandleQuota's service-disabled path and ensures the "503" label exists from startup.

internal/scheduling/reservations/failover/vm_source_test.go (1)
403-405: ⚡ Quick win: Add unit cases for the new deleted-server paths.

This mock change only restores interface compliance; `IsServerActive`, `GetDeletedVMInfo`, and `CreatedAt` propagation in `DBVMSource` still have no assertions here. A few table-driven cases for `DELETED`, `nil`, deleted-server flavor lookup, and `CreatedAt` would lock down the new quota path.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/scheduling/reservations/failover/vm_source_test.go` around lines 403-405: add table-driven unit tests exercising the new deleted-server code paths. Update tests around mockNovaReader.GetDeletedServerByID and DBVMSource to include cases for (1) a DELETED deleted-server result, (2) GetDeletedServerByID returning nil, (3) a deleted server that requires flavor lookup (ensure flavor ID/name resolution happens), and (4) propagation of CreatedAt from the DeletedServer into the DB VM. For each case assert DBVMSource behavior and side effects by checking that IsServerActive and GetDeletedVMInfo are invoked as expected and that the resulting VM record's CreatedAt (and quota decision path) matches the deleted-server input.
352-354: ⚡ Quick win: Use `testlib.Ptr` instead of a local pointer helper.

`boolPtr()` duplicates the shared pointer helper pattern and drifts from the repo's test convention. As per coding guidelines (`**/*_test.go`): use `testlib.Ptr` for test cases that require pointer values.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/scheduling/reservations/commitments/api/quota_test.go` around lines 352-354: the helper function boolPtr(b bool) duplicates a shared pointer helper; remove boolPtr and replace its usages with the shared test helper testlib.Ptr(b) (update imports to include testlib if missing) so tests follow the repo convention. Search for the function name boolPtr, replace calls like boolPtr(true) / boolPtr(false) with testlib.Ptr(true) / testlib.Ptr(false), and delete the boolPtr function definition.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/settings.local.json:
- Around line 1-8: The permissions block in settings.local.json is too
broad—replace the Read(//root/**) rule with least-privilege paths that the PR
actually needs (e.g., Read(//root/path/to/needed/dir) or explicit filenames) and
remove any wildcard that grants repository‑wide secret access; ensure the
"permissions" object only contains allowed actions for specific resources, and
make sure settings.local.json is listed in .gitignore or otherwise prevented
from being committed/used in CI/shared environments so the local-only settings
aren’t relied upon in shared workflows.
In `@internal/scheduling/reservations/commitments/api/quota.go`:
- Around line 72-84: The code currently allows creating/updating a ProjectQuota
with an empty DomainID when req.ProjectMetadata is absent; change the logic in
the block around req.ProjectMetadata.Unpack() so that if Unpack() returns false
you immediately reject the request with api.quotaError(...,
http.StatusBadRequest, "missing required domainID in project metadata",
startTime), and also validate that after unpack the derived domainID (variable
domainID) is non-empty—if it's empty return a 400 similarly; keep the existing
UUID consistency check and use the same api.quotaError pattern for these new
validation failures.
- Around line 111-160: The current Get→Create/Update flow for ProjectQuota
(crdName via projectQuotaCRDName) races with concurrent reconciles and can fail
with update conflicts; modify the write path to be conflict-safe by wrapping the
update logic in a retry-on-conflict loop (e.g., retry.RetryOnConflict) or by
using controllerutil.CreateOrUpdate on v1alpha1.ProjectQuota to compute and
apply spec changes atomically; ensure you preserve setting Spec.Quota,
Spec.ProjectName, Spec.DomainID and Spec.DomainName and log/create errors as
before but retry on conflicts instead of returning a 409/500 immediately.
In `@internal/scheduling/reservations/commitments/config.go`:
- Around line 61-64: The bool field EnableQuotaAPI in the Config struct
currently defaults to true only in DefaultConfig() but remains false when config
is unmarshaled into a zero-value Config and then passed through ApplyDefaults();
to fix this either (A) change the field to a pointer (*bool EnableQuotaAPI) and
update ApplyDefaults() to set it to pointer(true) when nil, or (B) ensure
decoding/UNMARSHAL logic constructs config by starting from DefaultConfig() and
then overlays file/flags onto it before calling ApplyDefaults(); update
ApplyDefaults(), DefaultConfig(), and any decoding flow that constructs Config
(referencing EnableQuotaAPI, ApplyDefaults, DefaultConfig) so the effective
default is unambiguous and tests reflect the chosen approach.
In `@internal/scheduling/reservations/failover/vm_source.go`:
- Around line 29-31: The VM CreatedAt field is not propagated when VMs are
rebuilt via buildVMsFromHypervisors called by ListVMsOnHypervisors(..., true),
causing callers to see an empty timestamp; update buildVMsFromHypervisors to
copy pgVM.CreatedAt into the new VM struct (and any other rebuild paths that
construct VM instances from pgVM) so CreatedAt flows through the
hypervisor-trust path; locate uses of pgVM and the VM constructor/assignment in
buildVMsFromHypervisors and add copying of the CreatedAt field (and mirror the
same change in the other rebuild block referenced around the same area).
In `@internal/scheduling/reservations/quota/controller.go`:
- Around line 396-427: The code incorrectly uses
ProjectQuota.Status.LastReconcileAt as the "last full reconcile" watermark;
introduce and use a dedicated field (e.g.
ProjectQuota.Status.LastFullReconcileAt) that is only updated by the full
periodic reconcile, update isVMNewSinceLastReconcile to compare VM creation time
against LastFullReconcileAt instead of LastReconcileAt, and change the PAYG-only
path and applyDeltaAndUpdateStatus() so they no longer overwrite
LastFullReconcileAt (they may continue to update LastReconcileAt for any status
change). Ensure the new field is set only in the full reconcile code path and
that isVMNewSinceLastReconcile references that new field when deciding whether a
VM is new.
- Around line 554-568: The ProjectQuota watch in SetupWithManager currently has
no predicate and thus enqueues on status-only updates; add a predicate (e.g.
projectQuotaSpecChangePredicate) to the For(&v1alpha1.ProjectQuota{}) watch that
filters out status-only changes by returning true only when Generation or Spec
changed (similar logic to crUsedAmountChangePredicate()). Implement
projectQuotaSpecChangePredicate() to compare old and new objects and only allow
events where old.GetGeneration() != new.GetGeneration() or a deep-equal check of
Spec differs, then pass it into builder.WithPredicates(...) for the ProjectQuota
watch in SetupWithManager.
In `@internal/scheduling/reservations/quota/metrics.go`:
- Around line 71-77: In QuotaMetrics.RecordUsage, avoid leaking metric label
series by deleting label values when a usage is removed: if totalUsage,
paygUsage and crUsage are all zero (or otherwise indicate the
resource/project/AZ was removed), call
m.totalUsageGauge.DeleteLabelValues(projectID, az, resource),
m.paygUsageGauge.DeleteLabelValues(...), and
m.crUsageGauge.DeleteLabelValues(...) instead of Set; otherwise continue to Set
the gauges as currently implemented. Ensure you reference the existing symbols
RecordUsage, totalUsageGauge, paygUsageGauge, crUsageGauge and use
DeleteLabelValues to prevent unbounded cardinality growth.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f410d3c9-ddee-4e79-9ca4-43f2b4447fd6
📒 Files selected for processing (22)

- .claude/settings.local.json
- api/v1alpha1/project_quota_types.go
- api/v1alpha1/zz_generated.deepcopy.go
- helm/library/cortex/files/crds/cortex.cloud_projectquotas.yaml
- internal/scheduling/external/nova.go
- internal/scheduling/reservations/commitments/api/handler.go
- internal/scheduling/reservations/commitments/api/info.go
- internal/scheduling/reservations/commitments/api/info_test.go
- internal/scheduling/reservations/commitments/api/quota.go
- internal/scheduling/reservations/commitments/api/quota_monitor.go
- internal/scheduling/reservations/commitments/api/quota_test.go
- internal/scheduling/reservations/commitments/config.go
- internal/scheduling/reservations/commitments/usage.go
- internal/scheduling/reservations/failover/integration_test.go
- internal/scheduling/reservations/failover/vm_source.go
- internal/scheduling/reservations/failover/vm_source_test.go
- internal/scheduling/reservations/quota/config.go
- internal/scheduling/reservations/quota/context.go
- internal/scheduling/reservations/quota/controller.go
- internal/scheduling/reservations/quota/controller_test.go
- internal/scheduling/reservations/quota/integration_test.go
- internal/scheduling/reservations/quota/metrics.go
```go
func (m *QuotaMetrics) RecordUsage(projectID, az, resource string, totalUsage, paygUsage, crUsage int64) {
	if m == nil {
		return
	}
	m.totalUsageGauge.WithLabelValues(projectID, az, resource).Set(float64(totalUsage))
	m.paygUsageGauge.WithLabelValues(projectID, az, resource).Set(float64(paygUsage))
	m.crUsageGauge.WithLabelValues(projectID, az, resource).Set(float64(crUsage))
```
🧩 Analysis chain

🏁 Scripts executed (repository: cobaltcore-dev/cortex):

```shell
# First, locate and read the metrics.go file
fd metrics.go internal/scheduling
wc -l internal/scheduling/reservations/quota/metrics.go
cat -n internal/scheduling/reservations/quota/metrics.go

# Search for any deletion or cleanup logic in the quota module
rg "DeleteLabelValues|Delete\(|Reset\(" internal/scheduling/reservations/quota/

# Search for calls to RecordUsage to understand the lifecycle and context
rg "RecordUsage" internal/scheduling/reservations/quota/
rg -B 10 -A 10 "c\.Metrics\.RecordUsage" internal/scheduling/reservations/quota/controller.go

# Check if there are any other metric management methods in related quota files
fd "\.go$" internal/scheduling/reservations/quota/ --exec grep -l "GaugeVec\|Delete" {}

# Get more context on the overall structure of the controller
wc -l internal/scheduling/reservations/quota/controller.go

# Search for any cleanup or eviction logic in the controller
rg "evict|cleanup|remove.*label|forget" internal/scheduling/reservations/quota/controller.go

# Find where these usage maps are built and populated
rg -B 5 "for resourceName, total := range totalUsage" internal/scheduling/reservations/quota/controller.go
rg "totalUsage\s*:=|totalUsage\s*=" internal/scheduling/reservations/quota/controller.go | head -20

# Check if there's any cleanup logic for deleted projects
rg "reconcil|Reconcil" internal/scheduling/reservations/quota/controller.go | head -15

# Search for DeleteLabelValues patterns elsewhere to see if this is handled
rg "DeleteLabelValues" internal/scheduling/

# Check other metric files in the codebase for cardinality management patterns
fd "metrics\.go$" internal/scheduling/ --exec grep -l "GaugeVec" {}
cat -n internal/scheduling/reservations/commitments/api/change_commitments_metrics.go | head -100
```
Add cleanup for removed usage label sets.
These gauges only ever Set() values. When a project/resource/AZ disappears, the old series is never deleted, so Prometheus will keep exporting stale usage and the project_id cardinality will grow monotonically over time. Add DeleteLabelValues() calls when projects or resources are removed to prevent unbounded metric cardinality growth.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/scheduling/reservations/quota/metrics.go` around lines 71 - 77, In
QuotaMetrics.RecordUsage, avoid leaking metric label series by deleting label
values when a usage is removed: if totalUsage, paygUsage and crUsage are all
zero (or otherwise indicate the resource/project/AZ was removed), call
m.totalUsageGauge.DeleteLabelValues(projectID, az, resource),
m.paygUsageGauge.DeleteLabelValues(...), and
m.crUsageGauge.DeleteLabelValues(...) instead of Set; otherwise continue to Set
the gauges as currently implemented. Ensure you reference the existing symbols
RecordUsage, totalUsageGauge, paygUsageGauge, crUsageGauge and use
DeleteLabelValues to prevent unbounded cardinality growth.
…OnConflict in API, generation predicate

- Add LastFullReconcileAt to ProjectQuotaStatus (separate watermark for incremental add detection, not advanced by PaygUsage-only recomputes)
- isVMNewSinceLastReconcile now uses LastFullReconcileAt instead of LastReconcileAt to avoid false skips after CR-triggered recomputes
- Add domainID validation in HandleQuota (400 if missing)
- Wrap HandleQuota create/update in RetryOnConflict to handle concurrent status updates from the quota controller
- Add projectQuotaGenerationChangePredicate (GenerationChangedPredicate) on the ProjectQuota For() watch to prevent infinite reconcile loops from status-only updates
- Add CreatedAt field to VM struct, populated in buildVMsFromHypervisors

Ref: BLI #376
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cmd/manager/main.go`:
- Around line 698-703: The quota controller reads failover.FailoverConfig and
validates DatasourceName before defaults are applied; fix by retrieving the
config into a variable via conf.GetConfigOrDie[failover.FailoverConfig](), call
its ApplyDefaults() method on that instance, then extract DatasourceName (e.g.
replace the direct call with: cfg :=
conf.GetConfigOrDie[failover.FailoverConfig](); cfg.ApplyDefaults();
datasourceName := cfg.DatasourceName) and keep the existing validation/logging
logic for quota controller.
In `@helm/library/cortex/files/crds/cortex.cloud_projectquotas.yaml`:
- Around line 115-118: The CRD currently omits spec.quota from the required list
allowing ProjectQuota instances without a quota payload; update the CRD schema
by adding "quota" to the spec.required array so spec.quota is mandatory,
ensuring the ProjectQuota resource enforces the API contract (verify the change
near the CRD's spec.required/type object block and the ProjectQuota schema
definition).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 431ec4e4-123a-4221-afdb-1dbdd68d1907
📒 Files selected for processing (12)

- api/v1alpha1/project_quota_types.go
- api/v1alpha1/zz_generated.deepcopy.go
- cmd/manager/main.go
- helm/bundles/cortex-nova/values.yaml
- helm/library/cortex/files/crds/cortex.cloud_projectquotas.yaml
- helm/library/cortex/templates/rbac/role.yaml
- internal/scheduling/reservations/commitments/api/quota.go
- internal/scheduling/reservations/commitments/api/quota_test.go
- internal/scheduling/reservations/failover/vm_source.go
- internal/scheduling/reservations/quota/controller.go
- internal/scheduling/reservations/quota/controller_test.go
- internal/scheduling/reservations/quota/integration_test.go
✅ Files skipped from review due to trivial changes (2)
- helm/library/cortex/templates/rbac/role.yaml
- helm/bundles/cortex-nova/values.yaml
🚧 Files skipped from review as they are similar to previous changes (6)
- internal/scheduling/reservations/commitments/api/quota_test.go
- internal/scheduling/reservations/commitments/api/quota.go
- api/v1alpha1/project_quota_types.go
- internal/scheduling/reservations/quota/integration_test.go
- internal/scheduling/reservations/quota/controller_test.go
- internal/scheduling/reservations/quota/controller.go
Merge commit. Conflicts:
- helm/bundles/cortex-nova/values.yaml
- helm/library/cortex/templates/rbac/role.yaml
- internal/scheduling/reservations/commitments/api/info.go
- internal/scheduling/reservations/commitments/api/info_test.go
- internal/scheduling/reservations/commitments/config.go
…merge

After merging main, the CommittedResource CRD changed from Status.UsedAmount (*resource.Quantity) to Status.UsedResources (map[string]resource.Quantity). This commit updates:

- computeCRUsage: reads UsedResources["memory"] and UsedResources["cpu"] and converts memory bytes to commitment units (multiples of the smallest flavor)
- applyDeltaAndUpdateStatus: accepts a flavorGroups parameter for unit conversion
- Reconcile: fetches flavorGroups before computing CRUsage
- crUsedAmountChangePredicate: compares the UsedResources map instead of UsedAmount
- Tests: updated to use UsedResources with proper byte/core values
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/scheduling/reservations/quota/controller.go`:
- Around line 100-111: The fmt.Errorf calls in this file return error strings
with capitalized resource/type names (e.g. "failed to list ProjectQuota CRDs",
"failed to list CommittedResource CRDs", and other occurrences referencing
TotalUsage); update those error messages to use lowercase identifiers (e.g.
"failed to list projectquota crds", "failed to list committedresource crds",
"failed to compute totalusage", etc.) so they conform to the project's lowercase
error-string guideline—locate each fmt.Errorf in this controller (including the
calls wrapping c.List and the similar fmt.Errorf usages later around the
TotalUsage handling) and normalize the text to all-lowercase while preserving
the original context and wrapping of the underlying error (%w).
- Around line 759-808: The code currently reads state/resource
type/AZ/flavor-group from cr.Spec while translating cr.Status.UsedResources,
which can mis-bucket usage during spec transitions; change the lookup to prefer
cr.Status.AcceptedSpec when present (use cr.Status.AcceptedSpec.ResourceType,
.FlavorGroupName, .AvailabilityZone) and only fall back to cr.Spec when
AcceptedSpec is nil/empty, then use those resolved values when mapping
ResourceType, computing resourceName
(commitments.ResourceNameRAM/ResourceNameCores), and accumulating into
usage.PerAZ so UsedResources are attributed to the accepted commitment shape
rather than the current spec.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b87023a4-2024-4955-ab6e-f24e2ef52a29
📒 Files selected for processing (13)

- api/v1alpha1/zz_generated.deepcopy.go
- cmd/manager/main.go
- helm/bundles/cortex-nova/values.yaml
- helm/library/cortex/templates/rbac/role.yaml
- internal/scheduling/reservations/commitments/api/handler.go
- internal/scheduling/reservations/commitments/api/info.go
- internal/scheduling/reservations/commitments/api/info_test.go
- internal/scheduling/reservations/commitments/api/quota_test.go
- internal/scheduling/reservations/commitments/config.go
- internal/scheduling/reservations/commitments/usage.go
- internal/scheduling/reservations/quota/controller.go
- internal/scheduling/reservations/quota/controller_test.go
- internal/scheduling/reservations/quota/integration_test.go
🚧 Files skipped from review as they are similar to previous changes (9)
- internal/scheduling/reservations/commitments/api/info.go
- helm/bundles/cortex-nova/values.yaml
- internal/scheduling/reservations/commitments/config.go
- internal/scheduling/reservations/commitments/api/info_test.go
- cmd/manager/main.go
- internal/scheduling/reservations/commitments/api/quota_test.go
- api/v1alpha1/zz_generated.deepcopy.go
- internal/scheduling/reservations/quota/controller_test.go
- internal/scheduling/reservations/quota/integration_test.go
… spec transitions

Addresses CodeRabbit review feedback: during spec transitions, reading cr.Spec can bucket usage into the wrong AZ/flavor-group. Now prefers cr.Status.AcceptedSpec (the last successful reconcile snapshot) when it exists, falling back to cr.Spec only when AcceptedSpec is nil.
Test Coverage Report

Test Coverage 📊: 69.1%
### Release digest — 2026-05-07 — [#814](#814)

#### cortex v0.0.47 (sha-7d1745d8)

**New features:**
- `ProjectQuota` CRD with per-resource, per-AZ quota breakdown and PAYG calculation ([#796](#796))
- `FlavorGroupCapacity` CRD + background capacity controller for pre-computed per-flavor VM slot capacity per (flavor group × AZ) ([#728](#728))
- Capacity reporting in `POST /commitments/v1/report-capacity` now uses real `FlavorGroupCapacity` CRD values (replaces placeholder zeros)
- CommittedResource usage reconciler — moves usage calculation into CRD status, feeding both LIQUID API and quota controller ([#800](#800))
- KVM OS version label on host capacity metrics ([#810](#810))
- KVM project usage metrics (running VMs / resource usage per project/flavor) ([#803](#803))
- `domain_id` + name on vmware project capacity metrics ([#802](#802))
- `domain_id` in vmware project commitment KPI ([#806](#806))
- Weighing explainer for scheduling decisions ([#808](#808))

**Refactors:**
- KVM host capacity metric moved to infrastructure plugins package ([#809](#809))
- Deprecated per-compute KPIs removed (`flavor_running_vms`, `host_running_vms`, `resource_capacity_kvm`) ([#807](#807))
- Bundle-specific RBAC templates moved from library chart into `cortex-ironcore` / `cortex-pods` bundles ([#797](#797))
- Webhook templates moved back into `cortex-nova` ([#805](#805))
- `testlib.Ptr` replaced with native `new()` ([#801](#801))

**Fixes:**
- Remove `ignoreAllocations` from kvm-report-capacity pipeline to unblock older admission webhook ([#812](#812))
- Suppress nova scheduling alerts on transient `no such host` DNS errors
- Add `identity-domains` as KPI dependency
- Rename hypervisor `ClusterRoleBinding` to avoid `roleRef` conflict on redeploy ([#804](#804))

#### cortex-nova v0.0.60 (sha-7d1745d8)

Includes cortex v0.0.47. Adds Prometheus datasources and KPI CRD templates for KVM project usage/utilization, and updated RBAC for `FlavorGroupCapacity` + `ProjectQuota` CRDs.
No description provided.