Merged
62 changes: 59 additions & 3 deletions docs/architecture.md
@@ -36,20 +36,33 @@ Cortex receives the list of possible hosts and their weights from Nova. It then

As part of the [CobaltCore](https://cobaltcore-dev.github.io/docs/) stack, we provide a Placement-like API shim, which translates requests from Nova and Neutron to the [Hypervisor CRD](https://github.com/cobaltcore-dev/openstack-hypervisor-operator) based on the KVM stack provided by [IronCore](https://ironcore.dev/), [Gardener](https://gardener.cloud/) and [Garden Linux](https://gardenlinux.io/). Instead of managing resource inventories in Placement's database, the shim uses the Hypervisor CRD to track resource allocations and hypervisor capabilities.

### Feature Flags

Each major capability of the shim is gated behind a feature flag in the Helm configuration. When a flag is disabled, the corresponding endpoints fall back to forwarding requests to upstream Placement unchanged. This allows operators to adopt CRD-backed behavior incrementally.

| Flag | Endpoints affected | Behavior when enabled |
|---|---|---|
| `features.enableResourceProviders` | `/resource_providers` and sub-resources | Serve KVM resource providers from Hypervisor CRDs; merge with upstream for non-KVM providers |
| `features.enableRoot` | `GET /` | Return a static version discovery document from config instead of forwarding to upstream |
| `features.enableTraits` | `/traits` | Serve traits from local ConfigMaps instead of upstream Placement |
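
The fallback rule in the table can be sketched as a small handler selector. This is a minimal illustration, not the shim's actual wiring: the `Features` struct, the `stub` handler, and the `gate` and `routes` functions are hypothetical names, and only the three flag names come from the table above.

```go
package main

import "net/http"

// Features mirrors the three flags from the table. The Go field names are
// assumptions made for this sketch.
type Features struct {
	EnableResourceProviders bool
	EnableRoot              bool
	EnableTraits            bool
}

// stub is a placeholder handler that writes its own name as the response.
type stub string

func (s stub) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	_, _ = w.Write([]byte(s))
}

// gate returns the local handler when the flag is enabled and the upstream
// handler otherwise, so a disabled feature forwards requests unchanged.
func gate(enabled bool, local, upstream http.Handler) http.Handler {
	if enabled {
		return local
	}
	return upstream
}

// routes selects one handler per endpoint family based on the flags.
func routes(f Features, upstream http.Handler) map[string]http.Handler {
	return map[string]http.Handler{
		"/resource_providers": gate(f.EnableResourceProviders, stub("crd"), upstream),
		"/":                   gate(f.EnableRoot, stub("static-versions"), upstream),
		"/traits":             gate(f.EnableTraits, stub("configmaps"), upstream),
	}
}
```

With only `EnableTraits` set, `/traits` resolves to the local handler while the other two entries resolve to the upstream handler, which is the incremental adoption path described above.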

### Passthrough

Placement tracks hypervisors of various kinds, not only KVM but also [Ironic](https://github.com/openstack/ironic) or VMware vCenter Servers. However, the Cortex Placement API Shim can manage only KVM hypervisors. When Nova or Neutron asks for VMware or Ironic resource providers, the shim therefore forwards the request to another Placement instance. We call this the passthrough, and it looks like this:

```mermaid
graph LR;
nn(OpenStack Nova/Neutron) <--> auth
subgraph shim [Cortex Placement API Shim]
auth(Auth Middleware) <--> api(API)
api <--> router(Routing and Aggregation)
router <-- KVM --> tl
tl(Translation)
end
auth <-.-> ks(OpenStack Keystone)
router <-- VMware/Ironic --> pl(OpenStack Placement)
tl <--> crd(Hypervisor CRD)
tl <--> cm(Traits ConfigMaps)
```

After the API receives a request, it is processed in one of two ways, depending on the kind of endpoint requested:
@@ -58,3 +71,46 @@ After a request was received by the API, it is processed in two ways depending o
2. **Per-request forwarding**: For requests that ask for a specific resource provider, such as `GET /resource_providers/{uuid}`, the shim needs to determine if the requested resource provider is managed by the KVM translation or the passthrough. This can be done by checking the UUID of the resource provider against a list of known KVM resource providers. If it is a KVM resource provider, the request is forwarded to the translation; otherwise, it is forwarded to the OpenStack Placement instance.
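
The per-request decision can be sketched as a lookup of the UUID path segment in a set of known KVM providers. The names `backend`, `route`, and the `kvm` set are hypothetical, and sending non-matching paths to the passthrough is an assumption of this sketch rather than documented behavior.

```go
package main

import "strings"

// backend names the component that handles a request.
type backend string

const (
	translation backend = "translation" // KVM, served from Hypervisor CRDs
	passthrough backend = "passthrough" // everything else, sent to Placement
)

// route picks a backend for paths of the form
// /resource_providers/{uuid}[/sub-resource]. kvm is the set of known KVM
// resource provider UUIDs. Paths outside that form default to passthrough
// here (an assumption of this sketch).
func route(path string, kvm map[string]bool) backend {
	rest, ok := strings.CutPrefix(path, "/resource_providers/")
	if !ok {
		return passthrough
	}
	// Keep only the UUID segment and drop any sub-resource suffix.
	uuid, _, _ := strings.Cut(rest, "/")
	if kvm[uuid] {
		return translation
	}
	return passthrough
}
```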

The translation layer is responsible for translating the requests and responses between the OpenStack Placement API and the Hypervisor CRD. This includes mapping resource provider attributes, inventory, and allocations to the corresponding fields in the Hypervisor CRD.

Upstream connectivity is optional at startup: if the upstream Placement API is unreachable, the shim logs a warning and continues booting. This allows the shim to operate in a standalone CRD-backed mode when upstream is not available.

### CRD-Backed Resource Providers

When `features.enableResourceProviders` is enabled, the shim serves KVM resource providers directly from Kubernetes Hypervisor CRDs rather than forwarding to upstream Placement. This is the core architectural shift: KVM hypervisor inventory lives in Kubernetes instead of in Placement's database.

The shim supports the full CRUD surface for resource providers:

- **GET /resource_providers**: Lists resource providers by merging KVM hypervisors from Kubernetes with non-KVM providers from upstream Placement. The merge is based on UUID: if a hypervisor CRD exists with the same OpenStack ID as an upstream provider, the CRD-backed version takes precedence.
- **GET /resource_providers/{uuid}**: Looks up the UUID against indexed Hypervisor CRDs first. If found, returns the translated provider; otherwise, forwards to upstream.
- **POST /resource_providers**: Checks the requested name and UUID against existing Hypervisor CRDs. Returns `409 Conflict` if the name or UUID collides with a KVM hypervisor, preventing shadow providers from being created in upstream Placement. If there is no collision, the request is forwarded to upstream.
- **PUT /resource_providers/{uuid}**: Same collision detection as POST. Updates that would rename a KVM-managed provider are rejected with `409 Conflict`.
- **DELETE /resource_providers/{uuid}**: Prevents deletion of CRD-backed KVM providers by returning `409 Conflict`. Requests for non-KVM providers are forwarded to upstream.
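
The list merge and the collision check can be sketched as follows. The `Provider` type, its `Source` field, and both function names are hypothetical. Sorting the merged result by UUID is an assumption added to make the output deterministic.

```go
package main

import "sort"

// Provider is a simplified resource provider. Source is illustrative only
// and records where the entry came from ("crd" or "upstream").
type Provider struct {
	UUID   string
	Name   string
	Source string
}

// mergeProviders combines both lists keyed by UUID. When a UUID appears in
// both, the CRD-backed entry replaces the upstream one.
func mergeProviders(crd, upstream []Provider) []Provider {
	byUUID := make(map[string]Provider, len(crd)+len(upstream))
	for _, p := range upstream {
		byUUID[p.UUID] = p
	}
	for _, p := range crd {
		byUUID[p.UUID] = p // CRD-backed version takes precedence
	}
	out := make([]Provider, 0, len(byUUID))
	for _, p := range byUUID {
		out = append(out, p)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].UUID < out[j].UUID })
	return out
}

// collides reports whether a create or update request reuses the name or
// UUID of a KVM hypervisor, in which case the shim answers 409 Conflict.
func collides(req Provider, crd []Provider) bool {
	for _, p := range crd {
		if (req.UUID != "" && p.UUID == req.UUID) || p.Name == req.Name {
			return true
		}
	}
	return false
}
```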

For efficient lookups, the shim indexes Hypervisor CRDs on three fields: `status.hypervisorId` (the OpenStack UUID), `metadata.uid` (the Kubernetes UID), and `metadata.name`. These indexes are registered at startup via the multicluster client, enabling O(1) lookups by any of these keys.
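
As a plain-map stand-in for those indexes, the sketch below keeps one map per indexed field. It does not use the multicluster client or any Kubernetes API. The `Hypervisor` and `Index` types and the lookup order (OpenStack UUID, then Kubernetes UID, then name) are assumptions for illustration.

```go
package main

// Hypervisor holds only the three indexed fields of the CRD.
type Hypervisor struct {
	Name         string // metadata.name
	UID          string // metadata.uid
	HypervisorID string // status.hypervisorId (the OpenStack UUID)
}

// Index keeps one map per indexed field so each lookup is a single map access.
type Index struct {
	byID, byUID, byName map[string]*Hypervisor
}

// NewIndex builds all three maps in one pass over the hypervisors.
func NewIndex(hvs []Hypervisor) *Index {
	idx := &Index{
		byID:   make(map[string]*Hypervisor, len(hvs)),
		byUID:  make(map[string]*Hypervisor, len(hvs)),
		byName: make(map[string]*Hypervisor, len(hvs)),
	}
	for i := range hvs {
		hv := &hvs[i]
		idx.byID[hv.HypervisorID] = hv
		idx.byUID[hv.UID] = hv
		idx.byName[hv.Name] = hv
	}
	return idx
}

// Lookup tries the OpenStack UUID first, then the Kubernetes UID, then the name.
func (x *Index) Lookup(key string) (*Hypervisor, bool) {
	for _, m := range []map[string]*Hypervisor{x.byID, x.byUID, x.byName} {
		if hv, ok := m[key]; ok {
			return hv, true
		}
	}
	return nil, false
}
```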

### Traits

When `features.enableTraits` is enabled, the shim serves OpenStack Placement traits from a pair of Kubernetes ConfigMaps instead of forwarding to upstream:

- **Static ConfigMap** (Helm-managed): Contains the standard OpenStack traits deployed via Helm. Its name is set by `traits.configMapName` in the shim config.
- **Custom ConfigMap** (shim-managed): Stores `CUSTOM_*` traits created at runtime through PUT requests. Named `{configMapName}-custom`.

The trait endpoints support the full OpenStack Placement traits API:
- `GET /traits` returns a sorted, merged list from both ConfigMaps, with optional filtering via the `name` query parameter (`in:TRAIT_A,TRAIT_B` or `startswith:CUSTOM_`).
- `GET /traits/{name}` checks both ConfigMaps for existence.
- `PUT /traits/{name}` creates custom traits (only `CUSTOM_*` prefixed names are allowed).
- `DELETE /traits/{name}` removes custom traits.
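
The listing and filtering behavior can be sketched as below, assuming both ConfigMaps have already been read into string slices. The function names are hypothetical, and returning an error for an unrecognized filter is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// listTraits merges static and custom traits, applies the optional `name`
// filter ("in:A,B" or "startswith:PREFIX"), and returns a sorted list
// without duplicates.
func listTraits(static, custom []string, nameFilter string) ([]string, error) {
	var keep func(string) bool
	switch {
	case nameFilter == "":
		keep = func(string) bool { return true }
	case strings.HasPrefix(nameFilter, "in:"):
		want := map[string]bool{}
		for _, n := range strings.Split(strings.TrimPrefix(nameFilter, "in:"), ",") {
			want[n] = true
		}
		keep = func(trait string) bool { return want[trait] }
	case strings.HasPrefix(nameFilter, "startswith:"):
		prefix := strings.TrimPrefix(nameFilter, "startswith:")
		keep = func(trait string) bool { return strings.HasPrefix(trait, prefix) }
	default:
		return nil, fmt.Errorf("unsupported name filter %q", nameFilter)
	}

	seen := map[string]bool{}
	out := []string{}
	for _, trait := range append(append([]string{}, static...), custom...) {
		if keep(trait) && !seen[trait] {
			seen[trait] = true
			out = append(out, trait)
		}
	}
	sort.Strings(out)
	return out, nil
}

// isCustom reports whether a name may be created through PUT /traits/{name}.
func isCustom(name string) bool {
	return strings.HasPrefix(name, "CUSTOM_")
}
```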

Writes to the custom ConfigMap are serialized across replicas using a Kubernetes Lease-backed distributed lock (see `pkg/resourcelock`). This prevents concurrent writes from corrupting the ConfigMap data.

### Authentication

The shim includes an optional Keystone token validation middleware, configured via the `auth` section in the Helm values. When enabled, every incoming request is checked against a policy table before reaching the handler.

**Policy evaluation** is first-match: each policy rule specifies an HTTP method and path pattern (e.g., `GET /usages`, `* /*`) and the roles that grant access. If no policy matches the request, it is denied with `403 Forbidden`. Policies with an empty roles list mark the path as publicly accessible.
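
A minimal sketch of first-match evaluation follows. The `Policy` type and `authorize` function are hypothetical. Matching patterns with `path.Match` (where `*` does not cross a `/`) and answering `403 Forbidden` when the first matching rule's roles are not satisfied are assumptions of this sketch, not confirmed shim behavior.

```go
package main

import (
	"net/http"
	"path"
)

// Policy is one rule of the policy table.
type Policy struct {
	Method  string   // HTTP method, or "*" for any method
	Pattern string   // path pattern, matched with path.Match in this sketch
	Roles   []string // roles that grant access; empty marks the path public
}

// authorize walks the rules in order and lets the first matching rule decide.
// It returns http.StatusOK when access is granted and http.StatusForbidden
// otherwise, including when no rule matches.
func authorize(policies []Policy, method, reqPath string, tokenRoles []string) int {
	for _, p := range policies {
		if p.Method != "*" && p.Method != method {
			continue
		}
		if ok, _ := path.Match(p.Pattern, reqPath); !ok {
			continue
		}
		if len(p.Roles) == 0 {
			return http.StatusOK // publicly accessible
		}
		for _, need := range p.Roles {
			for _, have := range tokenRoles {
				if need == have {
					return http.StatusOK
				}
			}
		}
		return http.StatusForbidden // first match decides; later rules are skipped
	}
	return http.StatusForbidden // no policy matched
}
```

Because evaluation stops at the first match, rule order matters: a broad rule such as `* /*` placed first would shadow more specific rules after it.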

**Role-based access** supports two scoping modes:
- **Unscoped**: The token must contain the named role, regardless of project.
- **Project-scoped**: The token's project ID must match a project ID extracted from the request. The project ID can be extracted from a URL query parameter or a top-level JSON body field, configurable per role.

**Token caching**: Validated tokens are cached in memory with SHA-256 hashed keys and a configurable TTL (default 5 minutes). The cache uses `singleflight` to deduplicate concurrent introspection calls for the same token, avoiding thundering-herd problems when many requests arrive with the same token simultaneously.
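
The hashed-key cache with a TTL can be sketched with the standard library alone. The `TokenCache` type and its methods are hypothetical, and the cached value is simplified to a list of roles. Request deduplication is left out: in Go it is typically done with the `golang.org/x/sync/singleflight` package, which would wrap the introspection call on a cache miss.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"sync"
	"time"
)

// entry is a cached validation result with its expiry time.
type entry struct {
	roles   []string
	expires time.Time
}

// TokenCache stores validation results under the SHA-256 hash of the token,
// so raw tokens are never kept as map keys.
type TokenCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[string]entry
}

// NewTokenCache creates an empty cache whose entries live for ttl.
func NewTokenCache(ttl time.Duration) *TokenCache {
	return &TokenCache{ttl: ttl, entries: make(map[string]entry)}
}

// key returns the hex-encoded SHA-256 digest of the token.
func key(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

// Put caches the roles for a validated token until the TTL elapses.
func (c *TokenCache) Put(token string, roles []string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[key(token)] = entry{roles: roles, expires: time.Now().Add(c.ttl)}
}

// Get returns the cached roles, or false when the token is unknown or expired.
func (c *TokenCache) Get(token string) ([]string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	k := key(token)
	e, ok := c.entries[k]
	if !ok {
		return nil, false
	}
	if !time.Now().Before(e.expires) {
		delete(c.entries, k) // drop the expired entry
		return nil, false
	}
	return e.roles, true
}
```

A cache matching the documented default would be created with `NewTokenCache(5 * time.Minute)`.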