@bnaecker
Collaborator

This updates maghemite, dendrite, and OPTE for the run-up to the IPv6 integration and attached subnet work. It does not actually use any new functionality, but makes a few small API changes required for the OPTE update.

@bnaecker bnaecker marked this pull request as draft January 30, 2026 05:27
@bnaecker
Collaborator Author

bnaecker commented Jan 30, 2026

This is likely going to fail until we have CI images on the Buildomat hosts with recent-enough OPTE bits, version 0.39.455. That is in the works per Señor Clulow here

@bnaecker bnaecker force-pushed the update-networking-deps branch from 79979b1 to 9b6b8c6 Compare January 30, 2026 05:31
@bnaecker
Collaborator Author

There is a large number of failing tests, mostly timeouts in the multicast stuff. I haven't root-caused that yet, but the actual Helios deploy job that uses the new image with OPTE 0.39 did in fact work.

@bnaecker
Collaborator Author

bnaecker commented Jan 30, 2026

Ok, this overlaps with #9746. Also, all the multicast tests are failing because we're running against a version of Dendrite that doesn't have those endpoints implemented.

As a side note, we can see that the logs are in fact getting a bunch of fatal errors, e.g., from here:

test_cache_ttl_behavior: DPD not ready yet
    error = Error Response: status: 501 Not Implemented; headers: {"content-type": "application/json", "x-request-id": "7e4ace1e-df6f-4884-a950-8ee7e751b5cd", "content-length": "101", "date": "Fri, 30 Jan 2026 18:06:17 GMT"}; value: Error { error_code: None, message: "multicast feature disabled", request_id: "7e4ace1e-df6f-4884-a950-8ee7e751b5cd" }

That's because we're catching absolutely any error in talking to DPD and assuming it'll resolve:

    Err(e) => {
        debug!(
            log,
            "DPD not ready yet";
            "error" => %e
        );
        Err(CondCheckError::<String>::NotYet)

This one definitely will never resolve. We should probably check the status code there and be more selective. There are really only a few retryable errors, 503s for example, so most should cause us to bail here.
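
For illustration, here is a minimal sketch of what a more selective check could look like. It is not the actual test helper: `CondCheckError` below is a local stand-in for the poll error type in the snippet above, `is_retryable` and `classify_dpd_error` are hypothetical names, and the set of retryable codes beyond 503 is an assumption that would need to be checked against what DPD really returns during startup.

```rust
/// Local stand-in for the poll error type used in the snippet above.
/// The real type lives in the test utilities and may differ.
#[derive(Debug, PartialEq)]
pub enum CondCheckError<E> {
    /// The condition is not met yet; keep polling.
    NotYet,
    /// The condition can never be met; stop polling and report the error.
    Failed(E),
}

/// Hypothetical helper: is this HTTP status worth retrying?
/// 503 comes from the discussion above; 429, 502 and 504 are assumptions.
pub fn is_retryable(status: u16) -> bool {
    matches!(status, 429 | 502 | 503 | 504)
}

/// Hypothetical helper: map a DPD error to a poll outcome.
/// `status` is `None` when no HTTP response was received at all
/// (for example, connection refused while DPD is still starting).
pub fn classify_dpd_error(
    status: Option<u16>,
    message: &str,
) -> CondCheckError<String> {
    match status {
        // No response at all: assume DPD is still coming up.
        None => CondCheckError::NotYet,
        // Transient server-side condition: keep polling.
        Some(code) if is_retryable(code) => CondCheckError::NotYet,
        // Anything else (such as 501 Not Implemented) will not resolve.
        Some(code) => CondCheckError::Failed(format!(
            "DPD returned non-retryable status {code}: {message}"
        )),
    }
}
```

With something along these lines, the 501 "multicast feature disabled" response above would fail the poll immediately instead of being logged as "DPD not ready yet" until the timeout.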

@Nieuwejaar
Contributor

#9746 is now integrated. It puts the multicast tests behind a feature flag, so you shouldn't hit those anymore.

@bnaecker bnaecker force-pushed the update-networking-deps branch 3 times, most recently from c797f51 to 66c9cf8 Compare January 30, 2026 22:31
@bnaecker
Collaborator Author

Hit the clippy lint Nils fixed in #9763, will rebase on that.

@bnaecker bnaecker force-pushed the update-networking-deps branch from 66c9cf8 to 76eec9d Compare January 31, 2026 04:07
@bnaecker bnaecker marked this pull request as ready for review January 31, 2026 14:30
    .iter()
    .map(|ip| {
        let cidr =
            Ipv4Cidr::new(ip.addr().into(), ip.width().try_into().unwrap());
Contributor

Just a note for the review. It looks like this unwrap can only panic if oxnet returns a prefix length greater than 32, which should not be possible. But I think this is a sign that the prefix length type needs to be stronger, or that we should use oxnet types in the OPTE interface, or both.

Collaborator Author

It would be very nice to commonize all these. That was the point of oxnet, but we need to make it no_std and probably do a lot more work to have it fit nicely in both places.
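
As a rough sketch of what a "stronger" prefix length type could mean here, the range check can live in a single fallible constructor so that later code never needs an unwrap. The names `Ipv4PrefixLen` and `InvalidPrefixLen` are hypothetical and are not part of oxnet or OPTE.

```rust
use std::convert::TryFrom;

/// Hypothetical newtype: an IPv4 prefix length known to be in 0..=32.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Ipv4PrefixLen(u8);

/// Hypothetical error carrying the rejected value.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidPrefixLen(pub u8);

impl Ipv4PrefixLen {
    /// Largest valid IPv4 prefix length.
    pub const MAX: u8 = 32;

    /// Return the validated prefix length.
    pub fn get(self) -> u8 {
        self.0
    }
}

impl TryFrom<u8> for Ipv4PrefixLen {
    type Error = InvalidPrefixLen;

    /// The only way to build the type, so the invariant is checked once.
    fn try_from(width: u8) -> Result<Self, Self::Error> {
        if width <= Self::MAX {
            Ok(Self(width))
        } else {
            Err(InvalidPrefixLen(width))
        }
    }
}
```

If both crates shared a type like this, the conversion in the diff above would be infallible and the unwrap could go away.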

@bnaecker bnaecker merged commit 30bae5d into main Jan 31, 2026
18 checks passed
@bnaecker bnaecker deleted the update-networking-deps branch January 31, 2026 17:51