[ABLD-457] Add md5sums file generation to pkg_tar. by aiuto · Pull Request #49962 · DataDog/datadog-agent

aiuto · 2026-04-27T21:25:49Z

What does this PR do?

This change creates an enhanced version of tar_writer from rules_pkg. It:

- is a drop-in replacement for the original, generated using the original source as an input
- adds the capability to produce an md5sums file, which is the feature we need for ABLD-457
- is written in Go for increased compression speed
  - the Go code is generated with Claude
  - it is tested for compliance by running the existing tests. This means it has some extra features that we won't need inside Datadog, but it makes it easier to test against the rules_pkg source.
- patches rules_pkg to use our writer instead of the default
  - review the full pkg_tar.bzl file for intent
  - then validate that the patches look somewhat reasonable
  - I can upstream the capability to set a private tar_writer, but that does not block this. That change requires some more design, because the deeper need is to let people extend the calling rule with new attributes to pass to their private writer.

This is step 1 of several PRs to follow. Some of these may merge if they are small enough.

- Create a pkg_deb wrapper which uses this
- Apply the wrapper to all the rules in //packages/...

Testing

Patch this back into upstream rules_pkg and see that tests pass

- https://github.com/aiuto/rules_pkg/tree/dd_abld_457
- bazelbuild/rules_pkg#1060

Alternatives considered

1. upstream this capability to rules_pkg
   - That requires too much work. The maintainers (me) won't accept this limited fix.
   - It is a pkg_tar solution only. We would have to do pkg_zip as well.
   - It introduces Go as a development language. That is a full-on breaking change. We would first have to discuss and agree on a technique where the non-Python version could be optional. The user would select it as a repo-rule.
   - An intermediate step would be to refactor pkg_tar so one could write their own, using the guts of pkg_tar_impl.
2. fork our own copy to a new repository
   - This is feasible. It is heavyweight right now, but we can consider it if other DataDog repos want to use this.
3. write it in Rust instead of Go?
   - Distinctly possible, and we can do it at any time in the future. I think of the Go (or Rust) code as just an intermediate "assembly-level" output produced by an agent from the Python reference implementation.

Splice our private pkg_tar into our rules_pkg dependency.

Motivation

md5sums are a packaging requirement

Describe how you validated your changes

Unit tests of the feature itself.
Patch this back into upstream rules_pkg and verify that the original tests pass

datadog-prod-us1-4 · 2026-04-27T21:48:51Z

🎯 Code Coverage (details)
• Patch Coverage: 100.00%
• Overall Coverage: 50.18% (-3.51%)

This comment will be updated automatically if new data arrives.

🔗 Commit SHA: 369c344

dd-octo-sts · 2026-04-27T21:56:45Z

Files inventory check summary

File checks results against ancestor 043426bc:

Results for datadog-agent_7.80.0~devel.git.320.369c344.pipeline.110330363-1_amd64.deb:

No change detected

dd-octo-sts · 2026-04-27T22:08:38Z

Static quality checks

✅ Please find below the results from static quality gates
Comparison made with ancestor 043426b
📊 Static Quality Gates Dashboard
🔗 SQG Job

31 successful checks with minimal change (< 2 KiB)

	Quality gate	Current Size
✅	agent_deb_amd64	739.407 MiB
✅	agent_deb_amd64_fips	697.822 MiB
✅	agent_heroku_amd64	309.134 MiB
✅	agent_msi	606.983 MiB
✅	agent_rpm_amd64	739.391 MiB
✅	agent_rpm_amd64_fips	697.806 MiB
✅	agent_rpm_arm64	717.449 MiB
✅	agent_rpm_arm64_fips	678.912 MiB
✅	agent_suse_amd64	739.391 MiB
✅	agent_suse_amd64_fips	697.806 MiB
✅	agent_suse_arm64	717.449 MiB
✅	agent_suse_arm64_fips	678.912 MiB
✅	docker_agent_amd64	799.862 MiB
✅	docker_agent_arm64	802.733 MiB
✅	docker_agent_jmx_amd64	990.781 MiB
✅	docker_agent_jmx_arm64	982.431 MiB
✅	docker_cluster_agent_amd64	206.387 MiB
✅	docker_cluster_agent_arm64	220.508 MiB
✅	docker_cws_instrumentation_amd64	7.142 MiB
✅	docker_cws_instrumentation_arm64	6.689 MiB
✅	docker_dogstatsd_amd64	39.347 MiB
✅	docker_dogstatsd_arm64	37.565 MiB
✅	dogstatsd_deb_amd64	30.005 MiB
✅	dogstatsd_deb_arm64	28.146 MiB
✅	dogstatsd_rpm_amd64	30.005 MiB
✅	dogstatsd_suse_amd64	30.005 MiB
✅	iot_agent_deb_amd64	44.400 MiB
✅	iot_agent_deb_arm64	41.388 MiB
✅	iot_agent_deb_armhf	42.120 MiB
✅	iot_agent_rpm_amd64	44.400 MiB
✅	iot_agent_suse_amd64	44.400 MiB

On-wire sizes (compressed)

	Quality gate	Change	Size in MiB (prev → curr → max)
✅	agent_deb_amd64	+19.04 KiB (0.01% increase)	174.998 → 175.017 → 179.160
✅	agent_deb_amd64_fips	-9.35 KiB (0.01% reduction)	166.778 → 166.769 → 174.440
✅	agent_heroku_amd64	+5.84 KiB (0.01% increase)	74.915 → 74.921 → 80.310
✅	agent_msi	+16.0 KiB (0.01% increase)	140.387 → 140.402 → 148.730
✅	agent_rpm_amd64	+57.4 KiB (0.03% increase)	177.011 → 177.067 → 182.080
✅	agent_rpm_amd64_fips	+9.08 KiB (0.01% increase)	168.156 → 168.165 → 174.140
✅	agent_rpm_arm64	+20.27 KiB (0.01% increase)	159.203 → 159.223 → 163.610
✅	agent_rpm_arm64_fips	-18.38 KiB (0.01% reduction)	151.520 → 151.502 → 156.850
✅	agent_suse_amd64	+57.4 KiB (0.03% increase)	177.011 → 177.067 → 182.080
✅	agent_suse_amd64_fips	+9.08 KiB (0.01% increase)	168.156 → 168.165 → 174.140
✅	agent_suse_arm64	+20.27 KiB (0.01% increase)	159.203 → 159.223 → 163.610
✅	agent_suse_arm64_fips	-18.38 KiB (0.01% reduction)	151.520 → 151.502 → 156.850
✅	docker_agent_amd64	+3.6 KiB (0.00% increase)	267.235 → 267.239 → 272.990
✅	docker_agent_arm64	+3.02 KiB (0.00% increase)	254.261 → 254.264 → 261.470
✅	docker_agent_jmx_amd64	+8.54 KiB (0.00% increase)	335.885 → 335.894 → 341.610
✅	docker_agent_jmx_arm64	+25.51 KiB (0.01% increase)	318.897 → 318.922 → 326.050
✅	docker_cluster_agent_amd64	neutral	72.336 MiB → 73.460
✅	docker_cluster_agent_arm64	neutral	67.796 MiB → 68.680
✅	docker_cws_instrumentation_amd64	neutral	2.999 MiB → 3.330
✅	docker_cws_instrumentation_arm64	neutral	2.729 MiB → 3.090
✅	docker_dogstatsd_amd64	neutral	15.231 MiB → 15.870
✅	docker_dogstatsd_arm64	neutral	14.545 MiB → 14.890
✅	dogstatsd_deb_amd64	neutral	7.936 MiB → 8.830
✅	dogstatsd_deb_arm64	neutral	6.818 MiB → 7.750
✅	dogstatsd_rpm_amd64	neutral	7.946 MiB → 8.840
✅	dogstatsd_suse_amd64	neutral	7.946 MiB → 8.840
✅	iot_agent_deb_amd64	neutral	11.685 MiB → 13.210
✅	iot_agent_deb_arm64	neutral	9.988 MiB → 11.620
✅	iot_agent_deb_armhf	neutral	10.191 MiB → 11.780
✅	iot_agent_rpm_amd64	neutral	11.703 MiB → 13.230
✅	iot_agent_suse_amd64	neutral	11.703 MiB → 13.230

cit-pr-commenter-54b7da · 2026-04-27T22:21:59Z

Regression Detector

Regression Detector Results

Metrics dashboard
Target profiles
Run ID: a43134c4-4320-4692-8c0a-3f89a5797284

Baseline: bc931dd
Comparison: 369c344
Diff

Optimization Goals: ✅ No significant changes detected

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

perf	experiment	goal	Δ mean %	Δ mean % CI	trials	links
➖	docker_containers_cpu	% cpu utilization	-1.82	[-4.72, +1.09]	1	Logs

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI	trials	links
➖	quality_gate_metrics_logs	memory utilization	+1.94	[+1.68, +2.19]	1	Logs bounds checks dashboard
➖	tcp_syslog_to_blackhole	ingress throughput	+1.57	[+1.36, +1.77]	1	Logs
➖	quality_gate_logs	% cpu utilization	+1.43	[+0.46, +2.40]	1	Logs bounds checks dashboard
➖	ddot_metrics_sum_cumulative	memory utilization	+1.14	[+0.98, +1.30]	1	Logs
➖	ddot_logs	memory utilization	+0.58	[+0.52, +0.65]	1	Logs
➖	docker_containers_memory	memory utilization	+0.46	[+0.36, +0.56]	1	Logs
➖	otlp_ingest_logs	memory utilization	+0.38	[+0.29, +0.47]	1	Logs
➖	file_tree	memory utilization	+0.09	[+0.04, +0.14]	1	Logs
➖	quality_gate_idle	memory utilization	+0.03	[-0.02, +0.08]	1	Logs bounds checks dashboard
➖	file_to_blackhole_1000ms_latency	egress throughput	+0.02	[-0.44, +0.47]	1	Logs
➖	uds_dogstatsd_to_api	ingress throughput	+0.01	[-0.19, +0.21]	1	Logs
➖	file_to_blackhole_500ms_latency	egress throughput	+0.01	[-0.39, +0.41]	1	Logs
➖	uds_dogstatsd_to_api_v3	ingress throughput	-0.01	[-0.20, +0.19]	1	Logs
➖	tcp_dd_logs_filter_exclude	ingress throughput	-0.01	[-0.11, +0.09]	1	Logs
➖	file_to_blackhole_100ms_latency	egress throughput	-0.02	[-0.12, +0.09]	1	Logs
➖	file_to_blackhole_0ms_latency	egress throughput	-0.04	[-0.58, +0.50]	1	Logs
➖	ddot_metrics_sum_cumulativetodelta_exporter	memory utilization	-0.07	[-0.31, +0.16]	1	Logs
➖	uds_dogstatsd_20mb_12k_contexts_20_senders	memory utilization	-0.10	[-0.15, -0.05]	1	Logs
➖	quality_gate_idle_all_features	memory utilization	-0.13	[-0.17, -0.09]	1	Logs bounds checks dashboard
➖	ddot_metrics	memory utilization	-0.19	[-0.39, +0.02]	1	Logs
➖	otlp_ingest_metrics	memory utilization	-0.30	[-0.46, -0.15]	1	Logs
➖	ddot_metrics_sum_delta	memory utilization	-0.34	[-0.53, -0.16]	1	Logs
➖	docker_containers_cpu	% cpu utilization	-1.82	[-4.72, +1.09]	1	Logs

Bounds Checks: ✅ Passed

perf	experiment	bounds_check_name	replicates_passed	observed_value	links
✅	docker_containers_cpu	simple_check_run	10/10	664 ≥ 26
✅	docker_containers_memory	memory_usage	10/10	241.38MiB ≤ 370MiB
✅	docker_containers_memory	simple_check_run	10/10	699 ≥ 26
✅	file_to_blackhole_0ms_latency	memory_usage	10/10	0.16GiB ≤ 1.20GiB
✅	file_to_blackhole_0ms_latency	missed_bytes	10/10	0B = 0B
✅	file_to_blackhole_1000ms_latency	memory_usage	10/10	0.20GiB ≤ 1.20GiB
✅	file_to_blackhole_1000ms_latency	missed_bytes	10/10	0B = 0B
✅	file_to_blackhole_100ms_latency	memory_usage	10/10	0.17GiB ≤ 1.20GiB
✅	file_to_blackhole_100ms_latency	missed_bytes	10/10	0B = 0B
✅	file_to_blackhole_500ms_latency	memory_usage	10/10	0.18GiB ≤ 1.20GiB
✅	file_to_blackhole_500ms_latency	missed_bytes	10/10	0B = 0B
✅	quality_gate_idle	intake_connections	10/10	3 ≤ 4	bounds checks dashboard
✅	quality_gate_idle	memory_usage	10/10	144.87MiB ≤ 147MiB	bounds checks dashboard
✅	quality_gate_idle_all_features	intake_connections	10/10	3 ≤ 4	bounds checks dashboard
✅	quality_gate_idle_all_features	memory_usage	10/10	467.53MiB ≤ 495MiB	bounds checks dashboard
✅	quality_gate_logs	intake_connections	10/10	4 ≤ 6	bounds checks dashboard
✅	quality_gate_logs	memory_usage	10/10	177.51MiB ≤ 195MiB	bounds checks dashboard
✅	quality_gate_logs	missed_bytes	10/10	0B = 0B	bounds checks dashboard
✅	quality_gate_metrics_logs	cpu_usage	10/10	352.97 ≤ 2000	bounds checks dashboard
✅	quality_gate_metrics_logs	intake_connections	10/10	3 ≤ 6	bounds checks dashboard
✅	quality_gate_metrics_logs	memory_usage	10/10	379.59MiB ≤ 430MiB	bounds checks dashboard
✅	quality_gate_metrics_logs	missed_bytes	10/10	0B = 0B	bounds checks dashboard

Explanation

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide that a change in performance is a "regression" (a change worth investigating further) if all of the following criteria are true:

1. Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
2. Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
3. Its configuration does not mark it "erratic".
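
The three criteria above can be read as a single predicate. The sketch below is illustrative only and is not the Regression Detector's code: the `experiment` type and `isRegression` function are hypothetical names, the first two rows reuse figures from the tables in this comment, and the third row is invented to show a case that would be flagged.

```go
package main

import (
	"fmt"
	"math"
)

// effectSizeTolerance mirrors the 5.00% threshold quoted in the explanation.
const effectSizeTolerance = 5.0

// experiment is a hypothetical record of one row of the results table.
type experiment struct {
	Name         string
	DeltaMeanPct float64 // estimated Δ mean %
	CILow        float64 // lower bound of the 90% CI
	CIHigh       float64 // upper bound of the 90% CI
	Erratic      bool    // true when the configuration marks it erratic
}

// isRegression reports whether all three criteria hold.
func isRegression(e experiment) bool {
	bigEnough := math.Abs(e.DeltaMeanPct) >= effectSizeTolerance
	ciExcludesZero := e.CILow > 0 || e.CIHigh < 0
	return bigEnough && ciExcludesZero && !e.Erratic
}

func main() {
	rows := []experiment{
		// CI excludes zero, but the effect is below the tolerance.
		{"quality_gate_metrics_logs", 1.94, 1.68, 2.19, false},
		// Listed under ignored experiments, so treated as erratic here.
		{"docker_containers_cpu", -1.82, -4.72, 1.09, true},
		// Invented values, not from this run.
		{"hypothetical_example", 6.00, 5.10, 6.90, false},
	}
	for _, r := range rows {
		fmt.Printf("%s: regression=%t\n", r.Name, isRegression(r))
	}
}
```

Under this reading none of the experiments reported in this run would be flagged, which is consistent with the "No significant changes detected" summary.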

CI Pass/Fail Decision

✅ Passed. All Quality Gates passed.

quality_gate_idle_all_features, bounds check intake_connections: 10/10 replicas passed. Gate passed.
quality_gate_idle_all_features, bounds check memory_usage: 10/10 replicas passed. Gate passed.
quality_gate_idle, bounds check intake_connections: 10/10 replicas passed. Gate passed.
quality_gate_idle, bounds check memory_usage: 10/10 replicas passed. Gate passed.
quality_gate_metrics_logs, bounds check cpu_usage: 10/10 replicas passed. Gate passed.
quality_gate_metrics_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
quality_gate_metrics_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
quality_gate_metrics_logs, bounds check missed_bytes: 10/10 replicas passed. Gate passed.
quality_gate_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
quality_gate_logs, bounds check missed_bytes: 10/10 replicas passed. Gate passed.
quality_gate_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8885eb7004

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review"

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

aiuto added the changelog/no-changelog (No changelog entry needed) and qa/no-code-change (No code change in Agent code requiring validation) labels Apr 27, 2026

dd-octo-sts Bot added the internal (Identify a non-fork PR) label Apr 27, 2026

github-actions Bot added the long review (PR is complex, plan time to review it) label Apr 27, 2026

dd-octo-sts Bot added the team/agent-build and team/agent-devx labels Apr 27, 2026

aiuto force-pushed the aiuto/mani branch from 07d6ffa to 0cd7f68 April 28, 2026 02:27

aiuto added 2 commits April 27, 2026 22:42

the license tool is wonky

109787a

remove brittles

8885eb7

aiuto marked this pull request as ready for review April 28, 2026 03:11

aiuto requested a review from a team as a code owner April 28, 2026 03:11

chatgpt-codex-connector Bot reviewed Apr 28, 2026


Comment thread bazel/rules/dd_tar_writer/main.go

Comment thread bazel/rules/dd_tar_writer/main.go

aiuto added 2 commits April 28, 2026 16:40

Merge branch 'main' into aiuto/mani

9546b01

Merge branch 'main' into aiuto/mani

369c344

aiuto added the ask-review (Ask required teams to review this PR) label Apr 29, 2026

aiuto wants to merge 5 commits into main from aiuto/mani
