perf(ci): disable ASLR for benchmark to reduce result variance by starwarfan · Pull Request #385 · DTVMStack/DTVM

starwarfan · 2026-03-05T09:38:28Z

Summary

Disable ASLR during performance benchmark runs in CI to eliminate the `***WARNING*** ASLR is enabled, the results may have unreproducible noise in them.` message and to reduce variance in regression results.

Changes

- Wrap `bash .ci/run_test_suite.sh` with `setarch x86_64 -R` in the performance regression check job
- `setarch x86_64 -R` runs the benchmark process with the `ADDR_NO_RANDOMIZE` personality, disabling ASLR for that process only (no system-wide change, no root required)
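
As a rough illustration of the wrapping described above, the following Python sketch prefixes an arbitrary command with `setarch <arch> -R`. The helper name `without_aslr` is hypothetical and not part of this PR; the PR itself applies the prefix directly in the workflow YAML.

```python
import shlex


def without_aslr(command, arch="x86_64"):
    """Return `command` prefixed with `setarch <arch> -R`.

    `-R` asks setarch to set the ADDR_NO_RANDOMIZE personality, so the
    wrapped program starts with address space randomization disabled.
    `without_aslr` is a hypothetical helper used only for illustration.
    """
    return ["setarch", arch, "-R", *command]


wrapped = without_aslr(["bash", ".ci/run_test_suite.sh"])
print(shlex.join(wrapped))  # setarch x86_64 -R bash .ci/run_test_suite.sh
```

On a Linux host with util-linux installed, the resulting list could be handed to `subprocess.run(wrapped, check=True)`. Keeping the command as a list avoids building a shell string.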

Motivation

Google Benchmark reports ASLR as a source of unreproducible noise. Disabling it for the benchmark process should make results more stable on shared GitHub Actions runners.

Made with Cursor

Tune the underlying Google Benchmark parameters in CI to reduce test variance in shared GitHub Actions environments:

- Expose --benchmark-min-time to check_performance_regression.py.
- Pass `--benchmark_enable_random_interleaving=true` when repetitions > 1.
- Provide `BENCHMARK_REPETITIONS` and `BENCHMARK_MIN_TIME` in CI env.
- Increase benchmark repetitions to 5 and min-time to 2s in workflows.

These changes smooth out short-term CPU spikes and filter out noise caused by intermittent CI noisy neighbors.

Made-with: Cursor

Use setarch x86_64 -R to run evmone-bench with ASLR disabled, eliminating the 'ASLR is enabled' warning and reducing unreproducible noise in performance regression results. Made-with: Cursor

Copilot

Pull request overview

This PR reduces benchmark result variance in CI by applying several noise-reduction techniques: disabling ASLR via setarch x86_64 -R, enabling random interleaving of benchmark repetitions, increasing the number of repetitions from 3 to 5, and setting a minimum benchmark execution time of 2 seconds.

Changes:

- Wraps the benchmark CI step with `setarch x86_64 -R` to disable ASLR, and sets `BENCHMARK_REPETITIONS=5` and `BENCHMARK_MIN_TIME=2s` as environment variables
- Adds a `--benchmark-min-time` CLI argument to `check_performance_regression.py` and enables `--benchmark_enable_random_interleaving=true` when repetitions > 1
- Plumbs the new `BENCHMARK_REPETITIONS` and `BENCHMARK_MIN_TIME` environment variables through `run_test_suite.sh` to the Python benchmark script

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `.github/workflows/dtvm_evm_test_x86.yml` | Sets `BENCHMARK_REPETITIONS=5`, `BENCHMARK_MIN_TIME=2s`, and wraps benchmark run with `setarch x86_64 -R` |
| `.ci/run_test_suite.sh` | Reads new env vars, constructs `PERF_ARGS`, and passes them to all `check_performance_regression.py` invocations |
| `tools/check_performance_regression.py` | Adds `--benchmark-min-time` argument, passes it to Google Benchmark, and adds `--benchmark_enable_random_interleaving=true` |
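
The flag handling summarized above might look roughly like the sketch below. It is an assumption about how such a script could be organized, not the actual contents of `tools/check_performance_regression.py`: the function name `benchmark_flags` is hypothetical, and passing `--benchmark_repetitions` this way is assumed. The three `--benchmark_*` options are standard Google Benchmark flags.

```python
import argparse


def benchmark_flags(repetitions, min_time=None):
    """Build Google Benchmark flags for one run (hypothetical helper)."""
    flags = [f"--benchmark_repetitions={repetitions}"]
    if min_time:
        # Forward the minimum run time unchanged, e.g. "2s".
        flags.append(f"--benchmark_min_time={min_time}")
    if repetitions > 1:
        # Interleaving only matters when there are several repetitions.
        flags.append("--benchmark_enable_random_interleaving=true")
    return flags


parser = argparse.ArgumentParser()
# argparse exposes this option as `args.benchmark_min_time`.
parser.add_argument("--benchmark-min-time", default=None,
                    help="minimum time per benchmark, e.g. 2s")
args = parser.parse_args(["--benchmark-min-time", "2s"])

flags = benchmark_flags(repetitions=5, min_time=args.benchmark_min_time)
print(flags)
```

Returning a list rather than a joined string lets the caller append the flags to a `subprocess` argument vector without any further quoting.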

Copilot · 2026-03-05T09:50:21Z

.github/workflows/dtvm_evm_test_x86.yml

+          export BENCHMARK_REPETITIONS=5
+          export BENCHMARK_MIN_TIME=2s

-          bash .ci/run_test_suite.sh
+          setarch x86_64 -R bash .ci/run_test_suite.sh


The PR description only mentions disabling ASLR via setarch x86_64 -R, but the actual changes are broader:

- Adds `--benchmark_enable_random_interleaving=true` to the benchmark command
- Adds a new `--benchmark-min-time` CLI argument and plumbing
- Increases `BENCHMARK_REPETITIONS` from 3 to 5
- Sets `BENCHMARK_MIN_TIME=2s`

These are all reasonable variance-reduction measures, but the description should be updated to reflect the full scope of changes so reviewers and future readers understand the intent.

Copilot · 2026-03-05T09:50:21Z

.ci/run_test_suite.sh

@@ -208,15 +218,15 @@ for STACK_TYPE in ${STACK_TYPES[@]}; do
                echo "Running baseline benchmarks with library from base branch..."
                cp "$BENCHMARK_BASELINE_LIB"/libdtvmapi.so ./libdtvmapi.so
                SAVE_PATH=${BASELINE_CACHE:-/tmp/perf_baseline.json}
-                python3 check_performance_regression.py \
+                python3 check_performance_regression.py $PERF_ARGS \
                    --save-baseline "$SAVE_PATH" \
                    --lib ./libdtvmapi.so \
                    --mode "$BENCHMARK_MODE" \
                    --benchmark-dir test/evm-benchmarks/benchmarks

                echo "Running current benchmarks with PR library..."
                cp ../build/lib/libdtvmapi.so ./libdtvmapi.so
-                python3 check_performance_regression.py \
+                python3 check_performance_regression.py $PERF_ARGS \
                    --baseline "$SAVE_PATH" \
                    --threshold "$BENCHMARK_THRESHOLD" \
                    --output-summary "$BENCHMARK_SUMMARY_FILE" \
@@ -225,15 +235,15 @@ for STACK_TYPE in ${STACK_TYPES[@]}; do
                    --benchmark-dir test/evm-benchmarks/benchmarks
            elif [ -n "$BENCHMARK_SAVE_BASELINE" ]; then
                echo "Saving performance baseline..."
-                python3 check_performance_regression.py \
+                python3 check_performance_regression.py $PERF_ARGS \
                    --save-baseline "$BENCHMARK_SAVE_BASELINE" \
                    --output-summary "$BENCHMARK_SUMMARY_FILE" \
                    --lib ./libdtvmapi.so \
                    --mode "$BENCHMARK_MODE" \
                    --benchmark-dir test/evm-benchmarks/benchmarks
            elif [ -n "$BENCHMARK_BASELINE_FILE" ]; then
                echo "Checking performance regression against baseline..."
-                python3 check_performance_regression.py \
+                python3 check_performance_regression.py $PERF_ARGS \
                    --baseline "$BENCHMARK_BASELINE_FILE" \
                    --threshold "$BENCHMARK_THRESHOLD" \
                    --output-summary "$BENCHMARK_SUMMARY_FILE" \
@@ -242,7 +252,7 @@ for STACK_TYPE in ${STACK_TYPES[@]}; do
                    --benchmark-dir test/evm-benchmarks/benchmarks
            else
                echo "Running benchmark suite without comparison..."
-                python3 check_performance_regression.py \
+                python3 check_performance_regression.py $PERF_ARGS \


Environment variables such as BENCHMARK_REPETITIONS and BENCHMARK_MIN_TIME are concatenated into PERF_ARGS, which is then expanded unquoted in the python3 check_performance_regression.py $PERF_ARGS invocations in this block. An unquoted expansion undergoes word splitting and pathname expansion, so if an attacker can influence these variables (e.g., via CI inputs or environment configuration), a value containing whitespace or glob characters could inject extra arguments into the command. If the string were ever passed through eval, metacharacters such as ; or backticks would additionally allow arbitrary commands to execute in the CI environment. To mitigate this, strictly validate these variables (for example, ensuring the repetition count is numeric-only and the minimum time matches an expected pattern such as 2s) and pass them as separate, properly quoted arguments or via a shell array instead of constructing a raw argument string that is expanded unquoted.
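
The validate-then-pass-separately idea from this comment can be sketched in Python as follows. This is an illustration only, under two assumptions: the function name `validated_perf_args` is hypothetical, and `--repetitions` is a made-up option name, since the script's real option for repetitions is not shown in this PR. `--benchmark-min-time` is the option the PR adds.

```python
import re

# Accept only plain ASCII integers and durations such as "2s" or "0.5s".
REPETITIONS_RE = re.compile(r"[1-9][0-9]*")
MIN_TIME_RE = re.compile(r"[0-9]+(\.[0-9]+)?s")


def validated_perf_args(env):
    """Turn benchmark env vars into a list of separate CLI arguments."""
    args = []
    repetitions = env.get("BENCHMARK_REPETITIONS")
    if repetitions:
        if not REPETITIONS_RE.fullmatch(repetitions):
            raise ValueError(f"bad BENCHMARK_REPETITIONS: {repetitions!r}")
        # "--repetitions" is a hypothetical option name (assumption).
        args += ["--repetitions", repetitions]
    min_time = env.get("BENCHMARK_MIN_TIME")
    if min_time:
        if not MIN_TIME_RE.fullmatch(min_time):
            raise ValueError(f"bad BENCHMARK_MIN_TIME: {min_time!r}")
        args += ["--benchmark-min-time", min_time]
    return args


print(validated_perf_args({"BENCHMARK_REPETITIONS": "5",
                           "BENCHMARK_MIN_TIME": "2s"}))
```

Because each value stays its own list element, nothing is re-split or glob-expanded when the list is later passed to `subprocess.run`. A Bash array expanded as `"${PERF_ARGS[@]}"` gives the same property in the shell script itself.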

github-actions · 2026-03-05T09:50:44Z

⚡ Performance Regression Check Results

⚠️ Performance Regression Detected (interpreter)

No benchmark summary available.

⚠️ Performance Regression Detected (multipass)

No benchmark summary available.

starwarfan added 2 commits March 5, 2026 09:27

perf(ci): disable ASLR for benchmark to reduce result variance

bb73b88

Use setarch x86_64 -R to run evmone-bench with ASLR disabled, eliminating the 'ASLR is enabled' warning and reducing unreproducible noise in performance regression results. Made-with: Cursor

zoowii requested a review from Copilot March 5, 2026 09:43

Copilot started reviewing on behalf of zoowii March 5, 2026 09:44

starwarfan marked this pull request as draft March 5, 2026 09:46

starwarfan closed this Mar 5, 2026

starwarfan wants to merge 2 commits into DTVMStack:main from starwarfan:perf-ci-disable-aslr
