msaroufim commented on Feb 7, 2026

Summary

  • Fixes a vulnerability where submissions could cache results based on Python object identity (id(tensor))

Changes

  1. Clone data before each timing iteration (outside the timed region) - gives each iteration fresh object identities without affecting the measured kernel time
  2. Use a local seed variable instead of mutating test.args["seed"] - avoids shared mutable state

The benchmark harness was vulnerable to submissions that cache results
based on Python object identity (e.g., id(tensor)). Since the same
data objects were reused across all timing iterations, a submission
could cache on the first call and return the cached results on
subsequent calls, showing artificial speedups of 12-36%.
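For illustration, a submission exploiting the reused objects could look roughly like the sketch below; custom_kernel and the matmul are hypothetical stand-ins, not code from any actual submission:

```python
# Illustrative only: memoizing on id() of the input tensor.
# Because the old harness passed the same tensor objects on every
# timing iteration, the expensive work runs only once.
_cache = {}

def custom_kernel(x):                # hypothetical submission entry point
    key = id(x)
    if key not in _cache:
        _cache[key] = x @ x.T        # stand-in for the real kernel work
    return _cache[key]               # later iterations return instantly
```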

Changes:
- Clone data before each timing iteration (outside the timed region)
  so each iteration sees fresh object identities without affecting the
  measured kernel time
- Use a local seed variable instead of mutating test.args["seed"] to
  avoid shared mutable state between benchmark runs (both changes are
  sketched below)
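A minimal sketch of the hardened timing loop under these two changes, assuming PyTorch CUDA events; data, custom_kernel, and num_iterations are placeholders, not the harness's actual identifiers:

```python
import torch

seed = 1234                          # local seed; test.args["seed"] is never mutated
times = []
for _ in range(num_iterations):
    # Clone outside the timed region: fresh object identities each
    # iteration, issued before synchronize so the GPU copies can overlap
    # with the previous iteration's tail work.
    fresh = [t.clone() for t in data]
    torch.cuda.synchronize()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    out = custom_kernel(*fresh)      # only the kernel call is timed
    end.record()
    torch.cuda.synchronize()
    times.append(start.elapsed_time(end))
```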
Additional hardening on top of the object-identity caching fix:

- Shuffle the data order on each timing iteration to prevent call-count
  caching (a submission could track its invocation count and predict
  which data item appears at each position)
- Move the clone before torch.cuda.synchronize() so the clone's GPU
  copies can overlap with the previous iteration's tail work
- Fix a pre-existing recheck bug where only the last item's correctness
  was checked (the if not good check sat outside the for loop)
- Use shuffle_order indices to correctly pair shuffled outputs with
  their reference data during recheck (sketched below)
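A rough sketch of how the shuffling and the recheck pairing fit together; data, refs, and custom_kernel again stand in for the harness's real names:

```python
import random
import torch

shuffle_order = list(range(len(data)))
random.shuffle(shuffle_order)        # new presentation order each timing pass

outputs = [custom_kernel(*data[idx]) for idx in shuffle_order]

good = True
for pos, idx in enumerate(shuffle_order):
    # Check every item inside the loop (the old code effectively checked
    # only the last one) and pair each shuffled output with the reference
    # it was actually computed from.
    if not torch.allclose(outputs[pos], refs[idx]):
        good = False
        break
```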