Add reconvergence tests by luciechoi · Pull Request #935 · llvm/offload-test-suite

luciechoi · 2026-03-04T21:15:38Z

The plan is to trigger the pipeline and check all the failing tests to add XFAIL statements.

The failing tests should be investigated, added to the FailedTests.yaml, and have appropriate TODOs. Until this work is done, I plan to have the test pipeline pass regardless of the reconvergence test results.

Due to a known issue where WARP preview workflow stalls the machine indefinitely (#772) this PR disables the WARP preview builds from running automatically when modified by a PR. PRs like #935 and #769 have popped up to update/modify the WARP preview workflows and caused the machine to stall -- preventing it from running its usual scheduled jobs or other PRs.

Icohedron · 2026-03-13T22:04:18Z

Please update your branch so you don't accidentally stall one of the CI machines (see #970 for details)

Keenuts

Need to spend more time on the ballot and generation implem. Initial comments to begin with.

Keenuts · 2026-03-17T09:59:34Z

lib/API/MTL/MTLDevice.cpp

+        Err->release();
+      return 0;
+    }
+    uint32_t SubgroupSize = PSO->threadExecutionWidth();


This looks incorrect for Metal: When you create a compute pipeline state, it calculates the maximum number of threads available on the device. This value never changes, but may be different for different pipeline objects. ( https://developer.apple.com/documentation/metal/mtlcomputepipelinestate/maxtotalthreadsperthreadgroup )
It's not on the threadExecutionWidth property directly, but on the associated maxTotalThreadsPerThreadgroup, and explains why it's on the pipeline state instead of a device state (register pressure might impact it). Hence we can deduce the threadExecutionWidth is also not guaranteed to be stable across pipelines.

This should be added to the Device.h getMinMaxSubgroupSize() function: say that this in an hint for Metal at least.

Keenuts · 2026-03-17T10:14:39Z

lib/API/VK/Device.cpp

                                         VkBufferUsageFlags Usage,
                                         VkMemoryPropertyFlags MemoryFlags,
                                         size_t Size, void *Data = nullptr) {
+    const VkDeviceSize AllocationSize = std::max<size_t>(Size, 1);


Why are we calling createBuffer with a size == 0?
Shouldn't this be an error instead? (like calling malloc(0) is right)

Keenuts · 2026-03-17T10:14:44Z

lib/API/VK/Device.cpp

                                         VkBufferUsageFlags Usage,
                                         VkMemoryPropertyFlags MemoryFlags,
                                         size_t Size, void *Data = nullptr) {
+    const VkDeviceSize AllocationSize = std::max<size_t>(Size, 1);


Why are we calling createBuffer with a size == 0?
Shouldn't this be an error instead? (like calling malloc(0) is right)

Keenuts · 2026-03-17T10:16:13Z

lib/API/VK/Device.cpp

+    if (Data && Size > 0) {
      void *Dst = nullptr;
-      if (vkMapMemory(IS.Device, Memory, 0, VK_WHOLE_SIZE, 0, &Dst))
+      if (vkMapMemory(IS.Device, Memory, 0, AllocationSize, 0, &Dst))


Why not sticking with VK_WHOLE_SIZE?

Keenuts · 2026-03-17T10:24:57Z

lib/API/VK/Device.cpp

    }

-    ResourceBundle Bundle{getDescriptorType(R.Kind), R.size(), R.BufferPtr};
+    const size_t LogicalSize = R.size();


Seems like if the buffer size is zero, we should fail earlier or ignore it:

size() should be computed when we set the buffer data (Data: or FillSize:)
Hence the size should at least be large enough to hold the getElementSize, hence this change could be reverted.

Keenuts · 2026-03-17T12:37:01Z

tools/TestGenerator/reconvergence/main.cc

+  if (MaxNestingLevel < 2) {
+    std::cerr << "Error: MaxNestingLevel must be >= 2" << std::endl;
+    return 1;
+  }
+
+  std::vector<uint32_t> Counts;
+  std::stringstream CountsStream(CountsPerLevel);
+  std::string CountItem;
+  while (std::getline(CountsStream, CountItem, ',')) {
+    Counts.push_back(std::stoi(CountItem));
+  }
+
+  // Index by nesting level. Levels 0 and 1 are intentionally unused.
+  std::vector<uint32_t> TestsCountPerLevel(MaxNestingLevel + 1, 0);
+  if (Counts.size() == 1) {
+    for (uint32_t Level = 2; Level <= MaxNestingLevel; ++Level) {
+      TestsCountPerLevel[Level] = Counts[0];
+    }
+  } else {
+    // Expect one count per level from 2 to MaxNestingLevel
+    const size_t Expected = MaxNestingLevel - 1;
+    if (Counts.size() != Expected) {
+      std::cerr << "Error: Expected " << Expected << " counts (for levels 2.."
+                << MaxNestingLevel << "), got " << Counts.size() << std::endl;
+      return 1;
+    }
+    for (size_t CountIndex = 0; CountIndex < Counts.size(); ++CountIndex) {
+      TestsCountPerLevel[static_cast<uint32_t>(CountIndex) + 2] =
+          Counts[CountIndex];
+    }
+  }


Suggested change

if (MaxNestingLevel < 2) {

std::cerr << "Error: MaxNestingLevel must be >= 2" << std::endl;

return 1;

}

std::vector<uint32_t> Counts;

std::stringstream CountsStream(CountsPerLevel);

std::string CountItem;

while (std::getline(CountsStream, CountItem, ',')) {

Counts.push_back(std::stoi(CountItem));

}

// Index by nesting level. Levels 0 and 1 are intentionally unused.

std::vector<uint32_t> TestsCountPerLevel(MaxNestingLevel + 1, 0);

if (Counts.size() == 1) {

for (uint32_t Level = 2; Level <= MaxNestingLevel; ++Level) {

TestsCountPerLevel[Level] = Counts[0];

}

} else {

// Expect one count per level from 2 to MaxNestingLevel

const size_t Expected = MaxNestingLevel - 1;

if (Counts.size() != Expected) {

std::cerr << "Error: Expected " << Expected << " counts (for levels 2.."

<< MaxNestingLevel << "), got " << Counts.size() << std::endl;

return 1;

}

for (size_t CountIndex = 0; CountIndex < Counts.size(); ++CountIndex) {

TestsCountPerLevel[static_cast<uint32_t>(CountIndex) + 2] =

Counts[CountIndex];

}

}

std::vector<uint32_t> Counts;

{

std::stringstream CountsStream(CountsPerLevel);

std::string CountItem;

while (std::getline(CountsStream, CountItem, ',')) {

Counts.push_back(std::stoi(CountItem));

}

}

if (MaxNestingLevel < 2) {

std::cerr << "Error: MaxNestingLevel must be >= 2" << std::endl;

return 1;

}

if (Counts.size() != 1 || Counts.size() != MaxNestingLevel - 1) {

std::cerr << "Error: 1 or max_nesting_level - 1 counts must be given.";

return 1;

}

assert(MaxNestingLevel >= 2);

assert(Counts.size() == 1 || Counts.size() == MaxNestingLevel - 1);

std::vector<uint32_t> TestsCountPerLevel;

if (Counts.size() == 1)

TestsCountPerLevel.assign(MaxNestingLevel - 1, 0);

else

TestsCountPerLevel.assign(Counts.begin(), Counts.end());

// Level 0 and 1 are set to 0 because X Y Z

TestsCountPerLevel.insert(0, 0);

TestsCountPerLevel.insert(0, 0);

Keenuts · 2026-03-17T12:37:26Z

tools/TestGenerator/reconvergence/main.cc

+                                        /*ThreadgroupSizeX=*/7,
+                                        /*ThreadgroupSizeY=*/13);


What are those values?

Keenuts · 2026-03-17T12:37:51Z

tools/TestGenerator/reconvergence/MaskUtils.h

+
+typedef std::bitset<128> bitset_inv_t;
+
+inline Ballot waveSizeToMask(uint32_t waveSize, uint32_t /*waveCount*/) {


remove unused param

Keenuts · 2026-03-17T12:40:05Z

tools/TestGenerator/reconvergence/Ballot.h

+  }
+
+protected:
+  uint32_t m_bits;


What it this?

Keenuts · 2026-03-17T12:50:49Z

tools/TestGenerator/reconvergence/main.cc

+  reconvergence::ReconvergenceTestGenerator TestGenerator(OutputDir,
+                                                          TestExpectations);
+
+  // Wave size must be less than or equal to 128 with current ballot


Your bitset to ballot implem works on a uint64, meaning max wave size is 64, otherwise we might have issues.

luciechoi added 26 commits December 15, 2025 13:14

Sample run from randomly generated reconvergence tests

c2dd533

Add tests for 32 subgroup size

bc9a7a8

Add tests for subgroup size 16

32637d6

Fix the fill size

c525504

Remove logging for expected output

0b6527d

Add tests for subgroup sie 4

fb87fc1

Fix hlsl

4f7f3ff

Add vulkan subgroupsize restriction

458c5f0

Add subgroupsize restriction for other targets

d11a3e6

Test workflow run

791b196

Merge branch 'main' into reconvergence-workflow

9b2651f

Trigger pipeline

a85c06d

Add CMake target and clean up test files

6fd8145

Merge branch 'main' into reconvergence-workflow

f899528

Change target name and trigger on PR

d722c1b

Fix compiler errors and warnings

35cfe62

Update random number generator to use llvm's

b99a4f8

Formatting and clean up unused functions

c56c357

Update test output directory and take input arguments.

e8258a0

Segment expected output buffer

5248e03

Merge branch 'main' into reconvergence-testing

4cac424

Cleanups

1bdec28

Formatting error

2db0eef

Address pipeline run regression

6492216

Add LLVM license header

c12c356

Temporarily disable logging expected vs actual values

e404b77

luciechoi force-pushed the reconvergence-testing branch from 2b73168 to e404b77 Compare March 4, 2026 22:15

luciechoi added 3 commits March 12, 2026 19:11

Fix compiler warnings

a295769

Add skipping yaml and parser

9ebcc6d

Fix zero filled buffer allocation error. Add diff reporting suppression

1fcac66

luciechoi added 8 commits March 12, 2026 21:30

Skip printing pipeline configuration

099acdf

Fix sign int conversion warning

7c61055

replace lvalue reference with &

8e6b9f8

Replace glsl terms

f95c683

Cleanups

9248514

Fix array element size error

51c5221

More cleanups

ae28de8

Separate probabilities

2827776

luciechoi changed the title ~~[Draft] Add reconvergence tests~~ Add reconvergence tests Mar 13, 2026

luciechoi requested review from Keenuts and s-perron March 13, 2026 20:05

Icohedron mentioned this pull request Mar 13, 2026

Prevent WARP preview workflows from running automatically #970

Merged

luciechoi added 11 commits March 13, 2026 15:24

Merge branch 'main' into reconvergence-testing

c968bb7

Fix WARP wave size

7e51012

Fix CI matrix doing cartesian product over targets

66dc263

Merge branch 'main' into reconvergence-testing

9a21107

XFAIL 64 wave reconvergence tests on WARP DXC

a7dfcb7

DirectX && QC && DXC && WARP

f9bac04

DirectX && NV && DXC

f60d0cc

DirectX && ARM64 && WARP && DXC

43bc6c8

Update Metal && DXC failure

afed21c

DirectX && Intel && DXC

3db1bec

Bring back warp possible wave sizes

b810e8d

Keenuts requested changes Mar 17, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add reconvergence tests#935

Add reconvergence tests#935
luciechoi wants to merge 48 commits intollvm:mainfrom
luciechoi:reconvergence-testing

luciechoi commented Mar 4, 2026 •

edited

Loading

Uh oh!

Icohedron commented Mar 13, 2026

Uh oh!

Keenuts left a comment

Uh oh!

Keenuts Mar 17, 2026

Uh oh!

Keenuts Mar 17, 2026

Uh oh!

Keenuts Mar 17, 2026

Uh oh!

Keenuts Mar 17, 2026

Uh oh!

Keenuts Mar 17, 2026

Uh oh!

Keenuts Mar 17, 2026

Uh oh!

Keenuts Mar 17, 2026

Uh oh!

Keenuts Mar 17, 2026

Uh oh!

Keenuts Mar 17, 2026

Uh oh!

Keenuts Mar 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants


		typedef std::bitset<128> bitset_inv_t;

		inline Ballot waveSizeToMask(uint32_t waveSize, uint32_t /waveCount/) {

Conversation

luciechoi commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Icohedron commented Mar 13, 2026

Uh oh!

Keenuts left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

luciechoi commented Mar 4, 2026 •

edited

Loading