Skip to content

Add reconvergence tests#935

Open
luciechoi wants to merge 48 commits intollvm:mainfrom
luciechoi:reconvergence-testing
Open

Add reconvergence tests#935
luciechoi wants to merge 48 commits intollvm:mainfrom
luciechoi:reconvergence-testing

Conversation

@luciechoi
Copy link
Contributor

@luciechoi luciechoi commented Mar 4, 2026

Part of llvm/llvm-project#136930

The plan is to trigger the pipeline and check all the failing tests to add XFAIL statements.

The failing tests should be investigated, added to the FailedTests.yaml, and have appropriate TODOs. Until this work is done, I plan to have the test pipeline pass regardless of the reconvergence test results.

@luciechoi luciechoi force-pushed the reconvergence-testing branch from 2b73168 to e404b77 Compare March 4, 2026 22:15
@luciechoi luciechoi changed the title [Draft] Add reconvergence tests Add reconvergence tests Mar 13, 2026
@luciechoi luciechoi requested review from Keenuts and s-perron March 13, 2026 20:05
Icohedron added a commit that referenced this pull request Mar 13, 2026
Due to a known issue where WARP preview workflow stalls the machine
indefinitely (#772)
this PR disables the WARP preview builds from running automatically when
modified by a PR.

PRs like #935 and
#769 have popped up to
update/modify the WARP preview workflows and caused the machine to stall
-- preventing it from running its usual scheduled jobs or other PRs.
@Icohedron
Copy link
Contributor

Please update your branch so you don't accidentally stall one of the CI machines (see #970 for details)

Copy link
Contributor

@Keenuts Keenuts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to spend more time on the ballot and generation implem. Initial comments to begin with.

Err->release();
return 0;
}
uint32_t SubgroupSize = PSO->threadExecutionWidth();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks incorrect for Metal: When you create a compute pipeline state, it calculates the maximum number of threads available on the device. This value never changes, but may be different for different pipeline objects. ( https://developer.apple.com/documentation/metal/mtlcomputepipelinestate/maxtotalthreadsperthreadgroup )
It's not on the threadExecutionWidth property directly, but on the associated maxTotalThreadsPerThreadgroup, and explains why it's on the pipeline state instead of a device state (register pressure might impact it). Hence we can deduce the threadExecutionWidth is also not guaranteed to be stable across pipelines.

This should be added to the Device.h getMinMaxSubgroupSize() function: say that this in an hint for Metal at least.

VkBufferUsageFlags Usage,
VkMemoryPropertyFlags MemoryFlags,
size_t Size, void *Data = nullptr) {
const VkDeviceSize AllocationSize = std::max<size_t>(Size, 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we calling createBuffer with a size == 0?
Shouldn't this be an error instead? (like calling malloc(0) is right)

VkBufferUsageFlags Usage,
VkMemoryPropertyFlags MemoryFlags,
size_t Size, void *Data = nullptr) {
const VkDeviceSize AllocationSize = std::max<size_t>(Size, 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we calling createBuffer with a size == 0?
Shouldn't this be an error instead? (like calling malloc(0) is right)

if (Data && Size > 0) {
void *Dst = nullptr;
if (vkMapMemory(IS.Device, Memory, 0, VK_WHOLE_SIZE, 0, &Dst))
if (vkMapMemory(IS.Device, Memory, 0, AllocationSize, 0, &Dst))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not sticking with VK_WHOLE_SIZE?

}

ResourceBundle Bundle{getDescriptorType(R.Kind), R.size(), R.BufferPtr};
const size_t LogicalSize = R.size();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like if the buffer size is zero, we should fail earlier or ignore it:

  • size() should be computed when we set the buffer data (Data: or FillSize:)
    Hence the size should at least be large enough to hold the getElementSize, hence this change could be reverted.

Comment on lines +124 to +154
if (MaxNestingLevel < 2) {
std::cerr << "Error: MaxNestingLevel must be >= 2" << std::endl;
return 1;
}

std::vector<uint32_t> Counts;
std::stringstream CountsStream(CountsPerLevel);
std::string CountItem;
while (std::getline(CountsStream, CountItem, ',')) {
Counts.push_back(std::stoi(CountItem));
}

// Index by nesting level. Levels 0 and 1 are intentionally unused.
std::vector<uint32_t> TestsCountPerLevel(MaxNestingLevel + 1, 0);
if (Counts.size() == 1) {
for (uint32_t Level = 2; Level <= MaxNestingLevel; ++Level) {
TestsCountPerLevel[Level] = Counts[0];
}
} else {
// Expect one count per level from 2 to MaxNestingLevel
const size_t Expected = MaxNestingLevel - 1;
if (Counts.size() != Expected) {
std::cerr << "Error: Expected " << Expected << " counts (for levels 2.."
<< MaxNestingLevel << "), got " << Counts.size() << std::endl;
return 1;
}
for (size_t CountIndex = 0; CountIndex < Counts.size(); ++CountIndex) {
TestsCountPerLevel[static_cast<uint32_t>(CountIndex) + 2] =
Counts[CountIndex];
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (MaxNestingLevel < 2) {
std::cerr << "Error: MaxNestingLevel must be >= 2" << std::endl;
return 1;
}
std::vector<uint32_t> Counts;
std::stringstream CountsStream(CountsPerLevel);
std::string CountItem;
while (std::getline(CountsStream, CountItem, ',')) {
Counts.push_back(std::stoi(CountItem));
}
// Index by nesting level. Levels 0 and 1 are intentionally unused.
std::vector<uint32_t> TestsCountPerLevel(MaxNestingLevel + 1, 0);
if (Counts.size() == 1) {
for (uint32_t Level = 2; Level <= MaxNestingLevel; ++Level) {
TestsCountPerLevel[Level] = Counts[0];
}
} else {
// Expect one count per level from 2 to MaxNestingLevel
const size_t Expected = MaxNestingLevel - 1;
if (Counts.size() != Expected) {
std::cerr << "Error: Expected " << Expected << " counts (for levels 2.."
<< MaxNestingLevel << "), got " << Counts.size() << std::endl;
return 1;
}
for (size_t CountIndex = 0; CountIndex < Counts.size(); ++CountIndex) {
TestsCountPerLevel[static_cast<uint32_t>(CountIndex) + 2] =
Counts[CountIndex];
}
}
std::vector<uint32_t> Counts;
{
std::stringstream CountsStream(CountsPerLevel);
std::string CountItem;
while (std::getline(CountsStream, CountItem, ',')) {
Counts.push_back(std::stoi(CountItem));
}
}
if (MaxNestingLevel < 2) {
std::cerr << "Error: MaxNestingLevel must be >= 2" << std::endl;
return 1;
}
if (Counts.size() != 1 || Counts.size() != MaxNestingLevel - 1) {
std::cerr << "Error: 1 or max_nesting_level - 1 counts must be given.";
return 1;
}
assert(MaxNestingLevel >= 2);
assert(Counts.size() == 1 || Counts.size() == MaxNestingLevel - 1);
std::vector<uint32_t> TestsCountPerLevel;
if (Counts.size() == 1)
TestsCountPerLevel.assign(MaxNestingLevel - 1, 0);
else
TestsCountPerLevel.assign(Counts.begin(), Counts.end());
// Level 0 and 1 are set to 0 because X Y Z
TestsCountPerLevel.insert(0, 0);
TestsCountPerLevel.insert(0, 0);

Comment on lines +174 to +175
/*ThreadgroupSizeX=*/7,
/*ThreadgroupSizeY=*/13);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are those values?


typedef std::bitset<128> bitset_inv_t;

inline Ballot waveSizeToMask(uint32_t waveSize, uint32_t /*waveCount*/) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove unused param

}

protected:
uint32_t m_bits;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What it this?

reconvergence::ReconvergenceTestGenerator TestGenerator(OutputDir,
TestExpectations);

// Wave size must be less than or equal to 128 with current ballot
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your bitset to ballot implem works on a uint64, meaning max wave size is 64, otherwise we might have issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants