Skip to content

HipIGBADetector: use UniformConstant for __chip_module_has_no_IGBAs#1149

Closed
pvelesko wants to merge 1 commit intomainfrom
cherry-igba-uniformconstant
Closed

HipIGBADetector: use UniformConstant for __chip_module_has_no_IGBAs#1149
pvelesko wants to merge 1 commit intomainfrom
cherry-igba-uniformconstant

Conversation

@pvelesko
Copy link
Copy Markdown
Collaborator

Cherry-pick of 0bc8d63204afb3ec02a6d9d3f985913ef06c5995 onto main.

Original commit: HipIGBADetector: use UniformConstant for __chip_module_has_no_IGBAs

Initialize in pass, immutable at runtime (runtime only reads it).
CrossWorkgroup is for mutable device memory.
Comment on lines +123 to +126
// UniformConstant is correct: we initialize it in this pass, and at
// runtime it is immutable (the runtime only reads it). CrossWorkgroup
// is for mutable device memory.
SPIRV_UNIFORMCONSTANT_AS);
Copy link
Copy Markdown
Collaborator

@linehill linehill Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the runtime only reads it" statement is ambiguous: Does it mean the host driver reads it? And/or the kernel? The statement is true for the former but not for the latter.

Switching storage class from CrossWorkgroup to UniformConstant is concerning here because the space for the latter can be scarce (e.g. 64kb on RTX 3080 through OpenCL). This can have increased change for launch failures for users already consuming most of the constant memory.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the change here is motivated by an issue mentioned here:

CrossWorkgroup variables with initializers may be optimized or lowered incorrectly in the in-tree SPIR-V backend, producing invalid forward references.

Please, open a bug report on llvm-project and add link to the comment.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a single variable placed in const memory is const mem space really a concern here? Reading from const memory should be faster and the space required is negligible, no? @linehill

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a single variable placed in const memory is const mem space really a concern here? Reading from const memory should be faster ...

It won't be faster in a way it matters. The variable is not read by kernels so there is no benefit placing it in constant memory. It's only read by SPIR-V parser from the SPIR-V bitcode once per module.

... and the space required is negligible, no? @linehill

I'd agree if it is placed in CrossWorkgroup but not if placed in constant memory - 64kb in the RTX3080 is not much even if the variable is fraction of it. You are not taking account an user might use up all the available scarce constant memory space which works fine until this change here. That said, it might not matter if the kernel compiler is smart enough to eliminate the variable out since it is not read by any kernel (or, at least, is not supposed to).

@pvelesko
Copy link
Copy Markdown
Collaborator Author

You are not taking account an user might use up all the available scarce constant memory space which works fine until this change here

Okay yeah that makes sense. I'll double check if there are issues with crossworkgroup + llvm 22 + native

@pvelesko pvelesko closed this Feb 19, 2026
@pvelesko pvelesko deleted the cherry-igba-uniformconstant branch February 19, 2026 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants