Performance Improvements for MI300X with GEMM and FP8 Enhancements#811
Open
chunfangamd wants to merge 3 commits into main
- Pin aiter and sgl-kernel to specific commits required by the
v0.5.8-rocm700-mi30x image.
- This patch is intended to work only with the image
lmsysorg/sglang:v0.5.8-rocm700-mi30x.
- Joint work with Zhentao Chen.
The previous aiter ref (9046b6f) changed get_mla_metadata_v1 to expect a Tensor for kv_last_page_lens, but the image's sglang still passed an int, crashing during cuda graph capture. Fix by fresh-cloning aiter at d2ca5a89, pinning sgl-kernel to 8bd6447 (now at sglang/sgl-kernel), and uninstalling stale packages before rebuilding to avoid leftover C extension conflicts.
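The order of operations described above (uninstall stale packages, fresh-clone, check out the pinned commit, rebuild) can be sketched as a dry-run plan. This is only an illustration, not the script in this PR: the commit hashes come from the description, while the clone URLs, pip package names, working directory, and the plain `pip install` step are assumptions, and `PinnedRepo` and `rebuild_plan` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PinnedRepo:
    package: str      # pip distribution name (assumed)
    url: str          # clone URL (assumed; the PR gives no URLs)
    commit: str       # commit hash quoted in the PR description
    subdir: str = ""  # path inside the clone that holds the package


# Commits come from the PR text; URLs and package names are assumptions.
PINS = [
    PinnedRepo("aiter", "https://github.com/ROCm/aiter.git", "d2ca5a89"),
    PinnedRepo("sgl-kernel", "https://github.com/sgl-project/sglang.git",
               "8bd6447", subdir="sgl-kernel"),
]


def rebuild_plan(pins, workdir="/tmp/pins"):
    """Return the shell commands in execution order, without running them."""
    steps = []
    # 1. Uninstall every stale package first so old C extensions cannot linger.
    for pin in pins:
        steps.append(f"pip uninstall -y {pin.package}")
    # 2. Fresh-clone each repo, check out the pinned commit, then install.
    for pin in pins:
        dest = f"{workdir}/{pin.package}"
        steps.append(f"rm -rf {dest}")
        steps.append(f"git clone {pin.url} {dest}")
        steps.append(f"git -C {dest} checkout {pin.commit}")
        src = f"{dest}/{pin.subdir}" if pin.subdir else dest
        # Placeholder install step; the real build flags are not in the PR.
        steps.append(f"pip install {src}")
    return steps


if __name__ == "__main__":
    print("\n".join(rebuild_plan(PINS)))
```

Running all uninstalls before any clone is the point of the sketch: it keeps a stale build of one package from being picked up while the other is rebuilt. The actual build may also need submodule checkouts or ROCm-specific flags that are not shown here.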
functionstackx (Contributor) requested changes on Feb 26, 2026:
Can y'all update sglang with the latest aiter, and we wait until the next sgl release image before updating inferenceMax? Ideally we wanna track actual images and not patchwork.
Collaborator
@chunfangamd So the idea is that this will land in SGLang 0.10.0?
By patching aiter/sgl-kernel versions for MI300X FP8 DSR1 SGLang, we include three improvements:
e2e Tests: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/22400655865