Pull requests: NVIDIA/TransformerEngine
[Core] Fix MXFP8 grouped quantize for zero-sized groups in update_tma_descriptors
2.14.0
bug
Something isn't working
#2782
opened Mar 18, 2026 by
jberchtold-nvidia
8 of 13 tasks
[PyT] Install pytest in onnx L1 test as Pyt container no longer packages it
2.14.0
#2781
opened Mar 18, 2026 by
KshitijLakhani
4 of 13 tasks
Fused Adam Support for MXFP8 + FSDP2 integration
#2780
opened Mar 18, 2026 by
vthumbe1503
13 tasks
Enable fused RMSNorm dLN + add through CUDNN
#2778
opened Mar 18, 2026 by
CarlosGomes98
1 of 13 tasks
[fused_router][pytorch] Optimize naive topk path and add perf benchmark
#2776
opened Mar 18, 2026 by
XiaomingFun233
add blackwell support filter for 9.7<=cudnn<9.18.1
2.14.0
#2775
opened Mar 17, 2026 by
sudhakarsingh27
13 tasks
[PyTorch] Add an API restore from function context to ensure tensors are detached
#2772
opened Mar 17, 2026 by
kainzhong
7 of 13 tasks
add mark_not_offload() interface for cpu_offload_v1
#2770
opened Mar 17, 2026 by
lhb8125
13 tasks
GEMM + Swiglu fused Grouped MLP for MXFP8
2.14.0
MoE
#2769
opened Mar 17, 2026 by
ksivaman
13 tasks
[Draft]Support for score_mod and score_mod_bprop in cuDNN's sdpa
#2767
opened Mar 16, 2026 by
vcherepanov-nv
2 of 13 tasks
[PyTorch] transformer_engine.pytorch.autocast support inside torch.compile
#2759
opened Mar 13, 2026 by
pggPL
4 of 26 tasks
[JAX] Grouped GEMM Refactor to use first_dims and last_dims
#2749
opened Mar 10, 2026 by
jberchtold-nvidia
1 of 13 tasks
[Common] Persistent Grouped NVFP4 quantization kernel
#2743
opened Mar 6, 2026 by
Oleg-Goncharov
Draft
8 of 13 tasks
[Common] Persistent Grouped MXFP8 quantization kernel
enhancement
New feature or request
MoE
#2738
opened Mar 5, 2026 by
Oleg-Goncharov
9 of 13 tasks
Feat/cp nvshmem enhanced
community-contribution
PRs from external contributor outside the core maintainers, representing community-driven work.
#2737
opened Mar 5, 2026 by
Knight-of-Thunder
1 of 13 tasks
Feature/unswizzle
community-contribution
#2732
opened Mar 4, 2026 by
int-smart
9 of 13 tasks
fix: scope get_full_cu_seqlens cache key by device and inference mode
#2728
opened Mar 3, 2026 by
DmCarpe93
8 of 13 tasks
[Common, pyTorch] Grouped MXFP8 dequantize support
#2722
opened Mar 2, 2026 by
ptrendx
1 of 13 tasks