
Support PaddlePaddle with compatible API and tvm-ffi#2

Closed
SigureMo wants to merge 10 commits into main from support-paddlepaddle-with-compatible-api-and-tvmffi
Closed

Conversation

@SigureMo

@SigureMo SigureMo commented Oct 2, 2025

A new approach to replace #1. This PR needs PaddlePaddle/Paddle#75651 and PaddlePaddle/Paddle#75650.

@SigureMo SigureMo marked this pull request as draft October 2, 2025 07:19
@SigureMo SigureMo force-pushed the support-paddlepaddle-with-compatible-api-and-tvmffi branch from fd12761 to 6299602 Compare October 13, 2025 02:32
@SigureMo SigureMo force-pushed the support-paddlepaddle-with-compatible-api-and-tvmffi branch from a038d38 to 955aedf Compare October 24, 2025 11:47
@SigureMo SigureMo closed this Dec 13, 2025
@SigureMo SigureMo deleted the support-paddlepaddle-with-compatible-api-and-tvmffi branch December 13, 2025 19:44
BingooYang pushed a commit that referenced this pull request May 8, 2026
<!-- .github/pull_request_template.md -->

## 📌 Description

This PR fixes the following bug:
When the CuteDSL MoE kernels were ported from TensorRT-LLM to
FlashInfer, the `mPtrPermutedIdxToExpandedIdx` field was accidentally
dropped from the routing kernel's `DataBase` struct in `RoutingKernel.h`.
TRT-LLM's routing kernel produces three reverse-mapping outputs:

1. `mPtrExpandedIdxToPermutedIdx[expandedIdx] = permutedIdx` — forward
mapping
2. `mPtrPermutedIdxToExpandedIdx[permutedIdx] = expandedIdx` — reverse to
expanded index (`token_idx * topk + k`)
3. `mPtrPermutedIdxToTokenIdx[permutedIdx] = tokenIdx` — reverse to token
index only
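The relationship between the three mappings can be sketched with a toy example. Everything below is illustrative: `topk`, the token count, and the permutation are made up, and the Python names simply mirror the TRT-LLM field names described above.

```python
# Toy illustration of the three routing mappings (hypothetical values).
topk = 2
num_tokens = 3

# expandedIdx enumerates (token, k) pairs: expandedIdx = tokenIdx * topk + k
# A made-up permutation, as routing might produce when grouping by expert:
permutation = [3, 0, 4, 1, 5, 2]  # permutedIdx -> expandedIdx

# Mapping #2: permutedIdx -> expandedIdx (the field the port dropped)
permuted_idx_to_expanded_idx = permutation

# Mapping #1: expandedIdx -> permutedIdx (the inverse of #2)
expanded_idx_to_permuted_idx = [0] * len(permutation)
for permuted_idx, expanded_idx in enumerate(permutation):
    expanded_idx_to_permuted_idx[expanded_idx] = permuted_idx

# Mapping #3: permutedIdx -> tokenIdx, which discards the top-k slot k
permuted_idx_to_token_idx = [e // topk for e in permutation]
```

Note that #3 is lossy: two permuted slots that belong to the same token collapse to the same value, so #2 cannot be reconstructed from it.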

FlashInfer's port kept only #1 and #3, dropping #2. The binding in
`moe_utils_binding.cu` then had to wire the Python buffer
`permuted_idx_to_expanded_idx` to the only available reverse-mapping field,
`mPtrPermutedIdxToTokenIdx`, which writes plain `tokenIdx` instead of
`expandedIdx`.

### The Impact

The CuteDSL kernels (GEMM1 gather, `moe_output_memset`, GEMM2 finalize)
all expect expanded indices and derive the token index via
`expanded_idx // topk`. When they received plain `tokenIdx` instead, they
computed `tokenIdx // topk` — yielding the wrong A row for gather, the
wrong zero-init for memset, and the wrong scatter position plus the wrong
routing scale for finalize.
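The failure mode above can be shown in a few lines. This is a minimal sketch with assumed values (`topk = 2`, one arbitrary expanded index), not the kernel code itself:

```python
# Sketch of the bug: consumers divide by topk expecting an expandedIdx.
topk = 2

# Token 2, top-k slot k = 1, so expandedIdx = 2 * topk + 1 = 5
expanded_idx = 5
token_idx = expanded_idx // topk  # correct consumer behavior: token 2

# Buggy path: the binding wired tokenIdx into the expandedIdx buffer,
# so the consumer receives 2 where it expects 5 ...
wrongly_received = token_idx
# ... and divides again, landing on the wrong token entirely.
wrong_token = wrongly_received // topk  # token 1, not token 2
```

Because the double division lands on a different (but still valid) token index, the kernels read and write real rows, just the wrong ones, which explains why this surfaced as accuracy degradation rather than a crash.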

<!-- What does this PR do? Briefly describe the changes and why they’re
needed. -->

## 🔍 Related Issues

<!-- Link any related issues here -->

## 🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull
request, please make sure the following items are complete.

### ✅ Pre-commit Checks

- [ ] I have installed `pre-commit` by running `pip install pre-commit`
(or used your preferred method).
- [ ] I have installed the hooks with `pre-commit install`.
- [ ] I have run the hooks manually with `pre-commit run --all-files`
and fixed any reported issues.

> If you are unsure about how to set up `pre-commit`, see [the
pre-commit documentation](https://pre-commit.com/).

## 🧪 Tests

- [ ] Tests have been added or updated as needed.
- [ ] All tests are passing (`unittest`, etc.).

## Reviewer Notes

<!-- Optional: anything you'd like reviewers to focus on, concerns, etc.
-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Refactor**
* Refined MOE (Mixture of Experts) routing infrastructure by extending
index mapping capabilities across multiple kernel implementations to
improve internal data flow consistency.

* **Tests**
* Strengthened accuracy validation thresholds from 0.925 to 0.97 with
adjusted error tolerance parameters, ensuring more rigorous testing of
MOE operations under FP4 quantization conditions.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
