Revert "[PD] prepare request in prefill instance by multi threads" by Jiang-Jia-Jun · Pull Request #7821 · PaddlePaddle/FastDeploy

Jiang-Jia-Jun · 2026-05-14T11:56:26Z

Reverts #7724

This reverts commit 4e7a46e.

paddle-bot · 2026-05-14T11:56:32Z

Thanks for your contribution!

Copilot

Pull request overview

本 PR 回滚了 #7724（"prepare request in prefill instance by multi threads"），将相关功能恢复到此前实现：移除 EngineServicePrepareMixin 抽象、删除 FD_PREFILL_PREPARE_REQ_THREAD_NUM 环境变量，并恢复使用 ThreadPoolExecutor(max_workers=1) 的单线程 _fetch_request 路径。同时恢复 engine_worker_queue 中 finished_add_cache_task 系列队列/锁/flag 与 cache_messager 中 put_finished_add_cache_task_req、_maybe_wait_for_cache_task 移除等行为，并同步更新单元测试。

Changes:

删除 fastdeploy/engine/common_engine_prepare_mixin.py，将单线程 fetch 流程内联回 EngineService._schedule_request_to_worker_v1。
在 engine_worker_queue.py 中重新引入 finished_add_cache_task_* 相关共享资源和 put/get_finished_add_cache_task_req 接口；cache_messager 恢复以 barrier + 队列方式上报已添加的 cache task。
移除 FD_PREFILL_PREPARE_REQ_THREAD_NUM 环境变量，并同步调整 tests/engine/test_common_engine.py、tests/inter_communicator/test_e2w_queue.py、tests/cache_manager/test_cache_messager.py。

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
fastdeploy/engine/common_engine.py	移除 mixin 继承，将 `_fetch_request` 内联回 `_schedule_request_to_worker_v1`，恢复单 `ThreadPoolExecutor` 方式
fastdeploy/engine/common_engine_prepare_mixin.py	整体删除该文件（多线程 prepare 实现回滚）
fastdeploy/envs.py	删除 `FD_PREFILL_PREPARE_REQ_THREAD_NUM` 环境变量
fastdeploy/inter_communicator/engine_worker_queue.py	恢复 finished_add_cache_task 队列/锁/flag/barrier 及 put/get 接口
fastdeploy/cache_manager/cache_messager.py	恢复在 add_cache_task 后通过 `put_finished_add_cache_task_req` 通知，并删除 `_maybe_wait_for_cache_task`
tests/engine/test_common_engine.py	测试恢复对内联 `_schedule_request_to_worker_v1` 与 ThreadPoolExecutor 的 patch 路径
tests/inter_communicator/test_e2w_queue.py	增加 `finished_add_cache_task` 队列流程的测试用例
tests/cache_manager/test_cache_messager.py	移除按文件路径动态导入逻辑，断言新增 finished_add_cache_task 上报行为

PaddlePaddle-bot · 2026-05-14T12:09:00Z

🤖 Paddle-CI-Agent | ci_status_monitor | 2026-05-14 20:07:08

CI报告基于以下代码生成（30分钟更新一次）:

PR commit: cb68177
Merge base: 18cab83 (branch: release/2.6)
查看完整 Diff
CI 详情

1 任务总览

⚠️ 1 个 required 任务失败，6 个 required 任务运行中，请优先处理 required 失败任务。

总执行（rerun次数）	总任务	✅ 通过	❌ 失败	⏳ 运行中	⏸️ 等待中	跳过
34(0)	34	23	2	8	1	0

2 任务状态汇总

2.1 Required任务 : 1/8 通过

必选任务阻塞合并，失败需优先处理。

状态	任务	耗时	根因	修复建议	日志	重跑
❌	`Approval`	9s	PR问题：修改envs.py缺少RD审批，未满足Cherry-Pick规范	请RD（jiangjiajun等）Approve，确认Cherry-Pick规范	Job	-
⏳	`Extracted partial CE model tasks / run_ce_cases`	-	运行中	-	Job	-
⏳	`Run Base Tests / base_tests`	-	运行中	-	Job	-
⏳	`Run FastDeploy LogProb Tests / run_tests_logprob`	-	运行中	-	Job	-
⏳	`Run FastDeploy Unit Tests and Coverage / run_tests_with_coverage`	-	运行中	-	Job	-
⏳	`Run Four Cards Tests / run_4_cards_tests`	-	运行中	-	Job	-
⏳	`Run Stable Tests / stable_tests`	-	运行中	-	Job	-
✅	其余 1 个必选任务通过	-	-	-	-	-

2.2 可选任务 — 22/26 通过

可选任务不阻塞合并，失败仅供参考。

状态	任务	耗时	日志	重跑
❌	`Check PR Template`	16s	Job	-
⏳	`xpu_build_test / xpu-build-test`	-	Job	-
⏳	`Run iluvatar Tests / run_iluvatar_cases`	-	Job	-
⏸️	`CI_HPU`	-	-	-
✅	其余 22 个可选任务通过	-	-	-

3 失败详情（仅 required）

Approval — 代码规范/审批检查（置信度: 高）

Approval

状态: ❌ 失败
错误类型: 代码规范（PR审批检查）
置信度: 高
根因摘要: PR修改了envs.py但缺少RD审批，且未满足Cherry-Pick规范
分析器: 通用分析(fallback)

根因详情:
审批检查脚本 scripts/check_approval.sh 报告了2个审批错误：①PR修改了 fastdeploy/envs.py，需要FastDeploy RD（jiangjiajun/liuyuanle/chenjian26/wanglongzhi）中至少一人进行Review并Approve；②提交到 release/2.6 分支的PR涉及Cherry-Pick规范，标题须包含 [Cherry-Pick] 字样及 develop 原始PR号，并需指定RD审批。本PR为Revert类型，请与相关RD确认是否适用该规则。

关键日志:

0. You must have one FastDeploy RD (Jiang-Jia-Jun(jiangjiajun), yuanlehome(liuyuanle),
   rainyfly(chenjian26), Wanglongzhi2001(wanglongzhi)) approval for modifying [fastdeploy/envs.py].
1. Cherry-Pick PR must come from develop and the title must contain [Cherry-Pick]
   and the original develop PR number. Approval required from: qingqing01, jiangjiajun, dengkaipeng.
There are 2 approved errors.
##[error]Process completed with exit code 6.

修复建议:

请FastDeploy RD成员（jiangjiajun / liuyuanle / chenjian26 / wanglongzhi 之一）对PR进行Review并点击 Approve
与RD确认此Revert PR是否需满足Cherry-Pick规范；若适用，需在标题中补充 [Cherry-Pick] 及对应develop PR号

修复建议摘要: 请RD成员（jiangjiajun等）Approve，确认Cherry-Pick规范

关联变更: Revert PR中包含 fastdeploy/envs.py 的变更，触发了RD审批规则
链接: 查看日志

PaddlePaddle-bot

🤖 Paddle-CI-Agent | pr_review | 2026-05-14 20:17:31

📋 Review 摘要

PR 概述：回滚 PR #7724（PD 分离场景下 prefill 实例多线程准备请求），改为在 _schedule_request_to_worker_v1 内用 ThreadPoolExecutor(max_workers=1) 内联方式执行，并以 barrier 替代原有的自旋等待。
变更范围：fastdeploy/engine/、fastdeploy/cache_manager/、fastdeploy/inter_communicator/
影响面 Tag：[PD Disaggregation] [Engine] [KVCache]

📝 PR 规范检查

PR 标题缺少官方 Tag，且目标分支为 release/2.6 应使用 Cherry-Pick 格式；PR 描述仅有一行 "Reverts #7724"，缺少所有必填 section。

标题建议（可直接复制）：

[Cherry-Pick][PD Disaggregation] Revert "prepare request in prefill instance by multi threads"(#7724)

PR 描述建议（可直接复制，必须复刻 checklist §D2 模板的完整结构）：

## Motivation
Revert PR #7724 which introduced multi-threaded request preparation in the prefill instance. The original multi-thread approach caused stability issues that require rollback. This PR reverts to a `ThreadPoolExecutor(max_workers=1)` design while retaining barrier-based synchronization between the cache messager and engine.

## Modifications
- `fastdeploy/engine/common_engine.py`: Remove separate `_prepare_request_v1` thread; merge fetch-request logic inline into `_schedule_request_to_worker_v1` using `ThreadPoolExecutor(max_workers=1)`; remove `EngineServicePrepareMixin` dependency
- `fastdeploy/engine/common_engine_prepare_mixin.py`: Delete entire mixin file (282 lines)
- `fastdeploy/cache_manager/cache_messager.py`: Remove `_maybe_wait_for_cache_task` spin-wait; replace with `finish_add_cache_task_barrier`-based synchronization
- `fastdeploy/inter_communicator/engine_worker_queue.py`: Add `finished_add_cache_task_list`, related barriers/locks/flags; add `put_finished_add_cache_task_req` / `get_finished_add_cache_task_req` methods
- `fastdeploy/envs.py`: Remove `FD_PREFILL_PREPARE_REQ_THREAD_NUM` env variable
- Tests updated accordingly

## Usage or Command
N/A

## Accuracy Tests
N/A

## Checklist

- [x] Add at least a tag in the PR title.
  - Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
  - You can add new tags based on the PR content, but the semantics must be clear.
- [ ] Format your code, run `pre-commit` before commit.
- [ ] Add unit tests. Please write the reason in this PR if no unit tests.
- [ ] Provide accuracy results.
- [x] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.

问题

级别	文件	概述
📝 PR 规范	PR 标题 / 描述	标题缺少 Tag，描述缺少所有必填 section
❓ 疑问	`engine_worker_queue.py:759`	`put_finished_add_cache_task_req` 返回类型注解 `-> None` 但实际返回 `bool`
❓ 疑问	`engine_worker_queue.py:779`	`get_finished_add_cache_task_req` 返回类型注解 `-> str` 但实际返回 `list`
🟡 建议	`common_engine.py:1008`	`need_check_req_ids` 在 L1005 赋值后立刻被 L1008 覆盖，存在冗余赋值

总体评价

逻辑结构清晰，用 barrier 替换自旋等待的思路合理，测试也做了相应更新。除 PR 规范问题外，发现两处类型注解错误和一处冗余赋值，建议修复后合入。

PaddlePaddle-bot · 2026-05-14T12:20:25Z

        self.finish_send_cache_lock.release()
        return response

+    def put_finished_add_cache_task_req(self, req_ids) -> None:


❓ 疑问 返回类型注解 -> None 与实际返回值不符。

该方法末尾 return all_client_put，实际返回 bool，但注解写的是 -> None，docstring 也未描述返回值。

建议修复：

def put_finished_add_cache_task_req(self, req_ids) -> bool: """ Put finished request ID into the queue. Args: req_ids: Request ID to be added to the queue Returns: bool: Whether all clients have put their finished requests """

PaddlePaddle-bot · 2026-05-14T12:20:25Z

+        self.finish_add_cache_task_lock.release()
+        return all_client_put
+
+    def get_finished_add_cache_task_req(self) -> str:


❓ 疑问 返回类型注解 -> str 与实际返回值不符。

该方法实际返回 list（初始化为 response = []，从列表中获取后也是列表类型），docstring 描述的也是 str: Finished request ID，三者不一致。

建议修复：

def get_finished_add_cache_task_req(self) -> list: """ Get finished request IDs from the queue. Returns: list: List of finished request IDs """

PaddlePaddle-bot · 2026-05-14T12:20:25Z

+                        need_check_req_ids = [task.request_id for task in tasks]
+                        self.split_connector.send_cache_info_to_messager(tasks, 0)
+                        # ensure cache tasks has sent to cache_messager
+                        need_check_req_ids = [task.request_id for task in tasks]


🟡 建议 need_check_req_ids 被连续赋值两次，第一次（L1005）赋值后立刻被第二次（L1008）覆盖，第一次赋值无意义。

# L1005: 此行赋值被 L1008 立刻覆盖，属于冗余代码 need_check_req_ids = [task.request_id for task in tasks] self.split_connector.send_cache_info_to_messager(tasks, 0) # ensure cache tasks has sent to cache_messager need_check_req_ids = [task.request_id for task in tasks] # <-- 实际生效的赋值

建议删除 L1005 的赋值，仅保留 L1008 处的赋值。

Revert "prepare request in prefill instance by multi threads (#7724)"

cb68177

This reverts commit 4e7a46e.

Copilot AI review requested due to automatic review settings May 14, 2026 11:56

Jiang-Jia-Jun had a problem deploying to Metax_ci May 14, 2026 11:56 — with GitHub Actions Failure

Copilot started reviewing on behalf of Jiang-Jia-Jun May 14, 2026 11:56 View session

Copilot AI reviewed May 14, 2026

View reviewed changes

PaddlePaddle-bot reviewed May 14, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Revert "[PD] prepare request in prefill instance by multi threads"#7821

Revert "[PD] prepare request in prefill instance by multi threads"#7821
Jiang-Jia-Jun wants to merge 1 commit into
release/2.6from
revert-7724-refine-pd-fetch-req

Jiang-Jia-Jun commented May 14, 2026

Uh oh!

paddle-bot Bot commented May 14, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

PaddlePaddle-bot commented May 14, 2026

Approval

Uh oh!

PaddlePaddle-bot left a comment

Uh oh!

PaddlePaddle-bot May 14, 2026

Uh oh!

PaddlePaddle-bot May 14, 2026

Uh oh!

PaddlePaddle-bot May 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Jiang-Jia-Jun commented May 14, 2026

Uh oh!

paddle-bot Bot commented May 14, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

PaddlePaddle-bot commented May 14, 2026

1 任务总览

2 任务状态汇总

2.1 Required任务 : 1/8 通过

2.2 可选任务 — 22/26 通过

3 失败详情（仅 required）

Approval

Uh oh!

PaddlePaddle-bot left a comment

Choose a reason for hiding this comment

📋 Review 摘要

📝 PR 规范检查

问题

总体评价

Uh oh!

PaddlePaddle-bot May 14, 2026

Choose a reason for hiding this comment

Uh oh!

PaddlePaddle-bot May 14, 2026

Choose a reason for hiding this comment

Uh oh!

PaddlePaddle-bot May 14, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants