Skip to content

fix: resolve face enrollment freeze issue#45

Merged
deepin-bot[bot] merged 1 commit into
linuxdeepin:masterfrom
fly602:master
May 14, 2026
Merged

fix: resolve face enrollment freeze issue#45
deepin-bot[bot] merged 1 commit into
linuxdeepin:masterfrom
fly602:master

Conversation

@fly602
Copy link
Copy Markdown
Contributor

@fly602 fly602 commented May 14, 2026

  1. Set m_stopCapture flag to true before invoking Stop method to signal capture loop exit
  2. Convert m_stopCapture from bool to std::atomic for thread-safe cross- thread access
  3. Add retry logic for socket write operations to handle EAGAIN/ EWOULDBLOCK errors
  4. Implement maximum 100 retries with 1ms delay to prevent infinite loop in sendCapture

Influence:

  1. Test face enrollment stop operation to verify no freeze occurs
  2. Test concurrent stop request during active image capture
  3. Verify socket write handles temporary unavailability gracefully
  4. Test retry mechanism reaches maximum limit and breaks loop correctly
  5. Verify normal face enrollment completion still works correctly
  6. Test multiple rapid start/stop enrollment cycles

fix: 解决录入人脸时卡死的问题

  1. 在调用 Stop 方法前设置 m_stopCapture 标志为 true 以通知捕获循环退出
  2. 将 m_stopCapture 从 bool 转换为 std::atomic 实现跨线程安全访问
  3. 为 socket 写操作添加重试逻辑以处理 EAGAIN/EWOULDBLOCK 错误
  4. 实现最大 100 次重试和 1ms 延迟以防止 sendCapture 中出现死循环

Influence:

  1. 测试人脸录入停止操作验证不再发生卡死
  2. 测试活动图像捕获期间的并发停止请求
  3. 验证 socket 写操作能优雅处理临时不可用情况
  4. 测试重试机制达到最大限制时正确跳出循环
  5. 验证正常人脸录入完成流程仍然正常工作
  6. 测试多次快速开始/停止录入循环

PMS: BUG-352653
Change-Id: Ice3f9e91fc29f36aa4315da9d4b6b049d6b190fe

Summary by Sourcery

Fix face enrollment freezes by improving capture stop signaling and making socket writes more robust against transient errors.

Bug Fixes:

  • Prevent capture loop from hanging by setting a thread-safe stop flag before stopping the enrollment thread.
  • Avoid potential infinite loops on socket write by adding bounded retry logic with a short delay for temporary EAGAIN/EWOULDBLOCK errors.

@sourcery-ai
Copy link
Copy Markdown

sourcery-ai Bot commented May 14, 2026

Reviewer's guide (collapsed on small PRs)

Reviewer's Guide

This PR fixes a face enrollment freeze by making the capture stop flag thread-safe, setting it before the blocking stop call, and hardening the socket send loop with bounded retry handling for transient EAGAIN/EWOULDBLOCK errors.

Sequence diagram for face enrollment stop and capture loop interaction

sequenceDiagram
    participant DriverManger
    participant ErollThread
    participant Socket

    DriverManger->>ErollThread: enrollStop(actionId, errMsgInfo)
    DriverManger->>ErollThread: m_stopCapture = true
    DriverManger->>ErollThread: Stop() (Qt::BlockingQueuedConnection)

    loop sendCapture loop
        ErollThread->>Socket: write(m_fileSocket, buf, countSize)
        alt sendSize < 0 and errno is EAGAIN or EWOULDBLOCK
            ErollThread->>ErollThread: retryCount++
            alt retryCount > 100
                ErollThread->>ErollThread: break loop
            else retry limit not reached
                ErollThread->>ErollThread: usleep(1000)
            end
        else sendSize >= 0
            ErollThread->>ErollThread: retryCount = 0
            ErollThread->>ErollThread: countSize -= sendSize
        end
        alt m_stopCapture is true
            ErollThread->>ErollThread: exit capture loop
        end
    end
Loading

File-Level Changes

Change Details Files
Harden send loop to avoid infinite blocking on socket write when the peer is temporarily unavailable.
  • Introduce a retry counter for socket writes in the capture sending loop
  • On EAGAIN/EWOULDBLOCK, increment retry counter, sleep 1ms, and abort after more than 100 consecutive retries
  • Reset retry counter after any successful send before updating remaining byte count
workmodule.cpp
Make the capture stop flag thread-safe and ensure it is set before invoking the blocking stop operation.
  • Convert the capture stop flag from a plain bool to std::atomic and move it into the public section of the enrollment thread class for cross-thread access
  • Set the atomic stop flag from DriverManger::enrollStop before invoking the Stop slot via a blocking queued connection
workmodule.h
drivermanger.cpp
Update SPDX copyright headers to reflect an extended year range.
  • Change SPDX-FileCopyrightText from 2022 to 2022 - 2026 in modified source files
workmodule.h
drivermanger.cpp

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • The m_stopCapture flag is now a public std::atomic<bool>; consider keeping it private and exposing a small setter/getter (or a dedicated requestStop() method) to avoid external code directly mutating thread-control state.
  • In sendCapture, non-EAGAIN/EWOULDBLOCK errors from write still result in a tight loop with no backoff or break condition; consider breaking out or logging on other errors to avoid potential busy-waiting on persistent failures.
  • When the retry count exceeds 100 in sendCapture, the loop breaks silently; consider at least logging this condition or propagating an error so callers can distinguish between a normal stop and a send failure.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `m_stopCapture` flag is now a public `std::atomic<bool>`; consider keeping it private and exposing a small setter/getter (or a dedicated `requestStop()` method) to avoid external code directly mutating thread-control state.
- In `sendCapture`, non-EAGAIN/EWOULDBLOCK errors from `write` still result in a tight loop with no backoff or break condition; consider breaking out or logging on other errors to avoid potential busy-waiting on persistent failures.
- When the retry count exceeds 100 in `sendCapture`, the loop breaks silently; consider at least logging this condition or propagating an error so callers can distinguish between a normal stop and a send failure.

## Individual Comments

### Comment 1
<location path="workmodule.cpp" line_range="117-123" />
<code_context>
+    int retryCount = 0;
     while (countSize > 0 && !m_stopCapture) {
         long sendSize = write(m_fileSocket, &buf[size - countSize], static_cast<size_t>(countSize));
         if (sendSize < 0) {
+            if (errno == EAGAIN || errno == EWOULDBLOCK) {
+                retryCount++;
+                if (retryCount > 100) {
+                    break;
+                }
+                usleep(1000);
+            }
             continue;
</code_context>
<issue_to_address>
**issue (bug_risk):** Non-EAGAIN/EWOULDBLOCK errors from `write` are ignored and may cause a busy loop.

If `write` returns `< 0` with an error other than `EAGAIN`/`EWOULDBLOCK`, we still hit the unconditional `continue` and immediately retry, so permanent errors like `EPIPE`, `ECONNRESET`, or `EBADF` will cause a tight loop until `m_stopCapture` changes. Consider detecting non-retryable errors, breaking out of the loop (and optionally logging) instead of repeatedly calling a failing `write`.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread workmodule.cpp Outdated
1. Set m_stopCapture flag to true before invoking Stop method to signal
capture loop exit
2. Convert m_stopCapture from bool to std::atomic for thread-safe cross-
thread access
3. Add retry logic for socket write operations to handle EAGAIN/
EWOULDBLOCK errors
4. Implement maximum 100 retries with 1ms delay to prevent infinite loop
in sendCapture

Influence:
1. Test face enrollment stop operation to verify no freeze occurs
2. Test concurrent stop request during active image capture
3. Verify socket write handles temporary unavailability gracefully
4. Test retry mechanism reaches maximum limit and breaks loop correctly
5. Verify normal face enrollment completion still works correctly
6. Test multiple rapid start/stop enrollment cycles

fix: 解决录入人脸时卡死的问题

1. 在调用 Stop 方法前设置 m_stopCapture 标志为 true 以通知捕获循环退出
2. 将 m_stopCapture 从 bool 转换为 std::atomic 实现跨线程安全访问
3. 为 socket 写操作添加重试逻辑以处理 EAGAIN/EWOULDBLOCK 错误
4. 实现最大 100 次重试和 1ms 延迟以防止 sendCapture 中出现死循环

Influence:
1. 测试人脸录入停止操作验证不再发生卡死
2. 测试活动图像捕获期间的并发停止请求
3. 验证 socket 写操作能优雅处理临时不可用情况
4. 测试重试机制达到最大限制时正确跳出循环
5. 验证正常人脸录入完成流程仍然正常工作
6. 测试多次快速开始/停止录入循环

PMS: BUG-352653
Change-Id: Ice3f9e91fc29f36aa4315da9d4b6b049d6b190fe
@deepin-ci-robot
Copy link
Copy Markdown

deepin pr auto review

你好!我是CodeGeeX,你的智能编程助手。我已经仔细审查了你提供的 git diff

本次代码变更主要解决了网络/套接字写入时的阻塞或死循环问题,并引入了 std::atomic<bool> 来实现线程间的安全通信。整体思路是正确的,但在语法逻辑、代码质量、代码性能和代码安全方面,还有一些值得注意和改进的地方。

以下是详细的审查意见:

1. 语法与逻辑

  • std::atomic 缺少初始化
    workmodule.h 中,std::atomic<bool> m_stopCapture; 没有给出默认值。在 C++ 中,std::atomic 的默认初始化是不确定的(非静态存储期),这可能导致 m_stopCapture 初始状态随机,从而引发偶发的逻辑错误。
    • 改进意见:应显式初始化为 false,即 std::atomic<bool> m_stopCapture{false};
  • write 返回值的类型不匹配
    workmodule.cpp 中,write 的返回值被赋给了 long sendSize。在 Linux/POSIX 系统中,write 返回 ssize_t,而在 Windows 中返回 int。使用 long 可能会在 32 位系统或跨平台时产生截断或符号问题。
    • 改进意见:建议使用 ssize_t sendSize = write(...)
  • EAGAIN 重试逻辑中的 countSize 更新问题
    write 返回 -1errno == EAGAIN 时,代码执行 continue,此时 countSize 没有更新,循环会再次尝试写入相同的数据,这是正确的。但是,当 write 成功写入部分数据时,countSize -= ...,此时 retryCount 被重置为 0。这意味着如果每次写入 1 字节后下一次返回 EAGAIN,重试次数永远不会达到 100,这可能导致在极端网络拥塞时死循环。
    • 改进意见:考虑是否应该在 EAGAIN 时只做纯粹的“重试次数”累加,而不在部分写入时清零;或者使用总超时时间来控制重试。

2. 代码质量

  • 暴露 public 成员破坏了封装性
    为了在 DriverManger 中能够设置 m_stopCapture,你将其访问权限从 private 提升到了 public。这破坏了面向对象的封装原则,使得任何外部类都可以随意修改此状态。
    • 改进意见:提供一个 public 的停止方法(如 requestStop()),将 m_stopCapture 保留为 private
  • 硬编码的魔术数字
    retryCount > 100 中的 100 是硬编码的魔术数字,且配合 QThread::msleep(1),这实际上构成了一个最大 100ms 的忙等待/休眠混合循环。代码可读性差,且重试策略不够灵活。
    • 改进意见:使用常量定义,如 static constexpr int MAX_RETRY_COUNT = 100;
  • QThread::msleep(1) 的精度与性能问题
    操作系统的定时器精度通常在 1~15ms 左右(取决于系统调度),msleep(1) 实际休眠时间往往远大于 1ms。如果对延迟敏感,这种休眠会导致性能下降或卡顿;如果对延迟不敏感,100次重试可能不够。
    • 改进意见:建议使用总超时时间(如 QDeadlineTimer 或记录起始时间)来控制重试,而不是简单的计数。

3. 代码性能

  • Qt::BlockingQueuedConnection 导致的阻塞
    drivermanger.cpp 中:
    m_spErollthread->m_stopCapture = true;
    QMetaObject::invokeMethod(m_spErollthread.data(), "Stop", Qt::BlockingQueuedConnection);
    BlockingQueuedConnection 会导致调用线程(可能是主线程/UI线程)阻塞,直到 ErollThreadStop 方法执行完毕。如果 Stop 方法内部需要等待 sendCapture 中的 write 循环退出,而该循环正在经历 EAGAIN 的休眠重试,主线程将被冻结数百毫秒,导致界面卡顿。
    • 改进意见:如果不需要同步等待停止完成,使用 Qt::QueuedConnection 异步调用即可;如果必须等待,确保 write 循环能够通过 m_stopCapture 迅速退出,不要在 EAGAIN 时长时间休眠。

4. 代码安全

  • 未检查 m_fileSocket 的有效性
    write(m_fileSocket, ...) 之前,没有看到对 m_fileSocket 有效性的检查。如果 m_fileSocket-1 或无效描述符,write 会立即失败并设置 errnoEBADF
    • 改进意见:在进入 while 循环前,增加 if (m_fileSocket < 0) break; 的防御性编程。
  • 信号中断(EINTR)未处理
    在 Linux 系统中,write 系统调用在接收到信号时会被中断,并设置 errnoEINTR。当前的代码只处理了 EAGAINEWOULDBLOCK,遇到 EINTR 时会直接 break 并打印错误,这导致数据传输不完整。
    • 改进意见:应该将 EINTR 视为可恢复错误,继续重试写入。

💡 综合改进建议代码

基于以上分析,我为你重构了相关代码:

1. workmodule.h

// ... 头文件保持不变 ...
#include <atomic>

class ErollThread : public QObject {
    // ...
private Q_SLOTS:
    void captureError(int err, QImageCapture::Error, const QString &errorString);
    void processCapturedImage(int id, const QImage &preview);

public:
    // 提供安全的停止接口,保持封装性
    void requestStop() { m_stopCapture.store(true); }

private:
    QScopedPointer<QCamera> m_camera;
    QScopedPointer<QImageCapture> m_imageCapture;
    QString m_actionId;
    int m_fileSocket;
    bool m_bFirst;
    
    // 使用 atomic 并显式初始化,设为 private
    std::atomic<bool> m_stopCapture{false}; 
    
    bool m_checkDone;
};

2. workmodule.cpp (sendCapture 函数的 while 循环部分)

    // ... 前置代码 ...
    if (m_fileSocket < 0) {
        qWarning() << "Invalid socket descriptor";
        free(buf);
        return;
    }

    unsigned long countSize = size;
    // 建议使用超时机制替代简单的重试次数
    QElapsedTimer timer;
    timer.start();
    const qint64 MAX_RETRY_TIMEOUT_MS = 100; // 最大重试时间 100ms

    while (countSize > 0 && !m_stopCapture.load()) {
        ssize_t sendSize = write(m_fileSocket, &buf[size - countSize], static_cast<size_t>(countSize));
        
        if (sendSize < 0) {
            int err = errno;
            if (err == EINTR) {
                // 被信号中断,直接重试
                continue;
            } else if (err == EAGAIN || err == EWOULDBLOCK) {
                // 非阻塞模式下资源暂不可用,检查超时
                if (timer.elapsed() > MAX_RETRY_TIMEOUT_MS) {
                    qWarning() << "Write timeout (EAGAIN) after" << MAX_RETRY_TIMEOUT_MS << "ms";
                    break;
                }
                QThread::msleep(1); // 也可以使用更轻量的 QThread::yieldCurrentThread()
                continue;
            } else {
                // 其他不可恢复的错误(如 EPIPE, EBADF 等)
                qWarning() << "write error:" << strerror(err);
                break;
            }
        }
        
        // 写入成功,更新剩余大小
        countSize -= static_cast<unsigned long>(sendSize);
    }
    free(buf);
    // ... 后置代码 ...

3. drivermanger.cpp

    qDebug() << "start Erollthread stop";
    // 使用封装好的接口,而不是直接暴露变量
    m_spErollthread->requestStop();
    
    // 评估是否真的需要 BlockingQueuedConnection
    // 如果不需要阻塞当前线程,建议改为 Qt::QueuedConnection
    QMetaObject::invokeMethod(m_spErollthread.data(), "Stop",
                              Qt::QueuedConnection); 
    m_actionMap.remove(actionId);

这些改进将使你的代码更加健壮、安全且易于维护,特别是在处理复杂的套接字 I/O 和多线程交互时。如果有任何疑问,欢迎随时提问!

Comment thread workmodule.cpp Outdated
@deepin-ci-robot
Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: fly602, robertkill

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@fly602
Copy link
Copy Markdown
Contributor Author

fly602 commented May 14, 2026

/forcemerge

@deepin-bot
Copy link
Copy Markdown
Contributor

deepin-bot Bot commented May 14, 2026

This pr force merged! (status: unstable)

@deepin-bot deepin-bot Bot merged commit d8ee22b into linuxdeepin:master May 14, 2026
9 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants