Python: Workflows background run#4274

Open
TaoChenOSU wants to merge 6 commits into microsoft:main from TaoChenOSU:taochen/python-workflows-background-run

Conversation

Contributor

@TaoChenOSU TaoChenOSU commented Feb 25, 2026

Motivation and Context

Currently, we have a very limited way of handling workflow runs and responding to events. Users have to wait until a workflow converges to process events, such as requests.

Description

  1. Create a run mode where clients can start a workflow run in the background and poll events as they desire.
  2. Allow clients to respond to requests while the workflow is still running in the background.

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

@TaoChenOSU TaoChenOSU self-assigned this Feb 25, 2026
@TaoChenOSU TaoChenOSU added python workflows Related to Workflows in agent-framework labels Feb 25, 2026
@github-actions github-actions bot changed the title Workflows background run Python: Workflows background run Feb 25, 2026
@markwallace-microsoft markwallace-microsoft added the documentation Improvements or additions to documentation label Feb 25, 2026
@TaoChenOSU TaoChenOSU marked this pull request as ready for review February 25, 2026 21:38
Copilot AI review requested due to automatic review settings February 25, 2026 21:38
@TaoChenOSU TaoChenOSU moved this to In Progress in Agent Framework Feb 25, 2026
Contributor

Copilot AI left a comment

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

finally:
    await self._run_cleanup(checkpoint_storage)

async def _resume() -> asyncio.Task[None]:  # noqa: RUF029

Copilot AI Feb 25, 2026

The # noqa: RUF029 comment is unnecessary here. RUF029 warns about unused async functions, but _resume is clearly used on line 634 where it's passed to BackgroundRunHandle. This suppression should be removed.

Suggested change
- async def _resume() -> asyncio.Task[None]:  # noqa: RUF029
+ async def _resume() -> asyncio.Task[None]:

from ._events import WorkflowEvent
from ._runner_context import RunnerContext

logger = logging.getLogger(__name__)
Copilot AI Feb 25, 2026

The logger variable is imported but never used in this module. Consider removing the unused import or adding appropriate debug/error logging where it might be useful (e.g., in the respond method when resuming after idle).

Contributor

@moonbox3 moonbox3 left a comment

Some initial thoughts.

logger = logging.getLogger(__name__)


class BackgroundRunHandle:
Contributor

It feels like this class is essentially a wrapper around asyncio primitives: create_task and an asyncio.Queue. Why do we need to wrap these well-known constructs?
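For reference, a minimal sketch of the same idea built directly on the two primitives named above, `asyncio.create_task` and `asyncio.Queue`. The names `fake_workflow` and `consume` are hypothetical stand-ins and are not part of the PR or of agent-framework; the `None` sentinel is likewise an assumption made for the illustration.

```python
import asyncio


async def fake_workflow(events: asyncio.Queue) -> None:
    """Hypothetical stand-in for a workflow run that emits events."""
    for step in range(3):
        await asyncio.sleep(0)  # yield control, as real work would
        await events.put(f"event-{step}")
    await events.put(None)  # sentinel (assumption): no more events


async def consume() -> list[str]:
    events: asyncio.Queue = asyncio.Queue()
    # Start the run in the background; the caller keeps control.
    task = asyncio.create_task(fake_workflow(events))
    seen: list[str] = []
    # Awaiting the queue suspends until an event arrives, so no sleep
    # interval has to be chosen.
    while (event := await events.get()) is not None:
        seen.append(event)
    await task  # surface any exception raised by the background run
    return seen


seen_events = asyncio.run(consume())
```

In this sketch the consumer awaits `events.get()` rather than polling, which is one way the plain primitives could cover the background-run scenario without an extra handle type.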

Member

This also adds another concept that people have to learn. I do not think this is needed.

return response_stream
return response_stream.get_final_response()

def run_in_background(
Contributor

Do we genuinely need a polling-based consumption pattern? Are we getting feedback that this is missing today?

We're now pushing more concerns onto the caller. Every consumer has to:

  1. Write the poll loop
  2. Pick a sleep interval (and get it "wrong": too slow adds latency, too fast wastes cycles)
  3. Route events by type manually
  4. Track which request IDs map to which responses
  5. Remember to drain after idle
  6. Handle the resume-after-idle edge case
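A rough sketch of what the first five concerns listed above could look like in caller code. `ToyHandle` is a hypothetical stand-in written only for this illustration; it is not the PR's `BackgroundRunHandle`, and its `poll` method and event tuples are assumptions (only the names `is_idle` and `respond` appear in this conversation).

```python
import asyncio


class ToyHandle:
    """Hypothetical stand-in; NOT the PR's BackgroundRunHandle."""

    def __init__(self) -> None:
        self._events: asyncio.Queue = asyncio.Queue()
        self._responses: asyncio.Queue = asyncio.Queue()
        self._task = asyncio.create_task(self._run())

    async def _run(self) -> None:
        # Emit one request, wait for its answer, then emit one output.
        await self._events.put(("request", "req-1"))
        answer = await self._responses.get()
        await self._events.put(("output", f"got {answer}"))

    @property
    def is_idle(self) -> bool:
        return self._task.done()

    def poll(self) -> list:
        """Return all events queued so far without blocking (assumed API)."""
        drained = []
        while not self._events.empty():
            drained.append(self._events.get_nowait())
        return drained

    def respond(self, request_id: str, value: str) -> None:
        self._responses.put_nowait(value)


async def main() -> list[str]:
    handle = ToyHandle()
    outputs: list[str] = []
    while not handle.is_idle:                    # 1. the poll loop
        for kind, payload in handle.poll():
            if kind == "request":                # 3. route events by type
                handle.respond(payload, "yes")   # 4. match the request ID
            else:
                outputs.append(payload)
        await asyncio.sleep(0.01)                # 2. pick a sleep interval
    for kind, payload in handle.poll():          # 5. drain after idle
        if kind == "output":
            outputs.append(payload)
    return outputs


poll_outputs = asyncio.run(main())
```

In this toy run the only output is emitted just before the task finishes, so it is picked up by the final drain rather than by the loop; dropping the drain step would silently lose it.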

Member

I very much agree; this is not needed and leads to un-Pythonic code.

# Single poll loop: process all events and respond to requests inline.
# The workflow continues running in the background while we process events.
outputs: list[str] = []
while not handle.is_idle:
Member

I see almost no value in this, because it forces people to understand the whole idle/poll model. A user having to write asyncio.sleep also does not seem right. A much simpler pattern for this scenario is something like a callback for when a response is required; that is a well-known concept in Python and doesn't require as much complexity as this. When I compare these samples to the existing samples that just use streaming, streaming is enough: it already allows you to do other stuff in the meantime. If you really need it, a user can call that in a separate thread, and then you have this with Python primitives instead of another new thing.
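A minimal sketch of the callback pattern suggested above, using only the standard library. `run_workflow`, `on_request`, and `approve` are hypothetical names invented for this illustration; they are not an existing agent-framework API.

```python
import asyncio
from typing import Awaitable, Callable

# Assumed callback shape: takes the request prompt, returns the response.
RequestHandler = Callable[[str], Awaitable[str]]


async def run_workflow(on_request: RequestHandler) -> list[str]:
    """Hypothetical workflow that needs one external answer mid-run."""
    outputs = ["started"]
    # The workflow calls back when it needs input; the caller writes no
    # poll loop and picks no sleep interval.
    answer = await on_request("approve deployment?")
    outputs.append(f"answer: {answer}")
    return outputs


async def approve(prompt: str) -> str:
    """Hypothetical handler; a real one might ask a human or a service."""
    await asyncio.sleep(0)
    return "yes"


callback_outputs = asyncio.run(run_workflow(on_request=approve))
```

Under these assumptions the request/response pairing is implicit in the call itself, so the caller does not have to track request IDs.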

Labels

documentation Improvements or additions to documentation python workflows Related to Workflows in agent-framework

Projects

Status: In Progress

5 participants