Skip to content

Python: Fix response_format resolution in streaming path to respect default_options#4307

Closed
moonbox3 wants to merge 1 commit intomicrosoft:mainfrom
moonbox3:agent/fix-4300-2
Closed

Python: Fix response_format resolution in streaming path to respect default_options#4307
moonbox3 wants to merge 1 commit intomicrosoft:mainfrom
moonbox3:agent/fix-4300-2

Conversation

@moonbox3
Copy link
Contributor

Motivation and Context

When using Agent.run(stream=True), the finalizer that parses the streamed response into a structured value only checked the per-call options for response_format, completely ignoring default_options. This meant agents configured with a default response_format (e.g., a Pydantic model) would never parse the final streamed value, even though non-streaming runs worked correctly.

Fixes #4300

Description

The root cause was that the streaming finalizer was constructed via partial() at stream-creation time using only the per-call options dict, before the merged context (which includes default_options) was available. The fix replaces the eager partial with a closure (_finalize_with_merged_options) that defers reading response_format until finalization time, when the merged context (ctx) is populated. It first checks the merged chat_options from the context and falls back to the per-call options, matching the resolution order used in the non-streaming path. Two new tests verify that both streaming and non-streaming runs correctly parse structured output when response_format is set via default_options.

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Note: PR autogenerated by moonbox3's agent

…icrosoft#4300)

The streaming path in Agent.run(stream=True) used the raw options kwarg
instead of the merged chat_options (which includes default_options) when
building the finalizer for response_format parsing. This caused
AgentResponse.value to be None when response_format was set via
default_options.

Fix by reading response_format from the merged ctx['chat_options']
(populated by _prepare_run_context) instead of the raw options parameter.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 26, 2026 10:47
Copy link
Contributor Author

@moonbox3 moonbox3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 3 | Confidence: 80%

✗ Correctness

The new _get_merged_response_format helper has a short-circuit return when ctx is not None: it returns ctx["chat_options"].get("response_format") immediately, even if that value is None. This means that when a context exists but does not carry response_format (e.g. the caller set response_format only via default_options / options and the context was created without it), the fallback to options.get("response_format") is skipped and None is returned. This defeats the stated "merged" intent and can silently drop a user-provided response_format. Additionally, ctx["chat_options"] uses bracket access with no guard, so a KeyError will be raised if chat_options is ever absent from the context dict.

✗ Security Reliability

The change introduces a merged response_format lookup that prefers the context over caller-supplied options. The main reliability concern is that ctx["chat_options"] uses bracket-key access, which will raise an unhandled KeyError if the context dict exists but lacks a chat_options key—crashing mid-stream with no recovery. Otherwise the change is straightforward and the test coverage is adequate.

✗ Test Coverage

The two new tests cover the happy path for both streaming and non-streaming runs when default_options supplies response_format. However, the core behavioural change in _get_merged_response_format() introduces a priority rule (ctx chat_options wins over options) and a subtle edge case (when ctx exists but has no response_format, the function returns None instead of falling back to options). Neither of these scenarios has a corresponding test, leaving the most interesting merge/priority logic unverified.

Blocking Issues

  • _get_merged_response_format returns early with None when ctx is present but ctx["chat_options"] has no response_format key, skipping the fallback to options. This can silently discard a user-supplied response_format.
  • _get_merged_response_format accesses ctx["chat_options"] with bracket notation; if ctx is non-None but missing the "chat_options" key, an unhandled KeyError will propagate and crash the response stream.
  • No test covers the priority/override scenario where both ctx["chat_options"] and options supply a response_format. This is the key new behaviour introduced by the change and must be verified.
  • No test covers the edge case where ctx is not None but ctx["chat_options"] does not contain response_format while options does. In the current implementation this silently returns None instead of falling back to options, which may be a bug — either way the behaviour should be pinned by a test.

Suggestions

  • Consider using ctx.get("chat_options", {}) instead of ctx["chat_options"] to avoid a potential KeyError if the context dict ever lacks that key.
  • Use defensive dict access (e.g., ctx.get("chat_options", {}).get("response_format")) so a missing key falls through gracefully instead of raising.
  • Consider adding a test where run() is called with an explicit options={"response_format": A} while default_options={"response_format": B} to assert which one wins.
  • Consider adding a test where default_options is set (so ctx is populated) but does not include response_format, while options passed to the streaming path does include one, to verify whether the fallback is intentional.

Automated review by moonbox3's agents

@markwallace-microsoft
Copy link
Member

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/core/agent_framework
   _agents.py3384486%425, 429, 481, 846, 882, 898, 942, 981–984, 1045–1047, 1168, 1184, 1186, 1199, 1205, 1241, 1243, 1252–1257, 1262, 1264, 1270–1271, 1278, 1280–1281, 1289–1290, 1293–1295, 1303–1304, 1306, 1311, 1313
TOTAL22181276387% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
4674 247 💤 0 ❌ 0 🔥 1m 20s ⏱️

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a bug where Agent.run(stream=True) ignored default_options["response_format"] when parsing the final streamed response into structured output. The root cause was that the streaming finalizer was constructed using only per-call options before the merged context (which includes default_options) was available. The fix defers reading response_format until finalization time by replacing the eager partial() with a closure that accesses the merged chat_options from the context.

Changes:

  • Fixed streaming agent finalizer to respect default_options response_format by deferring resolution until merged context is available
  • Added regression tests for both streaming and non-streaming paths with default response_format

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
python/packages/core/agent_framework/_agents.py Added helper functions to defer response_format resolution until merged context is available; replaced eager partial() with closure-based finalizer
python/packages/core/tests/core/test_agents.py Added regression tests verifying both streaming and non-streaming runs correctly parse structured output when response_format is set via default_options

@moonbox3
Copy link
Contributor Author

Duplicate of #4291, closing.

@moonbox3 moonbox3 closed this Feb 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python: Agent.run(stream=True) ignores default_options response_format for final value parsing

3 participants