Python: Fix response_format resolution in streaming path to respect default_options#4307
Python: Fix response_format resolution in streaming path to respect default_options#4307moonbox3 wants to merge 1 commit intomicrosoft:mainfrom
Conversation
…icrosoft#4300) The streaming path in Agent.run(stream=True) used the raw options kwarg instead of the merged chat_options (which includes default_options) when building the finalizer for response_format parsing. This caused AgentResponse.value to be None when response_format was set via default_options. Fix by reading response_format from the merged ctx['chat_options'] (populated by _prepare_run_context) instead of the raw options parameter. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
moonbox3
left a comment
There was a problem hiding this comment.
Automated Code Review
Reviewers: 3 | Confidence: 80%
✗ Correctness
The new
_get_merged_response_formathelper has a short-circuit return whenctxis notNone: it returnsctx["chat_options"].get("response_format")immediately, even if that value isNone. This means that when a context exists but does not carryresponse_format(e.g. the caller setresponse_formatonly viadefault_options/optionsand the context was created without it), the fallback tooptions.get("response_format")is skipped andNoneis returned. This defeats the stated "merged" intent and can silently drop a user-providedresponse_format. Additionally,ctx["chat_options"]uses bracket access with no guard, so aKeyErrorwill be raised ifchat_optionsis ever absent from the context dict.
✗ Security Reliability
The change introduces a merged response_format lookup that prefers the context over caller-supplied options. The main reliability concern is that
ctx["chat_options"]uses bracket-key access, which will raise an unhandledKeyErrorif the context dict exists but lacks achat_optionskey—crashing mid-stream with no recovery. Otherwise the change is straightforward and the test coverage is adequate.
✗ Test Coverage
The two new tests cover the happy path for both streaming and non-streaming runs when
default_optionssuppliesresponse_format. However, the core behavioural change in_get_merged_response_format()introduces a priority rule (ctx chat_options wins overoptions) and a subtle edge case (when ctx exists but has noresponse_format, the function returnsNoneinstead of falling back tooptions). Neither of these scenarios has a corresponding test, leaving the most interesting merge/priority logic unverified.
Blocking Issues
- _get_merged_response_format returns early with None when ctx is present but ctx["chat_options"] has no response_format key, skipping the fallback to options. This can silently discard a user-supplied response_format.
- _get_merged_response_format accesses ctx["chat_options"] with bracket notation; if ctx is non-None but missing the "chat_options" key, an unhandled KeyError will propagate and crash the response stream.
- No test covers the priority/override scenario where both
ctx["chat_options"]andoptionssupply aresponse_format. This is the key new behaviour introduced by the change and must be verified. - No test covers the edge case where
ctxis notNonebutctx["chat_options"]does not containresponse_formatwhileoptionsdoes. In the current implementation this silently returnsNoneinstead of falling back tooptions, which may be a bug — either way the behaviour should be pinned by a test.
Suggestions
- Consider using ctx.get("chat_options", {}) instead of ctx["chat_options"] to avoid a potential KeyError if the context dict ever lacks that key.
- Use defensive dict access (e.g., ctx.get("chat_options", {}).get("response_format")) so a missing key falls through gracefully instead of raising.
- Consider adding a test where
run()is called with an explicitoptions={"response_format": A}whiledefault_options={"response_format": B}to assert which one wins. - Consider adding a test where
default_optionsis set (so ctx is populated) but does not includeresponse_format, whileoptionspassed to the streaming path does include one, to verify whether the fallback is intentional.
Automated review by moonbox3's agents
Python Test Coverage Report •
Python Unit Test Overview
|
||||||||||||||||||||||||||||||
There was a problem hiding this comment.
Pull request overview
This PR fixes a bug where Agent.run(stream=True) ignored default_options["response_format"] when parsing the final streamed response into structured output. The root cause was that the streaming finalizer was constructed using only per-call options before the merged context (which includes default_options) was available. The fix defers reading response_format until finalization time by replacing the eager partial() with a closure that accesses the merged chat_options from the context.
Changes:
- Fixed streaming agent finalizer to respect
default_optionsresponse_format by deferring resolution until merged context is available - Added regression tests for both streaming and non-streaming paths with default response_format
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
python/packages/core/agent_framework/_agents.py |
Added helper functions to defer response_format resolution until merged context is available; replaced eager partial() with closure-based finalizer |
python/packages/core/tests/core/test_agents.py |
Added regression tests verifying both streaming and non-streaming runs correctly parse structured output when response_format is set via default_options |
|
Duplicate of #4291, closing. |
Motivation and Context
When using
Agent.run(stream=True), the finalizer that parses the streamed response into a structured value only checked the per-calloptionsforresponse_format, completely ignoringdefault_options. This meant agents configured with a defaultresponse_format(e.g., a Pydantic model) would never parse the final streamed value, even though non-streaming runs worked correctly.Fixes #4300
Description
The root cause was that the streaming finalizer was constructed via
partial()at stream-creation time using only the per-calloptionsdict, before the merged context (which includesdefault_options) was available. The fix replaces the eagerpartialwith a closure (_finalize_with_merged_options) that defers readingresponse_formatuntil finalization time, when the merged context (ctx) is populated. It first checks the mergedchat_optionsfrom the context and falls back to the per-calloptions, matching the resolution order used in the non-streaming path. Two new tests verify that both streaming and non-streaming runs correctly parse structured output whenresponse_formatis set viadefault_options.Contribution Checklist