Fix an issue where LLM responses are not streamed or rendered properly in the AI Assistant. Fixes #9734, #9735
Walkthrough

Adds end-to-end LLM response streaming: provider/client streaming APIs, a streaming chat pipeline in the NLQ backend, SSE incremental events, frontend incremental Markdown/code rendering for streaming responses, tests for extraction/streaming, theme hyperlink tokens, and a release-note entry for the issue.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant NLQ_UI as NLQ Chat UI
    participant Backend as NLQ Backend
    participant ChatPipe as Chat Pipeline
    participant Provider as LLM Provider Client
    participant LLM as LLM Service
    User->>NLQ_UI: Submit question
    NLQ_UI->>Backend: POST user message
    Backend->>ChatPipe: chat_with_database_stream(...)
    ChatPipe->>Provider: chat_stream(messages, tools, system_prompt)
    Provider->>LLM: HTTP streaming request
    LLM-->>Provider: Stream chunks (SSE/NDJSON)
    loop per chunk
        alt text chunk
            Provider-->>ChatPipe: yield "text" (string)
            ChatPipe-->>Backend: emit SSE {type: "text_delta", content}
            Backend-->>NLQ_UI: SSE text_delta
            NLQ_UI->>NLQ_UI: parseMarkdownSegments + render
        else tool call
            Provider-->>ChatPipe: yield ('tool_use', ...)
            ChatPipe-->>Backend: emit SSE {type: "thinking"}
            Backend-->>NLQ_UI: SSE thinking
        end
    end
    LLM-->>Provider: stream end + final response
    Provider-->>ChatPipe: yield final LLMResponse
    ChatPipe->>ChatPipe: extract SQL from fenced code blocks (or JSON fallback)
    ChatPipe-->>Backend: final tuple (response_text, history)
    Backend-->>NLQ_UI: SSE {type: "complete", content, sql}
    NLQ_UI->>User: render final content + SQL
```
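As a reading aid, the per-chunk loop in the diagram can be sketched as a small translator from pipeline yields to SSE payloads. The function name and the exact yield shapes here are assumptions derived from the diagram, not the actual pgAdmin code:

```python
def sse_events(stream):
    """Map chat-pipeline yields onto the SSE event types in the diagram."""
    for item in stream:
        if isinstance(item, str):
            # incremental text delta from the provider
            yield {'type': 'text_delta', 'content': item}
        elif item and item[0] == 'tool_use':
            # a tool call is in flight; the UI shows a "thinking" state
            yield {'type': 'thinking'}
        else:
            # final (response_text, history) tuple ends the stream
            response_text, _history = item
            yield {'type': 'complete', 'content': response_text}
```

Each dict would then be serialized as one `data:` line of the SSE response.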
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 6
🧹 Nitpick comments (3)
web/pgadmin/llm/providers/ollama.py (2)
321-329: Preserve exception context with `raise ... from e`.

The exception re-raise loses the original traceback. Use `from e` (or `from None` if you intentionally want to suppress the original) to maintain the exception chain for easier debugging.

♻️ Proposed fix
```diff
 try:
     yield from self._process_stream(payload)
 except LLMClientError:
     raise
 except Exception as e:
     raise LLMClientError(LLMError(
-        message=f"Streaming request failed: {str(e)}",
+        message=f"Streaming request failed: {e!s}",
         provider=self.provider_name
-    ))
+    )) from e
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/ollama.py` around lines 321 - 329, The exception handler in the streaming path currently raises a new LLMClientError losing the original traceback; update the except Exception as e block to re-raise the constructed LLMClientError using exception chaining (raise ... from e) so the original exception context is preserved; apply this change where the yield from self._process_stream(payload) is wrapped and the LLMClientError(LLMError(..., provider=self.provider_name)) is created so the new raise uses "from e" to keep the original traceback.
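For illustration, here is a minimal standalone example of the `raise ... from e` pattern this comment recommends; the names are invented and not from the pgAdmin codebase:

```python
def stream_request():
    # simulate a low-level failure wrapped in a higher-level error
    try:
        int("not-a-number")
    except ValueError as e:
        # "from e" stores the original error on __cause__, so the
        # full traceback chain survives the re-raise
        raise RuntimeError("Streaming request failed") from e
```

With chaining, the traceback prints both errors joined by "The above exception was the direct cause of the following exception", instead of only the wrapper.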
346-364: Preserve exception context in `_process_stream` error handlers.

Both HTTPError and URLError handlers lose the original exception context. Add `from e` to maintain the traceback chain.

♻️ Proposed fix
```diff
 except urllib.error.HTTPError as e:
     error_body = e.read().decode('utf-8')
     try:
         error_data = json.loads(error_body)
         error_msg = error_data.get('error', str(e))
     except json.JSONDecodeError:
         error_msg = error_body or str(e)
     raise LLMClientError(LLMError(
         message=error_msg,
         code=str(e.code),
         provider=self.provider_name,
         retryable=e.code in (429, 500, 502, 503, 504)
-    ))
+    )) from e
 except urllib.error.URLError as e:
     raise LLMClientError(LLMError(
         message=f"Cannot connect to Ollama: {e.reason}",
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from e
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/ollama.py` around lines 346 - 364, In _process_stream, the except handlers for urllib.error.HTTPError and urllib.error.URLError (the blocks that build and raise LLMClientError wrapping LLMError) are losing the original exception context; update both raises to use exception chaining (raise LLMClientError(...) from e) so the original traceback is preserved — specifically modify the HTTPError handler that parses error_body/error_data and raises LLMClientError(LLMError(...)) and the URLError handler that raises LLMClientError(LLMError(...)) to append "from e".

web/pgadmin/llm/chat.py (1)
157-166: Generator return type annotation could be more precise.

The type annotation `Generator[Union[str, tuple[str, list[Message]]], None, None]` doesn't capture the `('tool_use', list[str])` tuple variant that's yielded at line 229. Consider using a more explicit union or documenting the intermediate tuple type.

💡 Suggested type improvement
```diff
+# Type alias for streaming yields
+StreamYield = Union[str, tuple[str, list[str]], tuple[str, list[Message]]]
+
 def chat_with_database_stream(
     user_message: str,
     sid: int,
     did: int,
     conversation_history: Optional[list[Message]] = None,
     system_prompt: Optional[str] = None,
     max_tool_iterations: Optional[int] = None,
     provider: Optional[str] = None,
     model: Optional[str] = None
-) -> Generator[Union[str, tuple[str, list[Message]]], None, None]:
+) -> Generator[StreamYield, None, None]:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/chat.py` around lines 157 - 166, The Generator return type for chat_with_database_stream is too broad and doesn't include the ('tool_use', list[str]) intermediate yield; update the annotation to explicitly enumerate all yielded variants (e.g. include Tuple[Literal['tool_use'], List[str]] and the existing Tuple[str, List[Message]] plus str) using typing.Tuple, typing.List and typing.Literal (or a small Union alias) so the signature reads something like Generator[Union[str, Tuple[Literal['tool_use'], List[str]], Tuple[str, List[Message]]], None, None]; change only the annotation on chat_with_database_stream and import any required typing names.
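A self-contained sketch of the `Literal`-based annotation the prompt suggests, with `Message` replaced by plain `dict` so the snippet runs on its own (Python 3.9+ for builtin generics):

```python
from typing import Generator, Literal, Union

# Hypothetical alias mirroring the suggestion above
StreamYield = Union[
    str,                                    # incremental text delta
    tuple[Literal['tool_use'], list[str]],  # tool-call notification
    tuple[str, list[dict]],                 # final (response_text, history)
]

def stream() -> Generator[StreamYield, None, None]:
    yield "partial text"
    yield ('tool_use', ['get_schema'])
    yield ("final", [])
```

`Literal['tool_use']` pins the tag of the intermediate tuple, so type checkers can narrow on `item[0] == 'tool_use'`.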
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@web/pgadmin/llm/providers/anthropic.py`:
- Around line 422-457: The streaming handler inside _parse_response currently
concatenates every text_delta which can collapse boundaries between content
blocks; modify the event handling around event_type == 'content_block_start' /
'content_block_delta' / 'content_block_stop' to preserve block separators by
tracking when a new content block begins (use current_tool_block or a new flag
like in_block) and, if content_parts already contains data, emit a separator
(e.g., '\n\n' or the same separator used when the top-level response joins
blocks) before yielding subsequent text deltas; ensure this logic integrates
with the existing variables content_parts, tool_input_json and yield text so the
final LLMResponse reconstruction retains original block boundaries.
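The separator-tracking idea described in that prompt could be sketched as follows; the event names follow Anthropic's streaming vocabulary, but the function itself is hypothetical:

```python
def stream_text(events):
    """Yield text deltas, restoring a blank-line boundary between blocks.

    `events` is a sequence of (event_type, data) pairs.
    """
    emitted_any = False
    for etype, data in events:
        if etype == 'content_block_start' and emitted_any:
            # a previous block already produced text: keep the boundary
            yield '\n\n'
        elif etype == 'content_block_delta':
            yield data
            emitted_any = True
```

Joining the yields then reconstructs the same text a non-streaming response would produce when blocks are joined with `'\n\n'`.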
In `@web/pgadmin/llm/providers/docker.py`:
- Around line 535-566: The streaming finalizer currently always yields an
LLMResponse (the block constructing content, tool_calls, stop_reason and
yielding LLMResponse), which can convert stream failures into empty assistant
messages; change it to mirror _parse_response() safeguards by: after building
content, tool_calls (using the same json.loads with JSONDecodeError fallback as
used for ToolCall construction) and mapping finish_reason via stop_reason_map,
detect error/empty-stream conditions (no content and no tool_calls OR
finish_reason indicates an error/blank result) and raise LLMClientError instead
of yielding an empty LLMResponse; otherwise yield the LLMResponse as before,
keeping the same fields (content, tool_calls, stop_reason, model, usage).
- Around line 411-423: The code currently uses self._api_url directly for
server-side requests; add a shared helper (e.g., validate_loopback_api_url(url))
and call it from both the sync and streaming request paths before constructing
the urllib Request. The helper should parse the URL (urllib.parse.urlparse),
ensure scheme is 'http' or 'https', and ensure the netloc hostname is loopback
(127.0.0.1, ::1, or 'localhost' and their bracketed IPv6 forms) — otherwise
raise ValueError or return an error so the request is rejected; update the
places that use self._api_url (the streaming path and the sync path that builds
url = f'{self._api_url}/engines/v1/chat/completions') to call this helper and
fail-fast if validation fails.
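The `validate_loopback_api_url` helper described above could be sketched like this; the function name comes from the comment, and the accepted host list is an assumption:

```python
from urllib.parse import urlparse

def validate_loopback_api_url(url):
    """Reject API URLs that do not point at the local machine."""
    parsed = urlparse(url)
    if parsed.scheme not in ('http', 'https'):
        raise ValueError(f"Unsupported scheme: {parsed.scheme!r}")
    # urlparse strips brackets from IPv6 hosts, so '[::1]' compares as '::1'
    if parsed.hostname not in ('127.0.0.1', '::1', 'localhost'):
        raise ValueError(f"Non-loopback host rejected: {parsed.hostname!r}")
    return url
```

Calling it at the top of both the sync and streaming request paths gives the fail-fast behavior the comment asks for.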
In `@web/pgadmin/llm/providers/openai.py`:
- Around line 524-556: The streaming finalizer currently always yields an
LLMResponse even if the stream produced no usable output; mirror the safeguards
in _parse_response() by validating content_parts, tool_calls_data and
finish_reason before yielding: after building content = ''.join(content_parts)
and tool_calls from tool_calls_data, if both content is empty and tool_calls is
empty or the finish_reason indicates an error, raise the same exception
type/logic used by _parse_response() (instead of yielding) so callers receive an
error; use the same variables (content_parts, tool_calls_data, finish_reason,
model_name, usage) and only yield LLMResponse when those checks pass.
In `@web/pgadmin/tools/sqleditor/__init__.py`:
- Around line 2929-2931: The current join that builds the variable sql from
sql_blocks uses "';\n\n'.join(block.strip() for block in sql_blocks)" which can
produce double-semicolons if blocks already end with ';'; update the logic that
constructs sql to first strip trailing semicolons and whitespace from each block
(e.g. normalize each block with something like
block.rstrip().rstrip(';').strip()) before joining, then join with a single
';\n\n' to ensure only a single semicolon separates blocks; modify the
assignment that computes sql (the sql and sql_blocks variables) accordingly.
In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py`:
- Around line 193-201: The SQL extraction is joining blocks that already include
trailing semicolons, causing duplicated semicolons; update the join that assigns
to the variable named sql (the expression iterating over sql_blocks) to remove
trailing semicolons from each block before joining (e.g., call strip() then
rstrip(';') on each block) so the final join uses cleaned blocks separated by
';\n\n'.
---
Nitpick comments:
In `@web/pgadmin/llm/chat.py`:
- Around line 157-166: The Generator return type for chat_with_database_stream
is too broad and doesn't include the ('tool_use', list[str]) intermediate yield;
update the annotation to explicitly enumerate all yielded variants (e.g. include
Tuple[Literal['tool_use'], List[str]] and the existing Tuple[str, List[Message]]
plus str) using typing.Tuple, typing.List and typing.Literal (or a small Union
alias) so the signature reads something like Generator[Union[str,
Tuple[Literal['tool_use'], List[str]], Tuple[str, List[Message]]], None, None];
change only the annotation on chat_with_database_stream and import any required
typing names.
In `@web/pgadmin/llm/providers/ollama.py`:
- Around line 321-329: The exception handler in the streaming path currently
raises a new LLMClientError losing the original traceback; update the except
Exception as e block to re-raise the constructed LLMClientError using exception
chaining (raise ... from e) so the original exception context is preserved;
apply this change where the yield from self._process_stream(payload) is wrapped
and the LLMClientError(LLMError(..., provider=self.provider_name)) is created so
the new raise uses "from e" to keep the original traceback.
- Around line 346-364: In _process_stream, the except handlers for
urllib.error.HTTPError and urllib.error.URLError (the blocks that build and
raise LLMClientError wrapping LLMError) are losing the original exception
context; update both raises to use exception chaining (raise LLMClientError(...)
from e) so the original traceback is preserved — specifically modify the
HTTPError handler that parses error_body/error_data and raises
LLMClientError(LLMError(...)) and the URLError handler that raises
LLMClientError(LLMError(...)) to append "from e".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 45878a7f-70de-4cc0-82bd-4bb0c8b48eb3
📒 Files selected for processing (14)
docs/en_US/release_notes_9_14.rst
web/pgadmin/llm/chat.py
web/pgadmin/llm/client.py
web/pgadmin/llm/prompts/nlq.py
web/pgadmin/llm/providers/anthropic.py
web/pgadmin/llm/providers/docker.py
web/pgadmin/llm/providers/ollama.py
web/pgadmin/llm/providers/openai.py
web/pgadmin/static/js/Theme/dark.js
web/pgadmin/static/js/Theme/high_contrast.js
web/pgadmin/static/js/Theme/light.js
web/pgadmin/tools/sqleditor/__init__.py
web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx
web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
web/pgadmin/tools/sqleditor/__init__.py (1)
2875-2877: ⚠️ Potential issue | 🟠 Major

`conversation_id` is stateless right now.

`chat_with_database_stream()` returns `(response_text, messages)`, but this handler drops `messages` and only echoes/generates an ID. Follow-up turns therefore cannot replay prior tool calls/results, so multi-turn NLQ chats will silently lose context. Persist the returned history by `conversation_id`, or return it to the client and feed it back into `conversation_history` on the next request.

Also applies to: 2919-2954
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/__init__.py` around lines 2875 - 2877, The handler currently discards the messages returned by chat_with_database_stream() and only emits a generated conversation_id, losing tool-call history; update the request flow in the sqleditor handler to persist or round-trip the returned conversation history: capture the (response_text, messages) tuple from chat_with_database_stream(), then either (a) store messages keyed by conversation_id in a persistent/in-memory store (so future requests can load conversation_history by conversation_id) or (b) include the messages in the JSON response to the client and accept conversation_history on the next request; apply the same change to the identical logic block referenced around lines 2919-2954 so conversation_history is preserved across turns (use the symbols conversation_id, conversation_history, and chat_with_database_stream to locate and modify the code).
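Option (a) from the prompt above could be prototyped with an in-memory store like the following; a real fix would need per-user scoping and expiry, and none of these names exist in pgAdmin today:

```python
import uuid

_conversations = {}  # conversation_id -> prior message history

def save_history(messages, conversation_id=None):
    """Persist the history returned by the chat pipeline."""
    cid = conversation_id or str(uuid.uuid4())
    _conversations[cid] = messages
    return cid

def load_history(conversation_id):
    """Replay prior turns on the next request, or start fresh."""
    return _conversations.get(conversation_id, [])
```

The handler would call `load_history()` before invoking the stream and `save_history()` with the final `(response_text, messages)` tuple afterwards.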
♻️ Duplicate comments (1)
web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py (1)
185-242: ⚠️ Potential issue | 🟠 Major

Mirror the production SQL normalization in this test.

This helper still uses `block.strip()`, while the endpoint now uses `block.strip().rstrip(';')`, which is why CI is hitting `SELECT * FROM users;;`. Update the join here to match production and drop the trailing-semicolon expectations for the single/final block cases as well.

🔧 Proposed fix
````diff
 ('SQL Extraction - Single SQL block', dict(
     response_text=(
         'Here is the query:\n\n'
         '```sql\nSELECT * FROM users;\n```\n\n'
         'This returns all users.'
     ),
-    expected_sql='SELECT * FROM users;'
+    expected_sql='SELECT * FROM users'
 )),
@@
 ('SQL Extraction - pgsql language tag', dict(
     response_text='```pgsql\nSELECT 1;\n```',
-    expected_sql='SELECT 1;'
+    expected_sql='SELECT 1'
 )),
 ('SQL Extraction - postgresql language tag', dict(
     response_text='```postgresql\nSELECT 1;\n```',
-    expected_sql='SELECT 1;'
+    expected_sql='SELECT 1'
 )),
@@
     expected_sql=(
         'SELECT u.name, o.total\n'
         'FROM users u\n'
         'JOIN orders o ON u.id = o.user_id\n'
-        'WHERE o.total > 100;'
+        'WHERE o.total > 100'
     )
 )),
@@
 sql = ';\n\n'.join(
-    block.strip() for block in sql_blocks
+    block.strip().rstrip(';') for block in sql_blocks
 ) if sql_blocks else None
````

Also applies to: 253-261
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py` around lines 185 - 242, The test expectations and the SQL-joining helper in test_nlq_chat.py must mirror production normalization: change the join that builds sql from sql_blocks to strip trailing semicolons (use block.strip().rstrip(';') when joining into the variable sql) and update expected_sql strings for the cases 'SQL Extraction - Single SQL block', 'SQL Extraction - pgsql language tag', 'SQL Extraction - postgresql language tag' and the multiline case so they no longer include a trailing semicolon (e.g. 'SELECT * FROM users' instead of 'SELECT * FROM users;'); apply the same change to the other helper occurrence referenced around lines 253-261.
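Putting the fix together, the extraction-plus-normalization behavior the review asks for could look like this sketch; the regex and the function name are illustrative, built from the language tags the tests exercise:

```python
import re

FENCE = '`' * 3  # avoids a literal triple backtick inside this example
SQL_BLOCK_RE = re.compile(
    FENCE + r'(?:sql|pgsql|postgresql)\n(.*?)' + FENCE,
    re.DOTALL,
)

def extract_sql(response_text):
    """Pull fenced SQL blocks and join them with single semicolons."""
    sql_blocks = SQL_BLOCK_RE.findall(response_text)
    # rstrip(';') prevents 'SELECT 1;;' when a block already ends with ';'
    return ';\n\n'.join(
        block.strip().rstrip(';') for block in sql_blocks
    ) if sql_blocks else None
```

Because each block is stripped of its trailing semicolon before the `';\n\n'` join, exactly one semicolon separates consecutive statements and the final statement carries none.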
🧹 Nitpick comments (11)
web/pgadmin/llm/providers/openai.py (3)
383-387: Preserve exception chain for debugging.

When re-raising exceptions, use `raise ... from e` to preserve the original traceback. This pattern appears multiple times in the streaming methods.

♻️ Suggested fix
```diff
 except Exception as e:
     raise LLMClientError(LLMError(
-        message=f"Streaming request failed: {str(e)}",
+        message=f"Streaming request failed: {e!s}",
         provider=self.provider_name
-    ))
+    )) from e
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/openai.py` around lines 383 - 387, The exception handling in the OpenAI streaming code wraps errors into LLMClientError/LLMError but drops the original traceback; update the except blocks (e.g. the handler that currently does "except Exception as e: raise LLMClientError(LLMError(...))" which references self.provider_name and constructs LLMError) to re-raise using "raise LLMClientError(LLMError(...)) from e" so the original exception chain is preserved; apply the same change to all similar streaming-method handlers in this file that catch Exception and raise LLMClientError.
550-555: The empty response safeguard is implemented; consider aligning error detail with the non-streaming path.

The check for empty responses now correctly raises an error, addressing the previous review concern. However, the non-streaming `_parse_response` provides more specific error messages based on `finish_reason` (e.g., detailed guidance about model context limits when `MAX_TOKENS` is hit). Consider mirroring that logic for consistency:

♻️ Optional: align error handling with _parse_response
```diff
 if not content and not tool_calls:
+    if stop_reason == StopReason.MAX_TOKENS:
+        raise LLMClientError(LLMError(
+            message=f'Response truncated due to token limit '
+                    f'(input: {usage.input_tokens} tokens). '
+                    f'The request is too large for model '
+                    f'{self._model}. '
+                    f'Try using a model with a larger context '
+                    f'window, or analyze a smaller scope.',
+            code='max_tokens',
+            provider=self.provider_name,
+            retryable=False
+        ))
+    elif finish_reason and finish_reason not in ('stop', 'tool_calls'):
+        raise LLMClientError(LLMError(
+            message=f'Empty response with finish reason: {finish_reason}',
+            code=finish_reason,
+            provider=self.provider_name,
+            retryable=False
+        ))
     raise LLMClientError(LLMError(
         message='No response content returned from API',
         provider=self.provider_name,
         retryable=False
     ))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/openai.py` around lines 550 - 555, The empty-response check currently raises a generic LLMClientError; update it to mirror the non-streaming _parse_response logic by inspecting the response's finish_reason (and any available model error metadata) before raising so you emit the same detailed guidance (e.g., MAX_TOKENS / context-limit hints) and include provider=self.provider_name and any diagnostic fields; modify the block that raises LLMClientError (the conditional that checks content and tool_calls) to build the LLMError message using the same finish_reason-driven messages and details used in _parse_response so streaming and non-streaming paths produce consistent errors.
418-436: Add exception chaining to preserve tracebacks.

Similar to the catch-all handler above, these exception handlers should chain the original exception.

♻️ Suggested fixes
```diff
 except urllib.error.HTTPError as e:
     # ... error parsing ...
     raise LLMClientError(LLMError(
         message=error_msg,
         code=str(e.code),
         provider=self.provider_name,
         retryable=e.code in (429, 500, 502, 503, 504)
-    ))
+    )) from e
 except urllib.error.URLError as e:
     raise LLMClientError(LLMError(
         message=f"Connection error: {e.reason}",
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from e
 except socket.timeout:
     raise LLMClientError(LLMError(
         message="Request timed out.",
         code='timeout',
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/openai.py` around lines 418 - 436, The exception handlers that raise LLMClientError/LLMError should preserve original tracebacks by using exception chaining: in the HTTP/URLError handlers add "raise LLMClientError(... ) from e" (they already reference e), and change "except socket.timeout:" to "except socket.timeout as e:" then raise the LLMClientError with "from e". Ensure this is applied to the handlers that build LLMError (the ones referencing e.code, the urllib.error.URLError handler, and the socket.timeout handler) so the original exception is chained to the new LLMClientError.

web/pgadmin/llm/providers/docker.py (5)
51-64: `ValueError` propagation may be inconsistent with caller expectations.

From the context in `client.py`, `get_llm_client()` instantiates `DockerClient` without catching `ValueError`. Other error paths in this module raise `LLMClientError`. A `ValueError` here will propagate as an unexpected exception type to higher-level handlers.

Consider wrapping the validation in `__init__` to convert `ValueError` to `LLMClientError`, or catch it in `get_llm_client()` for consistency.

♻️ Option: Wrap in __init__ for consistency
```diff
 self._api_url = (api_url or DEFAULT_API_URL).rstrip('/')
-_validate_loopback_url(self._api_url)
+try:
+    _validate_loopback_url(self._api_url)
+except ValueError as e:
+    raise LLMClientError(LLMError(
+        message=str(e),
+        provider='docker',
+        retryable=False
+    )) from e
 self._model = model or DEFAULT_MODEL
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/docker.py` around lines 51 - 64, The _validate_loopback_url function currently raises ValueError which can leak to callers; convert that into the module's LLMClientError before it escapes by catching ValueError in the DockerClient initialization (DockerClient.__init__) or by having get_llm_client() wrap the call that triggers _validate_loopback_url in a try/except that catches ValueError and re-raises LLMClientError with a clear message; ensure you reference _validate_loopback_url, DockerClient, get_llm_client, and LLMClientError so callers always receive LLMClientError for validation failures.
418-422: Add exception chaining for better debugging.

Use `raise ... from e` to preserve the original exception traceback, which helps with debugging.

♻️ Proposed fix
```diff
 except Exception as e:
     raise LLMClientError(LLMError(
-        message=f"Streaming request failed: {str(e)}",
+        message=f"Streaming request failed: {e!s}",
         provider=self.provider_name
-    ))
+    )) from e
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/docker.py` around lines 418 - 422, The except block that currently raises LLMClientError(LLMError(...)) should preserve the original exception traceback by using exception chaining; modify the raise to use "raise LLMClientError(LLMError(message=f'Streaming request failed: {str(e)}', provider=self.provider_name)) from e" so the original exception e is attached for better debugging.
515-521: Simplify redundant key check.

The `if 'usage' in data and data['usage']` check is redundant. Using `data.get('usage')` achieves the same result more concisely.

♻️ Proposed fix
```diff
-if 'usage' in data and data['usage']:
-    u = data['usage']
+usage_data = data.get('usage')
+if usage_data:
     usage = Usage(
-        input_tokens=u.get('prompt_tokens', 0),
-        output_tokens=u.get('completion_tokens', 0),
-        total_tokens=u.get('total_tokens', 0)
+        input_tokens=usage_data.get('prompt_tokens', 0),
+        output_tokens=usage_data.get('completion_tokens', 0),
+        total_tokens=usage_data.get('total_tokens', 0)
     )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/docker.py` around lines 515 - 521, Replace the redundant check "if 'usage' in data and data['usage']" with a single call using data.get('usage') to retrieve the usage dict (e.g., u = data.get('usage')), then proceed to construct the Usage(...) only when u is truthy; update the block around the variables data, u, and Usage to use this simplified check so behavior remains identical but the code is more concise.
454-474: Add exception chaining in error handlers.

Multiple `except` blocks re-raise without preserving the original exception chain. Adding `from e` or `from None` improves debuggability.

♻️ Proposed fix for exception chaining
```diff
     raise LLMClientError(LLMError(
         message=error_msg,
         code=str(e.code),
         provider=self.provider_name,
         retryable=e.code in (429, 500, 502, 503, 504)
-    ))
+    )) from e
 except urllib.error.URLError as e:
     raise LLMClientError(LLMError(
         message=f"Connection error: {e.reason}. "
                 f"Is Docker Model Runner running at "
                 f"{self._api_url}?",
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from e
 except socket.timeout:
     raise LLMClientError(LLMError(
         message="Request timed out.",
         code='timeout',
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/docker.py` around lines 454 - 474, The except blocks that raise LLMClientError wrapping an LLMError (the urllib.error.URLError and socket.timeout handlers and the earlier exception handler that uses variable e) must preserve exception chaining so the original traceback is kept; update each raise to use exception chaining (e.g., raise LLMClientError(LLMError(...)) from e for handlers that capture exception as e, and use an explicit "from None" only where you intentionally want to suppress context), targeting the raise statements that construct LLMClientError/LLMError in the docker provider method where urllib.error.URLError, socket.timeout, and the earlier exception variable e are handled.
581-594: Consider mirroring the `MAX_TOKENS` truncation check from `_parse_response`.

The empty-stream check is good, but `_parse_response()` (lines 344-372) has additional logic that distinguishes `MAX_TOKENS` truncation with a specific error message including token counts. The streaming path could benefit from similar handling for better user feedback when responses are truncated.

♻️ Example enhancement
```diff
 if not content and not tool_calls:
+    if stop_reason == StopReason.MAX_TOKENS:
+        raise LLMClientError(LLMError(
+            message=(
+                f'Response truncated due to token limit '
+                f'(input: {usage.input_tokens} tokens). '
+                f'Try using a model with a larger context '
+                f'window, or analyze a smaller scope.'
+            ),
+            code='max_tokens',
+            provider=self.provider_name,
+            retryable=False
+        ))
     raise LLMClientError(LLMError(
         message='No response content returned from API',
         provider=self.provider_name,
         retryable=False
     ))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/docker.py` around lines 581 - 594, When handling the streaming path after the empty-stream check, mirror the MAX_TOKENS truncation handling from _parse_response by detecting when stop_reason (or usage) indicates the model hit the token limit (compare against MAX_TOKENS) and surface the same specific LLMClientError message that includes token counts; update the block that currently yields LLMResponse (referencing content, tool_calls, stop_reason, model_name, usage) to either raise LLMClientError with the detailed "truncated by MAX_TOKENS" message (like _parse_response) or augment the yielded response to include that truncation detail so callers get identical feedback. Ensure you use the same MAX_TOKENS constant and message format as in _parse_response and preserve existing behavior for non-truncated responses.

web/pgadmin/llm/providers/anthropic.py (2)
308-314: Add exception chaining to preserve traceback context.

When re-raising exceptions, use `from e` (or `from None` if intentionally suppressing) to maintain the exception chain. This helps with debugging by preserving the original traceback.

♻️ Proposed fix
```diff
 try:
     yield from self._process_stream(payload)
 except LLMClientError:
     raise
 except Exception as e:
     raise LLMClientError(LLMError(
         message=f"Streaming request failed: {e}",
         provider=self.provider_name
-    ))
+    )) from e
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/anthropic.py` around lines 308 - 314, The except block catching Exception in the Anthropic provider currently raises a new LLMClientError without chaining the original exception; update the catch in the method handling streaming (the block referencing LLMClientError, LLMError and self.provider_name) to re-raise the new LLMClientError using "from e" (i.e., raise LLMClientError(LLMError(...)) from e) so the original traceback is preserved for debugging.
337-364: Add exception chaining to all exception handlers.

All three exception handlers re-raise without chaining, which loses the original traceback context.

♻️ Proposed fix
```diff
 except urllib.error.HTTPError as e:
     error_body = e.read().decode('utf-8')
     try:
         error_data = json.loads(error_body)
         error_msg = error_data.get(
             'error', {}
         ).get('message', str(e))
     except json.JSONDecodeError:
         error_msg = error_body or str(e)
     raise LLMClientError(LLMError(
         message=error_msg,
         code=str(e.code),
         provider=self.provider_name,
         retryable=e.code in (429, 500, 502, 503, 504)
-    ))
+    )) from e
 except urllib.error.URLError as e:
     raise LLMClientError(LLMError(
         message=f"Connection error: {e.reason}",
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from e
 except socket.timeout:
     raise LLMClientError(LLMError(
         message="Request timed out.",
         code='timeout',
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from None
```

Note: For `socket.timeout`, use `from None` since the original exception doesn't add useful context (the message already explains it's a timeout).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/anthropic.py` around lines 337 - 364, The exception handlers for urllib.error.HTTPError, urllib.error.URLError, and socket.timeout in the anthropic provider currently re-raise LLMClientError without chaining the original exception and thus lose traceback; update the three raises so the HTTPError and URLError handlers use "raise LLMClientError(... ) from e" (referencing urllib.error.HTTPError and urllib.error.URLError and the LLMClientError/LLMError construction and self.provider_name) and change the socket.timeout handler to "raise LLMClientError(... ) from None" per the note so the original exception context is preserved or suppressed appropriately.web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py (1)
326-332: Add a `tool_use` streaming scenario.
This generator never emits `('tool_use', ...)`, so the new branch that should surface a second `thinking` event is still untested. One scenario that injects a tool-use tuple between chunks would lock down the SSE protocol added in this PR.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py` around lines 326 - 332, The mock_stream_gen generator never emits a ('tool_use', ...) tuple so the new branch that should produce a second "thinking" event is not exercised; update mock_stream_gen to yield the same text chunks but inject a ('tool_use', <tool payload>) tuple between chunk yields (e.g., after the first or second chunk) before yielding the final (self.mock_response, []) tuple so the test for the SSE tool_use streaming path and the extra thinking event is exercised; refer to mock_stream_gen and the existing yields to locate where to insert the ('tool_use', ...) emission.
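A sketch of the requested scenario, assuming the stream protocol described in this PR (plain strings for text deltas, a `('tool_use', ...)` tuple for tool events, then a final `(response, history)` pair); all payloads below are invented placeholders:

```python
# Hypothetical mock mirroring the review's request: text chunks with a
# ('tool_use', ...) tuple injected between them, then the final
# (response, history) pair. All payloads are invented placeholders.
def mock_stream_gen():
    yield "Here is the "
    # Injected tool-use event: should surface a second "thinking" event
    yield ('tool_use', {'name': 'describe_table',
                        'input': {'table': 'users'}})
    yield "query you asked for:"
    yield ({'content': 'final response'}, [])  # final (response, history)


events = list(mock_stream_gen())
text_chunks = [e for e in events if isinstance(e, str)]
tool_events = [e for e in events
               if isinstance(e, tuple) and e[0] == 'tool_use']
```

A test built on this mock can then assert that exactly one extra "thinking" SSE event is emitted for the tool-use tuple, locking down the branch the comment describes.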
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@web/pgadmin/tools/sqleditor/__init__.py`:
- Around line 2875-2877: The handler currently discards the messages returned by
chat_with_database_stream() and only emits a generated conversation_id, losing
tool-call history; update the request flow in the sqleditor handler to persist
or round-trip the returned conversation history: capture the (response_text,
messages) tuple from chat_with_database_stream(), then either (a) store messages
keyed by conversation_id in a persistent/in-memory store (so future requests can
load conversation_history by conversation_id) or (b) include the messages in the
JSON response to the client and accept conversation_history on the next request;
apply the same change to the identical logic block referenced around lines
2919-2954 so conversation_history is preserved across turns (use the symbols
conversation_id, conversation_history, and chat_with_database_stream to locate
and modify the code).
---
Duplicate comments:
In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py`:
- Around line 185-242: The test expectations and the SQL-joining helper in
test_nlq_chat.py must mirror production normalization: change the join that
builds sql from sql_blocks to strip trailing semicolons (use
block.strip().rstrip(';') when joining into the variable sql) and update
expected_sql strings for the cases 'SQL Extraction - Single SQL block', 'SQL
Extraction - pgsql language tag', 'SQL Extraction - postgresql language tag' and
the multiline case so they no longer include a trailing semicolon (e.g. 'SELECT
* FROM users' instead of 'SELECT * FROM users;'); apply the same change to the
other helper occurrence referenced around lines 253-261.
---
Nitpick comments:
In `@web/pgadmin/llm/providers/anthropic.py`:
- Around line 308-314: The except block catching Exception in the Anthropic
provider currently raises a new LLMClientError without chaining the original
exception; update the catch in the method handling streaming (the block
referencing LLMClientError, LLMError and self.provider_name) to re-raise the new
LLMClientError using "from e" (i.e., raise LLMClientError(LLMError(...)) from e)
so the original traceback is preserved for debugging.
- Around line 337-364: The exception handlers for urllib.error.HTTPError,
urllib.error.URLError, and socket.timeout in the anthropic provider currently
re-raise LLMClientError without chaining the original exception and thus lose
traceback; update the three raises so the HTTPError and URLError handlers use
"raise LLMClientError(... ) from e" (referencing urllib.error.HTTPError and
urllib.error.URLError and the LLMClientError/LLMError construction and
self.provider_name) and change the socket.timeout handler to "raise
LLMClientError(... ) from None" per the note so the original exception context
is preserved or suppressed appropriately.
In `@web/pgadmin/llm/providers/docker.py`:
- Around line 51-64: The _validate_loopback_url function currently raises
ValueError which can leak to callers; convert that into the module's
LLMClientError before it escapes by catching ValueError in the DockerClient
initialization (DockerClient.__init__) or by having get_llm_client() wrap the
call that triggers _validate_loopback_url in a try/except that catches
ValueError and re-raises LLMClientError with a clear message; ensure you
reference _validate_loopback_url, DockerClient, get_llm_client, and
LLMClientError so callers always receive LLMClientError for validation failures.
- Around line 418-422: The except block that currently raises
LLMClientError(LLMError(...)) should preserve the original exception traceback
by using exception chaining; modify the raise to use "raise
LLMClientError(LLMError(message=f'Streaming request failed: {str(e)}',
provider=self.provider_name)) from e" so the original exception e is attached
for better debugging.
- Around line 515-521: Replace the redundant check "if 'usage' in data and
data['usage']" with a single call using data.get('usage') to retrieve the usage
dict (e.g., u = data.get('usage')), then proceed to construct the Usage(...)
only when u is truthy; update the block around the variables data, u, and Usage
to use this simplified check so behavior remains identical but the code is more
concise.
- Around line 454-474: The except blocks that raise LLMClientError wrapping an
LLMError (the urllib.error.URLError and socket.timeout handlers and the earlier
exception handler that uses variable e) must preserve exception chaining so the
original traceback is kept; update each raise to use exception chaining (e.g.,
raise LLMClientError(LLMError(...)) from e for handlers that capture exception
as e, and use an explicit "from None" only where you intentionally want to
suppress context), targeting the raise statements that construct
LLMClientError/LLMError in the docker provider method where
urllib.error.URLError, socket.timeout, and the earlier exception variable e are
handled.
- Around line 581-594: When handling the streaming path after the empty-stream
check, mirror the MAX_TOKENS truncation handling from _parse_response by
detecting when stop_reason (or usage) indicates the model hit the token limit
(compare against MAX_TOKENS) and surface the same specific LLMClientError
message that includes token counts; update the block that currently yields
LLMResponse (referencing content, tool_calls, stop_reason, model_name, usage) to
either raise LLMClientError with the detailed "truncated by MAX_TOKENS" message
(like _parse_response) or augment the yielded response to include that
truncation detail so callers get identical feedback. Ensure you use the same
MAX_TOKENS constant and message format as in _parse_response and preserve
existing behavior for non-truncated responses.
In `@web/pgadmin/llm/providers/openai.py`:
- Around line 383-387: The exception handling in the OpenAI streaming code wraps
errors into LLMClientError/LLMError but drops the original traceback; update the
except blocks (e.g. the handler that currently does "except Exception as e:
raise LLMClientError(LLMError(...))" which references self.provider_name and
constructs LLMError) to re-raise using "raise LLMClientError(LLMError(...)) from
e" so the original exception chain is preserved; apply the same change to all
similar streaming-method handlers in this file that catch Exception and raise
LLMClientError.
- Around line 550-555: The empty-response check currently raises a generic
LLMClientError; update it to mirror the non-streaming _parse_response logic by
inspecting the response's finish_reason (and any available model error metadata)
before raising so you emit the same detailed guidance (e.g., MAX_TOKENS /
context-limit hints) and include provider=self.provider_name and any diagnostic
fields; modify the block that raises LLMClientError (the conditional that checks
content and tool_calls) to build the LLMError message using the same
finish_reason-driven messages and details used in _parse_response so streaming
and non-streaming paths produce consistent errors.
- Around line 418-436: The exception handlers that raise LLMClientError/LLMError
should preserve original tracebacks by using exception chaining: in the
HTTP/URLError handlers add "raise LLMClientError(... ) from e" (they already
reference e), and change "except socket.timeout:" to "except socket.timeout as
e:" then raise the LLMClientError with "from e". Ensure this is applied to the
handlers that build LLMError (the ones referencing e.code, the
urllib.error.URLError handler, and the socket.timeout handler) so the original
exception is chained to the new LLMClientError.
In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py`:
- Around line 326-332: The mock_stream_gen generator never emits a ('tool_use',
...) tuple so the new branch that should produce a second "thinking" event is
not exercised; update mock_stream_gen to yield the same text chunks but inject a
('tool_use', <tool payload>) tuple between chunk yields (e.g., after the first
or second chunk) before yielding the final (self.mock_response, []) tuple so the
test for the SSE tool_use streaming path and the extra thinking event is
exercised; refer to mock_stream_gen and the existing yields to locate where to
insert the ('tool_use', ...) emission.
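As an illustration of the `data.get('usage')` simplification requested for the Docker provider above, a minimal sketch (the `Usage` dataclass and token-field names are stand-ins for the module's real types):

```python
from dataclasses import dataclass


@dataclass
class Usage:
    input_tokens: int
    output_tokens: int


def parse_usage(data):
    # One lookup via .get() replaces "if 'usage' in data and data['usage']"
    u = data.get('usage')
    if u:  # covers both a missing key and a falsy value
        return Usage(
            input_tokens=u.get('prompt_tokens', 0),
            output_tokens=u.get('completion_tokens', 0),
        )
    return None
```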
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 99402f45-993b-4c75-a066-600d0fd81ef5
📒 Files selected for processing (5)
- web/pgadmin/llm/providers/anthropic.py
- web/pgadmin/llm/providers/docker.py
- web/pgadmin/llm/providers/openai.py
- web/pgadmin/tools/sqleditor/__init__.py
- web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py
Force-pushed from 04b846c to 82a9bc6
Actionable comments posted: 6
♻️ Duplicate comments (1)
web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py (1)
259-261: ⚠️ Potential issue | 🟡 Minor
Keep the test extractor in sync with the endpoint.
The helper here still joins raw blocks, so the multi-block case reintroduces the double-semicolon output and the pipeline is already failing on it. Mirror the production cleanup before joining.
🧪 Suggested fix
```diff
-    sql = ';\n\n'.join(
-        block.strip() for block in sql_blocks
-    ) if sql_blocks else None
+    sql = ';\n\n'.join(
+        block.strip().rstrip(';') for block in sql_blocks
+    ) if sql_blocks else None
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py` around lines 259 - 261, The test helper constructs sql from sql_blocks but doesn't apply the production cleanup, causing double-semicolons; update the logic that builds sql (the sql variable from sql_blocks) to mirror the endpoint's cleanup: trim each block, remove any trailing semicolons/empty blocks (e.g., strip() then rstrip(';') and skip empties), then join the cleaned blocks with ';\n\n' (or call the shared cleanup helper if one exists) so the multi-block output matches production.
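As a concrete reference, the shared normalization could be sketched like this (the helper name `join_sql_blocks` is illustrative, not the endpoint's actual function):

```python
def join_sql_blocks(sql_blocks):
    # Strip whitespace and trailing semicolons per block, then join with
    # ';\n\n' so multi-block output never yields a double semicolon.
    return ';\n\n'.join(
        block.strip().rstrip(';') for block in sql_blocks
    ) if sql_blocks else None


sql = join_sql_blocks(['SELECT * FROM users;\n', '\nSELECT 1;\n'])
```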
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@web/pgadmin/llm/chat.py`:
- Around line 166-179: The current generator emits ambiguous 2-tuples for both
tool events and final completions which causes misclassification (e.g., in
nlq_chat_stream); change the event shapes so they are distinct: emit a
clearly-typed tool event like ('tool_use', tool_events_list) and a distinct
completion event like ('complete', final_text, updated_messages) (or use a
simple Event namedtuple/dataclass) wherever the generator yields tool events and
the final result (references: the generator function in web/pgadmin/llm/chat.py
that yields tool-use tuples and the nlq_chat_stream consumer); update consumers
to match the new 3-tuple completion shape and handle the explicit 'tool_use' and
'complete' tags accordingly.
In `@web/pgadmin/tools/sqleditor/__init__.py`:
- Around line 2896-2903: The call to chat_with_database_stream currently omits
the conversation history so follow-up NLQ turns lose context; update the call to
pass the request's history into the conversation_history parameter (e.g.,
conversation_history=request.history or the existing history variable in scope)
alongside user_message, sid, did, and system_prompt so chat_with_database_stream
receives prior messages for proper context.
- Around line 2924-2931: The regex used in the re.findall call that populates
sql_blocks is case-sensitive and misses fenced blocks with uppercase fence tags;
update the re.findall invocation in __init__.py (the call that assigns
sql_blocks) to add the re.IGNORECASE flag alongside re.DOTALL so the pattern
(?:sql|pgsql|postgresql)\s*\n(.*?)``` matches tags like SQL/POSTGRESQL as well,
then leave the subsequent sql construction (the sql variable) unchanged.
In `@web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx`:
- Around line 874-880: The current streaming branch inside the component uses
streamingTextRef.current and setMessages to replace the assistant message by
appending '\n\n(Generation stopped)' to the partial markdown, which can end up
inside an open fenced code block; instead preserve the partial response exactly
(do not append the stop notice to streamingTextRef.current) and call setMessages
twice (or add two entries) so you first update/commit the assistant message with
content: streamingTextRef.current and type MESSAGE_TYPES.ASSISTANT (filtering
out thinkingId and streamId as before), then append a separate assistant-status
message whose content is gettext('(Generation stopped)') (or use a distinct type
if you prefer) so the stop marker is rendered as a standalone message; apply the
same pattern to the analogous block handling stream end around
streamId/thinkingId (the other occurrence at the 905-910 region).
- Around line 552-556: The MarkdownContent is being forced into a span
(component="span") while renderMarkdownText(content) can output block-level
HTML, causing invalid DOM; change MarkdownContent in NLQChatPanel.jsx to render
as a block-level container (remove or set component to a block element) so block
nodes (paragraphs, lists, tables) are not injected into a span, and render the
streaming cursor separately (keep the Box wrapper and the cursor rendering logic
outside/adjacent to MarkdownContent) so streaming markdown displays correctly.
- Around line 1035-1043: Snapshot streamingIdRef.current into a local (e.g.,
const streamingId = streamingIdRef.current) before calling setMessages so the
functional updater filters using that captured id (alongside thinkingId) instead
of reading streamingIdRef.current later; then update setMessages to filter out
that local streamingId and add the error message (MESSAGE_TYPES.ERROR), and only
after setMessages clear streamingTextRef.current and set streamingIdRef.current
= null to keep behavior consistent with the abort/complete paths.
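The distinct event shapes proposed for `web/pgadmin/llm/chat.py` above could look like the following sketch (payloads are invented; the real generator also threads tool results back into the conversation):

```python
def chat_stream():
    # Text deltas stay plain strings; other events carry an explicit tag.
    yield "Looking at the schema... "
    yield ('tool_use', [{'name': 'list_tables'}])
    yield "Done."
    # The completion is a distinct 3-tuple, so consumers such as
    # nlq_chat_stream can never confuse it with a tool event.
    yield ('complete', 'SELECT 1',
           [{'role': 'assistant', 'content': 'SELECT 1'}])


def consume(stream):
    text, final = [], None
    for event in stream:
        if isinstance(event, str):
            text.append(event)
        elif event[0] == 'tool_use':
            pass  # a consumer would emit a "thinking" SSE event here
        elif event[0] == 'complete':
            _, final_text, history = event
            final = (final_text, history)
    return ''.join(text), final
```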
---
Duplicate comments:
In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py`:
- Around line 259-261: The test helper constructs sql from sql_blocks but
doesn't apply the production cleanup, causing double-semicolons; update the
logic that builds sql (the sql variable from sql_blocks) to mirror the
endpoint's cleanup: trim each block, remove any trailing semicolons/empty blocks
(e.g., strip() then rstrip(';') and skip empties), then join the cleaned blocks
with ';\n\n' (or call the shared cleanup helper if one exists) so the
multi-block output matches production.
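The case-insensitive fence extraction suggested for `__init__.py` above, with the pattern taken from the review text (the surrounding helper is a sketch):

```python
import re

# re.IGNORECASE makes uppercase fence tags such as ```SQL match too;
# re.DOTALL lets the capture group span multiple lines.
_SQL_FENCE = re.compile(
    r'```(?:sql|pgsql|postgresql)\s*\n(.*?)```',
    re.DOTALL | re.IGNORECASE,
)


def extract_sql_blocks(text):
    return _SQL_FENCE.findall(text)


blocks = extract_sql_blocks(
    "Intro\n```SQL\nSELECT 1;\n```\ntext\n```pgsql\nSELECT 2;\n```\n"
)
```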
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 03a8611a-98cf-454d-94d9-14ef26b9bbe7
📒 Files selected for processing (14)
- docs/en_US/release_notes_9_14.rst
- web/pgadmin/llm/chat.py
- web/pgadmin/llm/client.py
- web/pgadmin/llm/prompts/nlq.py
- web/pgadmin/llm/providers/anthropic.py
- web/pgadmin/llm/providers/docker.py
- web/pgadmin/llm/providers/ollama.py
- web/pgadmin/llm/providers/openai.py
- web/pgadmin/static/js/Theme/dark.js
- web/pgadmin/static/js/Theme/high_contrast.js
- web/pgadmin/static/js/Theme/light.js
- web/pgadmin/tools/sqleditor/__init__.py
- web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx
- web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py
✅ Files skipped from review due to trivial changes (1)
- docs/en_US/release_notes_9_14.rst
🚧 Files skipped from review as they are similar to previous changes (3)
- web/pgadmin/static/js/Theme/high_contrast.js
- web/pgadmin/static/js/Theme/light.js
- web/pgadmin/llm/client.py
Resolved review threads (3) on web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx (one marked Outdated).
Force-pushed from cbbc025 to 2307535
Actionable comments posted: 6
♻️ Duplicate comments (1)
web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py (1)
185-208: ⚠️ Potential issue | 🟠 Major
Align these fixtures with the production SQL normalizer.
`web/pgadmin/tools/sqleditor/__init__.py` now joins fenced SQL blocks with `block.strip().rstrip(';')`, but this helper still uses `block.strip()`. That means the multi-block cases here assert a different `sql` string than the endpoint emits, and the single-block expectations still keep a trailing semicolon that runtime no longer returns.
🔧 Representative fix
```diff
-    expected_sql='SELECT * FROM users;'
+    expected_sql='SELECT * FROM users'
@@
     sql = ';\n\n'.join(
-        block.strip() for block in sql_blocks
+        block.strip().rstrip(';') for block in sql_blocks
     ) if sql_blocks else None
```

Update the other single-block expectations the same way.
Also applies to: 253-261
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py` around lines 185 - 208, Update the test fixtures in test_nlq_chat.py so expected_sql matches the production normalizer (which calls block.strip().rstrip(';')): for the 'SQL Extraction - Single SQL block' and other single-block cases remove the trailing semicolon from expected_sql, and for multi-block cases like 'SQL Extraction - Multiple SQL blocks' remove trailing semicolons from each fenced block so the joined expectation is "block1_without_trailing_semicolon\n\nblock2_without_trailing_semicolon". Apply the same change to the other similar fixtures (e.g., the pgsql/postgresql cases and the duplicated fixtures mentioned) so all expected_sql values reflect block.strip().rstrip(';') normalization.
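For reference, a fixture expectation matching the production normalization could be sketched like this (test names and structure are placeholders for the real scenario table):

```python
# Hypothetical fixture entries showing expected_sql values after the
# block.strip().rstrip(';') normalization; names are placeholders.
scenarios = [
    dict(name='SQL Extraction - Single SQL block',
         blocks=['SELECT * FROM users;\n'],
         expected_sql='SELECT * FROM users'),
    dict(name='SQL Extraction - Multiple SQL blocks',
         blocks=['SELECT 1;\n', 'SELECT 2;\n'],
         expected_sql='SELECT 1;\n\nSELECT 2'),
]

for s in scenarios:
    joined = ';\n\n'.join(
        b.strip().rstrip(';') for b in s['blocks']
    ) if s['blocks'] else None
    s['actual'] = joined
```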
🧹 Nitpick comments (2)
web/pgadmin/llm/providers/anthropic.py (2)
320-324: Improve exception chaining for better debugging.
When re-raising exceptions, use `from e` to preserve the exception chain and provide better tracebacks for debugging.
♻️ Proposed fix
```diff
 except Exception as e:
     raise LLMClientError(LLMError(
-        message=f"Streaming request failed: {str(e)}",
+        message=f"Streaming request failed: {e}",
         provider=self.provider_name
-    ))
+    )) from e
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/anthropic.py` around lines 320 - 324, The except block in the streaming path currently raises a new LLMClientError/LLMError without exception chaining, losing the original traceback; update the raise in the except Exception as e block inside anthropic.py to re-raise the LLMClientError using "from e" so the original exception is preserved (i.e., raise LLMClientError(LLMError(...)) from e), referencing the existing LLMClientError, LLMError and self.provider_name symbols.
356-374: Add exception chaining for better error traceability.
Multiple exception handlers re-raise without chaining, losing the original traceback context.
♻️ Proposed fix
```diff
 raise LLMClientError(LLMError(
     message=error_msg,
     code=str(e.code),
     provider=self.provider_name,
     retryable=e.code in (429, 500, 502, 503, 504)
-))
+)) from e
 except urllib.error.URLError as e:
     raise LLMClientError(LLMError(
         message=f"Connection error: {e.reason}",
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from e
 except socket.timeout:
     raise LLMClientError(LLMError(
         message="Request timed out.",
         code='timeout',
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from None
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/anthropic.py` around lines 356 - 374, The exception handlers that raise LLMClientError currently drop the original traceback; update each except to chain the original exception using "raise ... from e" (bind socket.timeout as e in "except socket.timeout as e:") so the original exception context is preserved; specifically modify the handlers that construct LLMError and raise LLMClientError (the urllib.error.URLError block, the socket.timeout block, and any nearby blocks that use the variable e) to use exception chaining, keeping provider=self.provider_name and retryable flags unchanged.
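The chaining pattern these comments ask for can be demonstrated with minimal stand-ins for the provider's error types (the real `LLMError`/`LLMClientError` live in `web/pgadmin/llm`):

```python
class LLMError:
    def __init__(self, message, code=None, provider=None, retryable=False):
        self.message = message
        self.code = code
        self.provider = provider
        self.retryable = retryable


class LLMClientError(Exception):
    def __init__(self, error):
        super().__init__(error.message)
        self.error = error


def stream_request():
    try:
        raise ConnectionError("upstream failure")
    except ConnectionError as e:
        # "from e" attaches the original exception as __cause__,
        # preserving its traceback for debugging.
        raise LLMClientError(LLMError(
            message=f"Streaming request failed: {e}",
            provider="anthropic",
        )) from e


def timed_out_request():
    try:
        raise TimeoutError()
    except TimeoutError:
        # "from None" suppresses the context when it adds nothing.
        raise LLMClientError(LLMError(
            message="Request timed out.", code='timeout', retryable=True,
        )) from None
```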
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@web/pgadmin/llm/providers/anthropic.py`:
- Around line 336-341: The request constructor references an undefined global
API_URL causing a NameError; update the code that builds the request (the
urllib.request.Request call inside the method that sends Anthropic requests) to
use the instance attribute self._api_url (the attribute set in __init__ of the
Anthropic provider class) instead of API_URL, and scan for any other occurrences
of API_URL in this class to replace with self._api_url so the instance
configuration is used.
- Around line 330-334: The headers dict in the method that builds requests (the
block creating headers with 'Content-Type', 'x-api-key', and
'anthropic-version') always injects x-api-key which breaks custom endpoints;
update that method so it only adds the 'x-api-key' header when self._api_key is
present (truthy), mirroring the behavior in _make_request(), so custom endpoints
without auth are not sent an API key.
In `@web/pgadmin/llm/providers/ollama.py`:
- Around line 421-449: The code currently sets final metadata only when
data.get('done') is true but always yields a final LLMResponse; change this so
the final LLMResponse is only built and yielded when a terminal frame was seen
(i.e., final_data is set or data.get('done') is True). In practice, wrap the
block that computes stop_reason, model_name/input/output tokens and yields
LLMResponse (the yield of LLMResponse with content, tool_calls, stop_reason,
model, usage, raw_response) in a conditional that verifies final_data/done is
present; if no terminal done frame was received, do not emit that “successful”
final response (instead return/stop without yielding it).
- Around line 332-342: OllamaClient.__init__ currently accepts any URL scheme
which can lead to unsafe local/resource access when urllib.request.urlopen is
called; add scheme validation in OllamaClient.__init__ that parses the provided
api_url (self._api_url) and only allows safe schemes (e.g., "http" and "https")
following the same pattern used in DockerClient, and raise a clear ValueError if
the scheme is missing or not allowed; ensure this check runs before any use of
urllib.request.urlopen in methods that call the API (e.g., where url is
constructed for POST in the request flow).
In `@web/pgadmin/llm/providers/openai.py`:
- Around line 405-415: The _process_stream() function is using an undefined
API_URL and always sets an Authorization header, which causes NameError and
prevents custom endpoints without an API key from working; update
_process_stream() to use the same request URL variable the class uses elsewhere
(the instance's base URL/endpoint field used by _make_request(), e.g.,
self._base_url or the same self attribute that _make_request() composes) and
build headers the same way as _make_request() by only adding 'Authorization:
Bearer {self._api_key}' when self._api_key is present/non-empty, otherwise omit
the header so custom endpoints without keys are supported.
In `@web/pgadmin/tools/sqleditor/__init__.py`:
- Around line 2935-2972: The completion path currently emits only response_text
and a synthetic new_conversation_id without persisting the provided messages
from chat_with_database_stream, so implement persisting or round-tripping there:
when new_conversation_id is assigned (the block setting new_conversation_id)
update the server-side conversation store keyed by that ID (or create one if
missing) with the current messages list passed into this handler, or
alternatively include a serialized messages payload in the dict passed to
_nlq_sse_event under a key like 'history' so the client can round-trip it;
locate the logic around chat_with_database_stream, conversation_id, messages and
the yield _nlq_sse_event call and add the persistence or payload insertion
accordingly.
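One way to round-trip the history, sketched with standard SSE framing (the real `_nlq_sse_event` may frame its payload differently; field values are placeholders):

```python
import json


def sse_event(payload):
    # Standard SSE framing: a "data:" line terminated by a blank line.
    return f"data: {json.dumps(payload)}\n\n"


event = sse_event({
    'type': 'complete',
    'content': 'SELECT 1',
    'conversation_id': 'abc123',  # invented placeholder
    'history': [{'role': 'user', 'content': 'show me one row'}],
})
```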
---
Duplicate comments:
In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py`:
- Around line 185-208: Update the test fixtures in test_nlq_chat.py so
expected_sql matches the production normalizer (which calls
block.strip().rstrip(';')): for the 'SQL Extraction - Single SQL block' and
other single-block cases remove the trailing semicolon from expected_sql, and
for multi-block cases like 'SQL Extraction - Multiple SQL blocks' remove
trailing semicolons from each fenced block so the joined expectation is
"block1_without_trailing_semicolon\n\nblock2_without_trailing_semicolon". Apply
the same change to the other similar fixtures (e.g., the pgsql/postgresql cases
and the duplicated fixtures mentioned) so all expected_sql values reflect
block.strip().rstrip(';') normalization.
---
Nitpick comments:
In `@web/pgadmin/llm/providers/anthropic.py`:
- Around line 320-324: The except block in the streaming path currently raises a
new LLMClientError/LLMError without exception chaining, losing the original
traceback; update the raise in the except Exception as e block inside
anthropic.py to re-raise the LLMClientError using "from e" so the original
exception is preserved (i.e., raise LLMClientError(LLMError(...)) from e),
referencing the existing LLMClientError, LLMError and self.provider_name
symbols.
- Around line 356-374: The exception handlers that raise LLMClientError
currently drop the original traceback; update each except to chain the original
exception using "raise ... from e" (bind socket.timeout as e in "except
socket.timeout as e:") so the original exception context is preserved;
specifically modify the handlers that construct LLMError and raise
LLMClientError (the urllib.error.URLError block, the socket.timeout block, and
any nearby blocks that use the variable e) to use exception chaining, keeping
provider=self.provider_name and retryable flags unchanged.
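The scheme check suggested for `OllamaClient.__init__` above, as a minimal sketch assuming the same http/https allowlist as `DockerClient` (function and constant names are illustrative):

```python
from urllib.parse import urlparse

_ALLOWED_SCHEMES = ('http', 'https')


def validate_api_url(api_url):
    # Reject unexpected schemes (file://, ftp://, ...) before the URL
    # ever reaches urllib.request.urlopen.
    scheme = urlparse(api_url).scheme.lower()
    if scheme not in _ALLOWED_SCHEMES:
        label = scheme or '(missing)'
        raise ValueError(
            f"Unsupported URL scheme '{label}'; "
            "only http and https are allowed."
        )
    return api_url
```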
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8f4251b5-b9d7-4279-a5a7-eb7ffb65d1a8
📒 Files selected for processing (14)
- docs/en_US/release_notes_9_14.rst
- web/pgadmin/llm/chat.py
- web/pgadmin/llm/client.py
- web/pgadmin/llm/prompts/nlq.py
- web/pgadmin/llm/providers/anthropic.py
- web/pgadmin/llm/providers/docker.py
- web/pgadmin/llm/providers/ollama.py
- web/pgadmin/llm/providers/openai.py
- web/pgadmin/static/js/Theme/dark.js
- web/pgadmin/static/js/Theme/high_contrast.js
- web/pgadmin/static/js/Theme/light.js
- web/pgadmin/tools/sqleditor/__init__.py
- web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx
- web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py
🚧 Files skipped from review as they are similar to previous changes (4)
- web/pgadmin/llm/prompts/nlq.py
- web/pgadmin/static/js/Theme/light.js
- web/pgadmin/llm/client.py
- docs/en_US/release_notes_9_14.rst
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx (1)
799-824: ⚠️ Potential issue | 🟠 Major
Keep the timeout and abort controller alive until the SSE body is fully consumed.
These lines run as soon as headers arrive, so the 5-minute timeout no longer protects the streaming read loop and `handleStop()` loses the `AbortController` while `reader.read()` is still active. Clean them up in the terminal paths / `finally` instead.
Suggested fix
```diff
-      clearTimeout(timeoutId);
-      abortControllerRef.current = null;
-
       if (!response.ok) {
         const errorData = await response.json().catch(() => ({}));
         throw new Error(errorData.errormsg || `HTTP error! status: ${response.status}`);
       }
@@
     } catch (error) {
-      clearTimeout(timeoutId);
-      abortControllerRef.current = null;
       readerRef.current = null;
       const streamId = streamingIdRef.current;
@@
     } finally {
+      clearTimeout(timeoutId);
+      abortControllerRef.current = null;
       setIsLoading(false);
       setThinkingMessageId(null);
     }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx` around lines 799 - 824, The current code clears timeout and nulls abortControllerRef as soon as fetch returns headers, which lets handleStop lose the AbortController while the streaming reader (reader.read() loop) is still active; instead, keep controller and timeoutId alive until the SSE body is fully consumed and only call clearTimeout(timeoutId) and set abortControllerRef.current = null in the terminal paths (after the streaming read loop finishes or in a finally block) so handleStop can still abort the ongoing reader; update the logic around the fetch response handling and reader loop in NLQChatPanel.jsx to perform cleanup at the end of the stream and not immediately after fetch returns.
web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py (1)
129-139: ⚠️ Potential issue | 🟠 Major
Make the error scenarios assert the error payload unconditionally.
A non-JSON or otherwise unexpected error response still passes here because the assertions only run inside the guarded `if`. Once `self.expected_error` is true, this test should fail as soon as the endpoint stops returning the expected JSON error shape.
Suggested fix
```diff
 if self.expected_error:
-    # For error cases, we expect JSON response
-    if response.status_code == 200 and \
-            response.content_type == 'application/json':
-        data = json.loads(response.data)
-        self.assertFalse(data.get('success', True))
-        if hasattr(self, 'error_contains'):
-            self.assertIn(
-                self.error_contains,
-                data.get('errormsg', '')
-            )
+    self.assertEqual(response.mimetype, 'application/json')
+    data = json.loads(response.data)
+    self.assertFalse(data.get('success', True))
+    if hasattr(self, 'error_contains'):
+        self.assertIn(
+            self.error_contains,
+            data.get('errormsg', '')
+        )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py` around lines 129 - 139, When self.expected_error is true, don't conditionally skip assertions based on response.content_type/status; instead assert that the response is JSON and has the expected error shape unconditionally. In the test block guarded by self.expected_error in test_nlq_chat.py, replace the inner "if response.status_code == 200 and response.content_type == 'application/json':" guard with explicit assertions (e.g., assert response.content_type == 'application/json' and assert response.status_code == 200 or another expected error code), then json.loads(response.data) and assert data.get('success') is False and, if hasattr(self, 'error_contains'), assert self.error_contains is in data.get('errormsg', '') so any non-JSON or malformed error response will cause the test to fail.
♻️ Duplicate comments (6)
web/pgadmin/llm/providers/anthropic.py (1)
330-341: ⚠️ Potential issue | 🔴 Critical
Use `self._api_url` and mirror `_make_request()`'s auth handling here.
`API_URL` is undefined, so every Anthropic streaming request fails before it is sent. This branch also always adds `x-api-key`, which breaks the custom endpoints that the non-streaming path already supports without auth.
🐛 Proposed fix
headers = { 'Content-Type': 'application/json', - 'x-api-key': self._api_key, 'anthropic-version': API_VERSION } + if self._api_key: + headers['x-api-key'] = self._api_key request = urllib.request.Request( - API_URL, + self._api_url, data=json.dumps(payload).encode('utf-8'), headers=headers, method='POST' )Run this to compare the sync and streaming request builders:
```shell
#!/bin/bash
sed -n '200,215p;330,341p' web/pgadmin/llm/providers/anthropic.py
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/anthropic.py` around lines 330 - 341, The streaming request builder is using an undefined API_URL and always injects 'x-api-key', causing failures and breaking custom endpoints; update the block constructing the streaming request (the Request creation near the top of the streaming path) to use self._api_url instead of API_URL and mirror the authentication/header logic used in _make_request(): build headers the same way (only include 'x-api-key' when appropriate for default Anthropic endpoints and preserve custom endpoint behavior), and ensure 'anthropic-version' and 'Content-Type' are set consistently with _make_request() so streaming and non-streaming requests behave identically.

web/pgadmin/tools/sqleditor/__init__.py (1)
2935-2938: ⚠️ Potential issue | 🟠 Major

Don't drop the updated hidden history on completion.

`chat_with_database_stream()` returns the updated `messages` in its `'complete'` tuple, but this handler only keeps `item[1]` and then emits a synthetic `conversation_id`. Without persisting or round-tripping `item[2]`, follow-up turns lose tool calls/results and the returned ID cannot actually resume the full conversation state.

Also applies to: 2960-2971
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/__init__.py` around lines 2935 - 2938, The completion branch currently extracts only response_text from the ('complete', response_text, messages) tuple and emits a synthetic conversation_id, which drops the updated hidden history (item[2]) and prevents resuming tool state; update the handler in chat_with_database_stream's consumer to capture item[2] (messages) and persist or round-trip it with the emitted completion (e.g., include messages in the persisted conversation record or return it alongside conversation_id), ensuring the updated messages/hidden history are stored or returned so follow-up turns can resume tool calls/results instead of losing state.

web/pgadmin/llm/providers/openai.py (1)
405-415: ⚠️ Potential issue | 🔴 Critical

Use `self._api_url` and keep auth handling consistent with `_make_request()`.

`API_URL` is undefined, so every OpenAI streaming request fails immediately. This branch also always sends `Authorization`, which breaks the custom endpoints that `chat()` already supports without an API key.

🐛 Proposed fix

```diff
         headers = {
             'Content-Type': 'application/json',
-            'Authorization': f'Bearer {self._api_key}'
         }
+        if self._api_key:
+            headers['Authorization'] = f'Bearer {self._api_key}'
         request = urllib.request.Request(
-            API_URL,
+            self._api_url,
             data=json.dumps(payload).encode('utf-8'),
             headers=headers,
             method='POST'
         )
```

Run this to compare the sync and streaming request builders:
```shell
#!/bin/bash
sed -n '208,222p;405,415p' web/pgadmin/llm/providers/openai.py
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/openai.py` around lines 405 - 415, The streaming request is using an undefined API_URL and always adds an Authorization header, breaking custom endpoints; update the Request builder to use self._api_url (not API_URL) and mirror the auth handling used in _make_request(): only include the Authorization header when self._api_key is present (or follow whatever header logic _make_request() uses), and ensure Content-Type remains application/json; also reuse any additional headers or auth-related logic from _make_request() so chat() and the streaming path behave consistently.

web/pgadmin/llm/providers/ollama.py (2)
39-48: ⚠️ Potential issue | 🟠 Major

Validate `api_url` before storing it.

Every request path later feeds `self._api_url` into `urllib.request.urlopen()`, so `file:` and other unexpected schemes are still accepted here. Restrict this to http/https up front.

Suggested fix

```diff
 import json
 import urllib.request
 import urllib.error
+from urllib.parse import urlparse
 from collections.abc import Generator
 from typing import Optional, Union
 import uuid
@@
         """
         self._api_url = api_url.rstrip('/')
+        parsed = urlparse(self._api_url)
+        if parsed.scheme not in ('http', 'https'):
+            raise ValueError(
+                'Invalid Ollama API URL scheme. Only http/https are allowed.'
+            )
         self._model = model or DEFAULT_MODEL
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/ollama.py` around lines 39 - 48, In Ollama client __init__, validate the provided api_url before storing: parse api_url (e.g., using urllib.parse.urlparse), ensure scheme is either "http" or "https" and raise a ValueError if not, then normalize (rstrip('/')) and assign to self._api_url; leave model handling (self._model = model or DEFAULT_MODEL) unchanged.
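The scheme check suggested above is small enough to exercise standalone. A minimal sketch, standard library only; the helper name `validate_api_url` is illustrative, not the actual pgAdmin code:

```python
from urllib.parse import urlparse

def validate_api_url(api_url):
    # Hypothetical helper mirroring the suggested __init__ guard:
    # normalize the URL, then reject any scheme other than http/https
    # before it can ever reach urllib.request.urlopen().
    normalized = api_url.rstrip('/')
    if urlparse(normalized).scheme not in ('http', 'https'):
        raise ValueError(
            'Invalid Ollama API URL scheme. Only http/https are allowed.'
        )
    return normalized
```

With this guard, a `file:///etc/passwd` URL fails at client construction time rather than at request time.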
421-449: ⚠️ Potential issue | 🟠 Major

Only emit the final `LLMResponse` after a terminal `done` frame.

If the NDJSON stream drops early, `final_data` stays `None` but this still yields a "successful" response with truncated content. Fail fast instead of synthesizing completion.

Suggested fix

```diff
         if data.get('done'):
             final_data = data
             done_reason = data.get('done_reason', '')
             model_name = data.get('model', self._model)
             input_tokens = data.get('prompt_eval_count', 0)
             output_tokens = data.get('eval_count', 0)

+    if final_data is None:
+        raise LLMClientError(LLMError(
+            message='Ollama stream ended before a terminal done frame',
+            provider=self.provider_name,
+            retryable=True
+        ))
+
     # Build final response
     if tool_calls:
         stop_reason = StopReason.TOOL_USE
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/ollama.py` around lines 421 - 449, The code currently yields an LLMResponse even when final_data is None (stream dropped); change the logic in the generator so you only construct and yield the LLMResponse when a terminal frame is observed (i.e., when data.get('done') sets final_data and done_reason); if final_data is still None after the read loop, raise an error or return without emitting a synthesized completion instead of yielding using default values—update the block that builds stop_reason and yields LLMResponse to first check final_data (and/or done_reason) and fail fast if the stream ended prematurely, referencing final_data, data.get('done'), done_reason, content_parts, tool_calls and LLMResponse.

web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py (1)
259-261: ⚠️ Potential issue | 🟠 Major

Keep the test extractor aligned with the endpoint's semicolon normalization.

The backend strips trailing semicolons before joining fenced SQL blocks, but this helper still joins `block.strip()` verbatim. The multi-block case will produce `;;` here and drift from the real extraction path.

Suggested fix

```diff
-        sql = ';\n\n'.join(
-            block.strip() for block in sql_blocks
-        ) if sql_blocks else None
+        sql = ';\n\n'.join(
-            block.strip().rstrip(';') for block in sql_blocks
+        ) if sql_blocks else None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py` around lines 259 - 261, The test builds the joined SQL with "';\\n\\n'.join(block.strip() for block in sql_blocks)" but the backend first strips trailing semicolons from each fenced block, so update the join to remove trailing semicolons from each block before trimming and joining (i.e., for each item in sql_blocks, remove trailing ';' then strip) so the test extractor matches the endpoint's semicolon normalization when producing the "sql" value from sql_blocks.
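The normalization the backend and the test helper should share can be sketched as a single function — a hypothetical stand-in for the real extraction code, assuming the `';\n\n'` join described above:

```python
def join_sql_blocks(sql_blocks):
    # Strip whitespace and trailing semicolons from each fenced block
    # before joining, so multi-block responses never produce ';;'.
    if not sql_blocks:
        return None
    return ';\n\n'.join(
        block.strip().rstrip(';') for block in sql_blocks
    )
```

Note that the join re-adds a single semicolon between blocks, so only the final statement ends without one.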
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@web/pgadmin/llm/chat.py`:
- Around line 157-166: The declared return type of chat_with_database_stream is
inaccurate: update its annotation to reflect the three yielded event shapes —
text chunks (str), completion events ('complete', response.content: str,
messages: list[Message]), and tool-use events ('tool_use', list[str]) — e.g.
Generator[Union[str, Tuple[str, str, List[Message]], Tuple[str, List[str]]],
None, None]; import or reference Tuple and List from typing as needed and ensure
the tuple element types match the actual yields in chat_with_database_stream.
In `@web/pgadmin/llm/providers/openai.py`:
- Around line 378-385: In chat_stream(), the payload always sets 'temperature'
which diverges from chat()'s compatibility behavior; update chat_stream() (in
the OpenAI provider class, method chat_stream) to only add the 'temperature' key
to the payload when temperature is non-default (e.g., temperature != 0.0 or when
explicitly provided), mirroring chat()'s conditional inclusion so endpoints that
don't accept temperature won't receive it.
In `@web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx`:
- Around line 997-1012: The handler currently places raw JSON/explanation into
content which prevents ChatMessage from falling back to message.sql; update the
'sql'/'complete' branch so when there is event.sql but no fenced code blocks you
extract an explanation from event.content (attempt JSON.parse and use
parsed.explanation, or use raw event.content on parse failure), then set content
= '' and include explanation on the MESSAGE_TYPES.SQL object (keep sql:
event.sql) so ChatMessage can render the legacy SQL preview/actions; reference
the event.sql/content variables and the MESSAGE_TYPES.SQL message object to
implement this change.
---
Outside diff comments:
In `@web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx`:
- Around line 799-824: The current code clears timeout and nulls
abortControllerRef as soon as fetch returns headers, which lets handleStop lose
the AbortController while the streaming reader (reader.read() loop) is still
active; instead, keep controller and timeoutId alive until the SSE body is fully
consumed and only call clearTimeout(timeoutId) and set
abortControllerRef.current = null in the terminal paths (after the streaming
read loop finishes or in a finally block) so handleStop can still abort the
ongoing reader; update the logic around the fetch response handling and reader
loop in NLQChatPanel.jsx to perform cleanup at the end of the stream and not
immediately after fetch returns.
In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py`:
- Around line 129-139: When self.expected_error is true, don't conditionally
skip assertions based on response.content_type/status; instead assert that the
response is JSON and has the expected error shape unconditionally. In the test
block guarded by self.expected_error in test_nlq_chat.py, replace the inner "if
response.status_code == 200 and response.content_type == 'application/json':"
guard with explicit assertions (e.g., assert response.content_type ==
'application/json' and assert response.status_code == 200 or another expected
error code), then json.loads(response.data) and assert data.get('success') is
False and, if hasattr(self, 'error_contains'), assert self.error_contains is in
data.get('errormsg', '') so any non-JSON or malformed error response will cause
the test to fail.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: cb85a712-2d91-4868-96c6-185e0724dba0
📒 Files selected for processing (14)
- docs/en_US/release_notes_9_14.rst
- web/pgadmin/llm/chat.py
- web/pgadmin/llm/client.py
- web/pgadmin/llm/prompts/nlq.py
- web/pgadmin/llm/providers/anthropic.py
- web/pgadmin/llm/providers/docker.py
- web/pgadmin/llm/providers/ollama.py
- web/pgadmin/llm/providers/openai.py
- web/pgadmin/static/js/Theme/dark.js
- web/pgadmin/static/js/Theme/high_contrast.js
- web/pgadmin/static/js/Theme/light.js
- web/pgadmin/tools/sqleditor/__init__.py
- web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx
- web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py
🚧 Files skipped from review as they are similar to previous changes (3)
- web/pgadmin/static/js/Theme/light.js
- web/pgadmin/static/js/Theme/dark.js
- web/pgadmin/static/js/Theme/high_contrast.js
…y in the AI Assistant. Fixes pgadmin-org#9734
- Anthropic: preserve separators between text blocks in streaming to match _parse_response() behavior.
- Docker: validate that the API URL points to a loopback address to constrain the request surface.
- Docker/OpenAI: raise LLMClientError on empty streams instead of yielding blank LLMResponse objects, matching non-streaming behavior.
- SQL extraction: strip trailing semicolons before joining blocks to avoid double semicolons in output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
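The loopback restriction described in the second bullet can be sketched like this; `is_loopback_url` is a hypothetical helper, not the actual Docker provider code:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_loopback_url(api_url):
    # Resolve the hostname and accept only loopback addresses, so the
    # provider cannot be pointed at arbitrary network targets.
    host = urlparse(api_url).hostname or ''
    try:
        resolved = socket.gethostbyname(host)
        return ipaddress.ip_address(resolved).is_loopback
    except (socket.gaierror, ValueError):
        return False
```

Because the check resolves first, a hostname that maps to 127.0.0.1 also passes; a stricter variant could pin the resolved address for the actual request.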
…ing.
- Use distinct 3-tuple ('complete', text, messages) for completion events
to avoid ambiguity with ('tool_use', [...]) 2-tuples in chat streaming.
- Pass conversation history from request into chat_with_database_stream()
so follow-up NLQ turns retain context.
- Add re.IGNORECASE to SQL fence regex for case-insensitive matching.
- Render MarkdownContent as block element instead of span to avoid
invalid DOM when response contains paragraphs, lists, or tables.
- Keep stop notice as a separate message instead of appending to partial
markdown, preventing it from being swallowed by open code fences.
- Snapshot streamingIdRef before setMessages in error handler to avoid
race condition where ref is cleared before React executes the updater.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
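A consumer that relies on the distinct tuple shapes described above might look like this; the event names mirror the SSE types in this PR, but the dispatch function itself is hypothetical:

```python
def dispatch_stream_event(item):
    # Strings are incremental text; 3-tuples are completion events;
    # 2-tuples are tool-use notifications. Kind plus tuple length now
    # unambiguously distinguishes 'complete' from 'tool_use'.
    if isinstance(item, str):
        return ('text_delta', item)
    if item[0] == 'complete' and len(item) == 3:
        _, text, messages = item
        return ('complete', text, messages)
    if item[0] == 'tool_use' and len(item) == 2:
        return ('thinking', item[1])
    raise ValueError(f'unknown stream event: {item!r}')
```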
- Fix critical NameError: use self._api_url instead of undefined API_URL in anthropic and openai streaming _process_stream() methods.
- Match sync path auth handling: conditionally set API key headers in streaming paths for both anthropic and openai providers.
- Remove unconditional temperature from openai streaming payload to match sync path compatibility approach.
- Add URL scheme validation to OllamaClient.__init__ to prevent unsafe local/resource access via non-http schemes.
- Guard ollama streaming finalizer: raise error when stream drops without a done frame and no content was received.
- Update chat.py type hint and docstring for 3-tuple completion event.
- Serialize and return filtered conversation history in the complete SSE event so the client can round-trip it on follow-up turns.
- Store and send conversation history from NLQChatPanel, clear on conversation reset.
- Fix JSON-fallback SQL render path: clear content when SQL was extracted without fenced blocks so ChatMessage uses sql-only renderer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adding block scoping to the error case introduced an unmatched brace that prevented the switch statement from closing properly, causing an eslint parse error.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace compaction module imports with inline history deserialization
and filtering since compaction.py is on a different branch.
- Add rstrip(';') to SQL extraction test to match production code,
fixing double-semicolon assertion failure.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The rstrip(';') applied to each block before joining means single
blocks and the last block in multi-block joins no longer have
trailing semicolons. Update expected values to match.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
c880445 to 933a69e
Truncated content from a dropped connection should not be treated as a complete response, even if partial text was streamed. Always raise when final_data is None, matching CodeRabbit's recommendation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
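The rule this commit enforces — never synthesize completion from a dropped NDJSON stream — can be sketched with a minimal parser. The field names follow Ollama's documented NDJSON chunks; the function itself is illustrative:

```python
import json

def consume_ndjson_stream(lines):
    # Accumulate 'response' text chunks and remember the terminal frame;
    # a stream that ends without {'done': true} is treated as an error
    # even if partial text was received.
    content, final = [], None
    for line in lines:
        data = json.loads(line)
        chunk = data.get('response', '')
        if chunk:
            content.append(chunk)
        if data.get('done'):
            final = data
    if final is None:
        raise RuntimeError('stream ended before a terminal done frame')
    return ''.join(content), final.get('done_reason', '')
```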
Fix #9734
Summary by CodeRabbit
New Features
Bug Fixes
Style
Documentation