Fix an issue where LLM responses are not streamed or rendered properly in the AI Assistant. Fixes #9734, #9735
Walkthrough

Adds end-to-end LLM response streaming: provider/client streaming APIs, a streaming chat pipeline in the NLQ backend, SSE incremental events, frontend incremental Markdown/code rendering for streaming responses, tests for extraction/streaming, theme hyperlink tokens, and a release-note entry for the issue.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant NLQ_UI as NLQ Chat UI
    participant Backend as NLQ Backend
    participant ChatPipe as Chat Pipeline
    participant Provider as LLM Provider Client
    participant LLM as LLM Service
    User->>NLQ_UI: Submit question
    NLQ_UI->>Backend: POST user message
    Backend->>ChatPipe: chat_with_database_stream(...)
    ChatPipe->>Provider: chat_stream(messages, tools, system_prompt)
    Provider->>LLM: HTTP streaming request
    LLM-->>Provider: Stream chunks (SSE/NDJSON)
    loop per chunk
        alt text chunk
            Provider-->>ChatPipe: yield "text" (string)
            ChatPipe-->>Backend: emit SSE {type: "text_delta", content}
            Backend-->>NLQ_UI: SSE text_delta
            NLQ_UI->>NLQ_UI: parseMarkdownSegments + render
        else tool call
            Provider-->>ChatPipe: yield ('tool_use', ...)
            ChatPipe-->>Backend: emit SSE {type: "thinking"}
            Backend-->>NLQ_UI: SSE thinking
        end
    end
    LLM-->>Provider: stream end + final response
    Provider-->>ChatPipe: yield final LLMResponse
    ChatPipe->>ChatPipe: extract SQL from fenced code blocks (or JSON fallback)
    ChatPipe-->>Backend: final tuple (response_text, history)
    Backend-->>NLQ_UI: SSE {type: "complete", content, sql}
    NLQ_UI->>User: render final content + SQL
```
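As a reading aid, the per-chunk loop in the diagram can be sketched as a small translator from pipeline yields to SSE payloads. The function name and the exact yield shapes here are assumptions derived from the diagram, not the actual pgAdmin code:

```python
def sse_events(stream):
    """Map chat-pipeline yields onto the SSE event types in the diagram."""
    for item in stream:
        if isinstance(item, str):
            # incremental text delta from the provider
            yield {'type': 'text_delta', 'content': item}
        elif item and item[0] == 'tool_use':
            # a tool call is in flight; the UI shows a "thinking" state
            yield {'type': 'thinking'}
        else:
            # final (response_text, history) tuple ends the stream
            response_text, _history = item
            yield {'type': 'complete', 'content': response_text}
```

Each dict would then be serialized as one `data:` line of the SSE response.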
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 6
🧹 Nitpick comments (3)
web/pgadmin/llm/providers/ollama.py (2)
321-329: Preserve exception context with `raise ... from e`.

The exception re-raise loses the original traceback. Use `from e` (or `from None` if you intentionally want to suppress the original) to maintain the exception chain for easier debugging.

♻️ Proposed fix
```diff
 try:
     yield from self._process_stream(payload)
 except LLMClientError:
     raise
 except Exception as e:
     raise LLMClientError(LLMError(
-        message=f"Streaming request failed: {str(e)}",
+        message=f"Streaming request failed: {e!s}",
         provider=self.provider_name
-    ))
+    )) from e
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/ollama.py` around lines 321 - 329, The exception handler in the streaming path currently raises a new LLMClientError losing the original traceback; update the except Exception as e block to re-raise the constructed LLMClientError using exception chaining (raise ... from e) so the original exception context is preserved; apply this change where the yield from self._process_stream(payload) is wrapped and the LLMClientError(LLMError(..., provider=self.provider_name)) is created so the new raise uses "from e" to keep the original traceback.
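For illustration, here is a minimal standalone example of the `raise ... from e` pattern this comment recommends; the names are invented and not from the pgAdmin codebase:

```python
def stream_request():
    # simulate a low-level failure wrapped in a higher-level error
    try:
        int("not-a-number")
    except ValueError as e:
        # "from e" stores the original error on __cause__, so the
        # full traceback chain survives the re-raise
        raise RuntimeError("Streaming request failed") from e
```

With chaining, the traceback prints both errors joined by "The above exception was the direct cause of the following exception", instead of only the wrapper.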
346-364: Preserve exception context in `_process_stream` error handlers.

Both HTTPError and URLError handlers lose the original exception context. Add `from e` to maintain the traceback chain.

♻️ Proposed fix
```diff
 except urllib.error.HTTPError as e:
     error_body = e.read().decode('utf-8')
     try:
         error_data = json.loads(error_body)
         error_msg = error_data.get('error', str(e))
     except json.JSONDecodeError:
         error_msg = error_body or str(e)
     raise LLMClientError(LLMError(
         message=error_msg,
         code=str(e.code),
         provider=self.provider_name,
         retryable=e.code in (429, 500, 502, 503, 504)
-    ))
+    )) from e
 except urllib.error.URLError as e:
     raise LLMClientError(LLMError(
         message=f"Cannot connect to Ollama: {e.reason}",
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from e
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/ollama.py` around lines 346 - 364, In _process_stream, the except handlers for urllib.error.HTTPError and urllib.error.URLError (the blocks that build and raise LLMClientError wrapping LLMError) are losing the original exception context; update both raises to use exception chaining (raise LLMClientError(...) from e) so the original traceback is preserved — specifically modify the HTTPError handler that parses error_body/error_data and raises LLMClientError(LLMError(...)) and the URLError handler that raises LLMClientError(LLMError(...)) to append "from e".

web/pgadmin/llm/chat.py (1)
157-166: Generator return type annotation could be more precise.

The type annotation `Generator[Union[str, tuple[str, list[Message]]], None, None]` doesn't capture the `('tool_use', list[str])` tuple variant that's yielded at line 229. Consider using a more explicit union or documenting the intermediate tuple type.

💡 Suggested type improvement
```diff
+# Type alias for streaming yields
+StreamYield = Union[str, tuple[str, list[str]], tuple[str, list[Message]]]
+
 def chat_with_database_stream(
     user_message: str,
     sid: int,
     did: int,
     conversation_history: Optional[list[Message]] = None,
     system_prompt: Optional[str] = None,
     max_tool_iterations: Optional[int] = None,
     provider: Optional[str] = None,
     model: Optional[str] = None
-) -> Generator[Union[str, tuple[str, list[Message]]], None, None]:
+) -> Generator[StreamYield, None, None]:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/chat.py` around lines 157 - 166, The Generator return type for chat_with_database_stream is too broad and doesn't include the ('tool_use', list[str]) intermediate yield; update the annotation to explicitly enumerate all yielded variants (e.g. include Tuple[Literal['tool_use'], List[str]] and the existing Tuple[str, List[Message]] plus str) using typing.Tuple, typing.List and typing.Literal (or a small Union alias) so the signature reads something like Generator[Union[str, Tuple[Literal['tool_use'], List[str]], Tuple[str, List[Message]]], None, None]; change only the annotation on chat_with_database_stream and import any required typing names.
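A self-contained sketch of the `Literal`-based annotation the prompt suggests, with `Message` replaced by plain `dict` so the snippet runs on its own (Python 3.9+ for builtin generics):

```python
from typing import Generator, Literal, Union

# Hypothetical alias mirroring the suggestion above
StreamYield = Union[
    str,                                    # incremental text delta
    tuple[Literal['tool_use'], list[str]],  # tool-call notification
    tuple[str, list[dict]],                 # final (response_text, history)
]

def stream() -> Generator[StreamYield, None, None]:
    yield "partial text"
    yield ('tool_use', ['get_schema'])
    yield ("final", [])
```

`Literal['tool_use']` pins the tag of the intermediate tuple, so type checkers can narrow on `item[0] == 'tool_use'`.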
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@web/pgadmin/llm/providers/anthropic.py`:
- Around line 422-457: The streaming handler inside _parse_response currently
concatenates every text_delta which can collapse boundaries between content
blocks; modify the event handling around event_type == 'content_block_start' /
'content_block_delta' / 'content_block_stop' to preserve block separators by
tracking when a new content block begins (use current_tool_block or a new flag
like in_block) and, if content_parts already contains data, emit a separator
(e.g., '\n\n' or the same separator used when the top-level response joins
blocks) before yielding subsequent text deltas; ensure this logic integrates
with the existing variables content_parts, tool_input_json and yield text so the
final LLMResponse reconstruction retains original block boundaries.
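The separator-tracking idea described in that prompt could be sketched as follows; the event names follow Anthropic's streaming vocabulary, but the function itself is hypothetical:

```python
def stream_text(events):
    """Yield text deltas, restoring a blank-line boundary between blocks.

    `events` is a sequence of (event_type, data) pairs.
    """
    emitted_any = False
    for etype, data in events:
        if etype == 'content_block_start' and emitted_any:
            # a previous block already produced text: keep the boundary
            yield '\n\n'
        elif etype == 'content_block_delta':
            yield data
            emitted_any = True
```

Joining the yields then reconstructs the same text a non-streaming response would produce when blocks are joined with `'\n\n'`.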
In `@web/pgadmin/llm/providers/docker.py`:
- Around line 535-566: The streaming finalizer currently always yields an
LLMResponse (the block constructing content, tool_calls, stop_reason and
yielding LLMResponse), which can convert stream failures into empty assistant
messages; change it to mirror _parse_response() safeguards by: after building
content, tool_calls (using the same json.loads with JSONDecodeError fallback as
used for ToolCall construction) and mapping finish_reason via stop_reason_map,
detect error/empty-stream conditions (no content and no tool_calls OR
finish_reason indicates an error/blank result) and raise LLMClientError instead
of yielding an empty LLMResponse; otherwise yield the LLMResponse as before,
keeping the same fields (content, tool_calls, stop_reason, model, usage).
- Around line 411-423: The code currently uses self._api_url directly for
server-side requests; add a shared helper (e.g., validate_loopback_api_url(url))
and call it from both the sync and streaming request paths before constructing
the urllib Request. The helper should parse the URL (urllib.parse.urlparse),
ensure scheme is 'http' or 'https', and ensure the netloc hostname is loopback
(127.0.0.1, ::1, or 'localhost' and their bracketed IPv6 forms) — otherwise
raise ValueError or return an error so the request is rejected; update the
places that use self._api_url (the streaming path and the sync path that builds
url = f'{self._api_url}/engines/v1/chat/completions') to call this helper and
fail-fast if validation fails.
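The `validate_loopback_api_url` helper described above could be sketched like this; the function name comes from the comment, and the accepted host list is an assumption:

```python
from urllib.parse import urlparse

def validate_loopback_api_url(url):
    """Reject API URLs that do not point at the local machine."""
    parsed = urlparse(url)
    if parsed.scheme not in ('http', 'https'):
        raise ValueError(f"Unsupported scheme: {parsed.scheme!r}")
    # urlparse strips brackets from IPv6 hosts, so '[::1]' compares as '::1'
    if parsed.hostname not in ('127.0.0.1', '::1', 'localhost'):
        raise ValueError(f"Non-loopback host rejected: {parsed.hostname!r}")
    return url
```

Calling it at the top of both the sync and streaming request paths gives the fail-fast behavior the comment asks for.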
In `@web/pgadmin/llm/providers/openai.py`:
- Around line 524-556: The streaming finalizer currently always yields an
LLMResponse even if the stream produced no usable output; mirror the safeguards
in _parse_response() by validating content_parts, tool_calls_data and
finish_reason before yielding: after building content = ''.join(content_parts)
and tool_calls from tool_calls_data, if both content is empty and tool_calls is
empty or the finish_reason indicates an error, raise the same exception
type/logic used by _parse_response() (instead of yielding) so callers receive an
error; use the same variables (content_parts, tool_calls_data, finish_reason,
model_name, usage) and only yield LLMResponse when those checks pass.
In `@web/pgadmin/tools/sqleditor/__init__.py`:
- Around line 2929-2931: The current join that builds the variable sql from
sql_blocks uses "';\n\n'.join(block.strip() for block in sql_blocks)" which can
produce double-semicolons if blocks already end with ';'; update the logic that
constructs sql to first strip trailing semicolons and whitespace from each block
(e.g. normalize each block with something like
block.rstrip().rstrip(';').strip()) before joining, then join with a single
';\n\n' to ensure only a single semicolon separates blocks; modify the
assignment that computes sql (the sql and sql_blocks variables) accordingly.
In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py`:
- Around line 193-201: The SQL extraction is joining blocks that already include
trailing semicolons, causing duplicated semicolons; update the join that assigns
to the variable named sql (the expression iterating over sql_blocks) to remove
trailing semicolons from each block before joining (e.g., call strip() then
rstrip(';') on each block) so the final join uses cleaned blocks separated by
';\n\n'.
---
Nitpick comments:
In `@web/pgadmin/llm/chat.py`:
- Around line 157-166: The Generator return type for chat_with_database_stream
is too broad and doesn't include the ('tool_use', list[str]) intermediate yield;
update the annotation to explicitly enumerate all yielded variants (e.g. include
Tuple[Literal['tool_use'], List[str]] and the existing Tuple[str, List[Message]]
plus str) using typing.Tuple, typing.List and typing.Literal (or a small Union
alias) so the signature reads something like Generator[Union[str,
Tuple[Literal['tool_use'], List[str]], Tuple[str, List[Message]]], None, None];
change only the annotation on chat_with_database_stream and import any required
typing names.
In `@web/pgadmin/llm/providers/ollama.py`:
- Around line 321-329: The exception handler in the streaming path currently
raises a new LLMClientError losing the original traceback; update the except
Exception as e block to re-raise the constructed LLMClientError using exception
chaining (raise ... from e) so the original exception context is preserved;
apply this change where the yield from self._process_stream(payload) is wrapped
and the LLMClientError(LLMError(..., provider=self.provider_name)) is created so
the new raise uses "from e" to keep the original traceback.
- Around line 346-364: In _process_stream, the except handlers for
urllib.error.HTTPError and urllib.error.URLError (the blocks that build and
raise LLMClientError wrapping LLMError) are losing the original exception
context; update both raises to use exception chaining (raise LLMClientError(...)
from e) so the original traceback is preserved — specifically modify the
HTTPError handler that parses error_body/error_data and raises
LLMClientError(LLMError(...)) and the URLError handler that raises
LLMClientError(LLMError(...)) to append "from e".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 45878a7f-70de-4cc0-82bd-4bb0c8b48eb3
📒 Files selected for processing (14)
docs/en_US/release_notes_9_14.rst
web/pgadmin/llm/chat.py
web/pgadmin/llm/client.py
web/pgadmin/llm/prompts/nlq.py
web/pgadmin/llm/providers/anthropic.py
web/pgadmin/llm/providers/docker.py
web/pgadmin/llm/providers/ollama.py
web/pgadmin/llm/providers/openai.py
web/pgadmin/static/js/Theme/dark.js
web/pgadmin/static/js/Theme/high_contrast.js
web/pgadmin/static/js/Theme/light.js
web/pgadmin/tools/sqleditor/__init__.py
web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx
web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
web/pgadmin/tools/sqleditor/__init__.py (1)
2875-2877: ⚠️ Potential issue | 🟠 Major

`conversation_id` is stateless right now.

`chat_with_database_stream()` returns `(response_text, messages)`, but this handler drops `messages` and only echoes/generates an ID. Follow-up turns therefore cannot replay prior tool calls/results, so multi-turn NLQ chats will silently lose context. Persist the returned history by `conversation_id`, or return it to the client and feed it back into `conversation_history` on the next request.

Also applies to: 2919-2954
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/__init__.py` around lines 2875 - 2877, The handler currently discards the messages returned by chat_with_database_stream() and only emits a generated conversation_id, losing tool-call history; update the request flow in the sqleditor handler to persist or round-trip the returned conversation history: capture the (response_text, messages) tuple from chat_with_database_stream(), then either (a) store messages keyed by conversation_id in a persistent/in-memory store (so future requests can load conversation_history by conversation_id) or (b) include the messages in the JSON response to the client and accept conversation_history on the next request; apply the same change to the identical logic block referenced around lines 2919-2954 so conversation_history is preserved across turns (use the symbols conversation_id, conversation_history, and chat_with_database_stream to locate and modify the code).
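Option (a) from the prompt above could be prototyped with an in-memory store like the following; a real fix would need per-user scoping and expiry, and none of these names exist in pgAdmin today:

```python
import uuid

_conversations = {}  # conversation_id -> prior message history

def save_history(messages, conversation_id=None):
    """Persist the history returned by the chat pipeline."""
    cid = conversation_id or str(uuid.uuid4())
    _conversations[cid] = messages
    return cid

def load_history(conversation_id):
    """Replay prior turns on the next request, or start fresh."""
    return _conversations.get(conversation_id, [])
```

The handler would call `load_history()` before invoking the stream and `save_history()` with the final `(response_text, messages)` tuple afterwards.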
♻️ Duplicate comments (1)
web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py (1)
185-242: ⚠️ Potential issue | 🟠 Major

Mirror the production SQL normalization in this test.

This helper still uses `block.strip()`, while the endpoint now uses `block.strip().rstrip(';')`, which is why CI is hitting `SELECT * FROM users;;`. Update the join here to match production and drop the trailing-semicolon expectations for the single/final block cases as well.

🔧 Proposed fix
````diff
 ('SQL Extraction - Single SQL block', dict(
     response_text=(
         'Here is the query:\n\n'
         '```sql\nSELECT * FROM users;\n```\n\n'
         'This returns all users.'
     ),
-    expected_sql='SELECT * FROM users;'
+    expected_sql='SELECT * FROM users'
 )),
@@
 ('SQL Extraction - pgsql language tag', dict(
     response_text='```pgsql\nSELECT 1;\n```',
-    expected_sql='SELECT 1;'
+    expected_sql='SELECT 1'
 )),
 ('SQL Extraction - postgresql language tag', dict(
     response_text='```postgresql\nSELECT 1;\n```',
-    expected_sql='SELECT 1;'
+    expected_sql='SELECT 1'
 )),
@@
     expected_sql=(
         'SELECT u.name, o.total\n'
         'FROM users u\n'
         'JOIN orders o ON u.id = o.user_id\n'
-        'WHERE o.total > 100;'
+        'WHERE o.total > 100'
     )
 )),
@@
 sql = ';\n\n'.join(
-    block.strip() for block in sql_blocks
+    block.strip().rstrip(';') for block in sql_blocks
 ) if sql_blocks else None
````

Also applies to: 253-261
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py` around lines 185 - 242, The test expectations and the SQL-joining helper in test_nlq_chat.py must mirror production normalization: change the join that builds sql from sql_blocks to strip trailing semicolons (use block.strip().rstrip(';') when joining into the variable sql) and update expected_sql strings for the cases 'SQL Extraction - Single SQL block', 'SQL Extraction - pgsql language tag', 'SQL Extraction - postgresql language tag' and the multiline case so they no longer include a trailing semicolon (e.g. 'SELECT * FROM users' instead of 'SELECT * FROM users;'); apply the same change to the other helper occurrence referenced around lines 253-261.
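Putting the fix together, the extraction-plus-normalization behavior the review asks for could look like this sketch; the regex and the function name are illustrative, built from the language tags the tests exercise:

```python
import re

FENCE = '`' * 3  # avoids a literal triple backtick inside this example
SQL_BLOCK_RE = re.compile(
    FENCE + r'(?:sql|pgsql|postgresql)\n(.*?)' + FENCE,
    re.DOTALL,
)

def extract_sql(response_text):
    """Pull fenced SQL blocks and join them with single semicolons."""
    sql_blocks = SQL_BLOCK_RE.findall(response_text)
    # rstrip(';') prevents 'SELECT 1;;' when a block already ends with ';'
    return ';\n\n'.join(
        block.strip().rstrip(';') for block in sql_blocks
    ) if sql_blocks else None
```

Because each block is stripped of its trailing semicolon before the `';\n\n'` join, exactly one semicolon separates consecutive statements and the final statement carries none.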
🧹 Nitpick comments (11)
web/pgadmin/llm/providers/openai.py (3)
383-387: Preserve exception chain for debugging.

When re-raising exceptions, use `raise ... from e` to preserve the original traceback. This pattern appears multiple times in the streaming methods.

♻️ Suggested fix
```diff
 except Exception as e:
     raise LLMClientError(LLMError(
-        message=f"Streaming request failed: {str(e)}",
+        message=f"Streaming request failed: {e!s}",
         provider=self.provider_name
-    ))
+    )) from e
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/openai.py` around lines 383 - 387, The exception handling in the OpenAI streaming code wraps errors into LLMClientError/LLMError but drops the original traceback; update the except blocks (e.g. the handler that currently does "except Exception as e: raise LLMClientError(LLMError(...))" which references self.provider_name and constructs LLMError) to re-raise using "raise LLMClientError(LLMError(...)) from e" so the original exception chain is preserved; apply the same change to all similar streaming-method handlers in this file that catch Exception and raise LLMClientError.
550-555: The empty response safeguard is implemented; consider aligning error detail with the non-streaming path.

The check for empty responses now correctly raises an error, addressing the previous review concern. However, the non-streaming `_parse_response` provides more specific error messages based on `finish_reason` (e.g., detailed guidance about model context limits when `MAX_TOKENS` is hit). Consider mirroring that logic for consistency:

♻️ Optional: align error handling with _parse_response
```diff
 if not content and not tool_calls:
+    if stop_reason == StopReason.MAX_TOKENS:
+        raise LLMClientError(LLMError(
+            message=f'Response truncated due to token limit '
+                    f'(input: {usage.input_tokens} tokens). '
+                    f'The request is too large for model '
+                    f'{self._model}. '
+                    f'Try using a model with a larger context '
+                    f'window, or analyze a smaller scope.',
+            code='max_tokens',
+            provider=self.provider_name,
+            retryable=False
+        ))
+    elif finish_reason and finish_reason not in ('stop', 'tool_calls'):
+        raise LLMClientError(LLMError(
+            message=f'Empty response with finish reason: {finish_reason}',
+            code=finish_reason,
+            provider=self.provider_name,
+            retryable=False
+        ))
     raise LLMClientError(LLMError(
         message='No response content returned from API',
         provider=self.provider_name,
         retryable=False
     ))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/openai.py` around lines 550 - 555, The empty-response check currently raises a generic LLMClientError; update it to mirror the non-streaming _parse_response logic by inspecting the response's finish_reason (and any available model error metadata) before raising so you emit the same detailed guidance (e.g., MAX_TOKENS / context-limit hints) and include provider=self.provider_name and any diagnostic fields; modify the block that raises LLMClientError (the conditional that checks content and tool_calls) to build the LLMError message using the same finish_reason-driven messages and details used in _parse_response so streaming and non-streaming paths produce consistent errors.
418-436: Add exception chaining to preserve tracebacks.

Similar to the catch-all handler above, these exception handlers should chain the original exception.

♻️ Suggested fixes
```diff
 except urllib.error.HTTPError as e:
     # ... error parsing ...
     raise LLMClientError(LLMError(
         message=error_msg,
         code=str(e.code),
         provider=self.provider_name,
         retryable=e.code in (429, 500, 502, 503, 504)
-    ))
+    )) from e
 except urllib.error.URLError as e:
     raise LLMClientError(LLMError(
         message=f"Connection error: {e.reason}",
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from e
 except socket.timeout:
     raise LLMClientError(LLMError(
         message="Request timed out.",
         code='timeout',
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/openai.py` around lines 418 - 436, The exception handlers that raise LLMClientError/LLMError should preserve original tracebacks by using exception chaining: in the HTTP/URLError handlers add "raise LLMClientError(... ) from e" (they already reference e), and change "except socket.timeout:" to "except socket.timeout as e:" then raise the LLMClientError with "from e". Ensure this is applied to the handlers that build LLMError (the ones referencing e.code, the urllib.error.URLError handler, and the socket.timeout handler) so the original exception is chained to the new LLMClientError.

web/pgadmin/llm/providers/docker.py (5)
51-64: `ValueError` propagation may be inconsistent with caller expectations.

From the context in `client.py`, `get_llm_client()` instantiates `DockerClient` without catching `ValueError`. Other error paths in this module raise `LLMClientError`. A `ValueError` here will propagate as an unexpected exception type to higher-level handlers.

Consider wrapping the validation in `__init__` to convert `ValueError` to `LLMClientError`, or catch it in `get_llm_client()` for consistency.

♻️ Option: Wrap in __init__ for consistency
```diff
 self._api_url = (api_url or DEFAULT_API_URL).rstrip('/')
-_validate_loopback_url(self._api_url)
+try:
+    _validate_loopback_url(self._api_url)
+except ValueError as e:
+    raise LLMClientError(LLMError(
+        message=str(e),
+        provider='docker',
+        retryable=False
+    )) from e
 self._model = model or DEFAULT_MODEL
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/docker.py` around lines 51 - 64, The _validate_loopback_url function currently raises ValueError which can leak to callers; convert that into the module's LLMClientError before it escapes by catching ValueError in the DockerClient initialization (DockerClient.__init__) or by having get_llm_client() wrap the call that triggers _validate_loopback_url in a try/except that catches ValueError and re-raises LLMClientError with a clear message; ensure you reference _validate_loopback_url, DockerClient, get_llm_client, and LLMClientError so callers always receive LLMClientError for validation failures.
418-422: Add exception chaining for better debugging.

Use `raise ... from e` to preserve the original exception traceback, which helps with debugging.

♻️ Proposed fix
```diff
 except Exception as e:
     raise LLMClientError(LLMError(
-        message=f"Streaming request failed: {str(e)}",
+        message=f"Streaming request failed: {e!s}",
         provider=self.provider_name
-    ))
+    )) from e
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/docker.py` around lines 418 - 422, The except block that currently raises LLMClientError(LLMError(...)) should preserve the original exception traceback by using exception chaining; modify the raise to use "raise LLMClientError(LLMError(message=f'Streaming request failed: {str(e)}', provider=self.provider_name)) from e" so the original exception e is attached for better debugging.
515-521: Simplify redundant key check.

The `if 'usage' in data and data['usage']` check is redundant. Using `data.get('usage')` achieves the same result more concisely.

♻️ Proposed fix
```diff
-if 'usage' in data and data['usage']:
-    u = data['usage']
+usage_data = data.get('usage')
+if usage_data:
     usage = Usage(
-        input_tokens=u.get('prompt_tokens', 0),
-        output_tokens=u.get('completion_tokens', 0),
-        total_tokens=u.get('total_tokens', 0)
+        input_tokens=usage_data.get('prompt_tokens', 0),
+        output_tokens=usage_data.get('completion_tokens', 0),
+        total_tokens=usage_data.get('total_tokens', 0)
     )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/docker.py` around lines 515 - 521, Replace the redundant check "if 'usage' in data and data['usage']" with a single call using data.get('usage') to retrieve the usage dict (e.g., u = data.get('usage')), then proceed to construct the Usage(...) only when u is truthy; update the block around the variables data, u, and Usage to use this simplified check so behavior remains identical but the code is more concise.
454-474: Add exception chaining in error handlers.

Multiple `except` blocks re-raise without preserving the original exception chain. Adding `from e` or `from None` improves debuggability.

♻️ Proposed fix for exception chaining
```diff
     raise LLMClientError(LLMError(
         message=error_msg,
         code=str(e.code),
         provider=self.provider_name,
         retryable=e.code in (429, 500, 502, 503, 504)
-    ))
+    )) from e
 except urllib.error.URLError as e:
     raise LLMClientError(LLMError(
         message=f"Connection error: {e.reason}. "
                 f"Is Docker Model Runner running at "
                 f"{self._api_url}?",
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from e
 except socket.timeout:
     raise LLMClientError(LLMError(
         message="Request timed out.",
         code='timeout',
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/docker.py` around lines 454 - 474, The except blocks that raise LLMClientError wrapping an LLMError (the urllib.error.URLError and socket.timeout handlers and the earlier exception handler that uses variable e) must preserve exception chaining so the original traceback is kept; update each raise to use exception chaining (e.g., raise LLMClientError(LLMError(...)) from e for handlers that capture exception as e, and use an explicit "from None" only where you intentionally want to suppress context), targeting the raise statements that construct LLMClientError/LLMError in the docker provider method where urllib.error.URLError, socket.timeout, and the earlier exception variable e are handled.
581-594: Consider mirroring the `MAX_TOKENS` truncation check from `_parse_response`.

The empty-stream check is good, but `_parse_response()` (lines 344-372) has additional logic that distinguishes `MAX_TOKENS` truncation with a specific error message including token counts. The streaming path could benefit from similar handling for better user feedback when responses are truncated.

♻️ Example enhancement
```diff
 if not content and not tool_calls:
+    if stop_reason == StopReason.MAX_TOKENS:
+        raise LLMClientError(LLMError(
+            message=(
+                f'Response truncated due to token limit '
+                f'(input: {usage.input_tokens} tokens). '
+                f'Try using a model with a larger context '
+                f'window, or analyze a smaller scope.'
+            ),
+            code='max_tokens',
+            provider=self.provider_name,
+            retryable=False
+        ))
     raise LLMClientError(LLMError(
         message='No response content returned from API',
         provider=self.provider_name,
         retryable=False
     ))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/docker.py` around lines 581 - 594, When handling the streaming path after the empty-stream check, mirror the MAX_TOKENS truncation handling from _parse_response by detecting when stop_reason (or usage) indicates the model hit the token limit (compare against MAX_TOKENS) and surface the same specific LLMClientError message that includes token counts; update the block that currently yields LLMResponse (referencing content, tool_calls, stop_reason, model_name, usage) to either raise LLMClientError with the detailed "truncated by MAX_TOKENS" message (like _parse_response) or augment the yielded response to include that truncation detail so callers get identical feedback. Ensure you use the same MAX_TOKENS constant and message format as in _parse_response and preserve existing behavior for non-truncated responses.

web/pgadmin/llm/providers/anthropic.py (2)
308-314: Add exception chaining to preserve traceback context.

When re-raising exceptions, use `from e` (or `from None` if intentionally suppressing) to maintain the exception chain. This helps with debugging by preserving the original traceback.

♻️ Proposed fix
```diff
 try:
     yield from self._process_stream(payload)
 except LLMClientError:
     raise
 except Exception as e:
     raise LLMClientError(LLMError(
         message=f"Streaming request failed: {e}",
         provider=self.provider_name
-    ))
+    )) from e
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/anthropic.py` around lines 308 - 314, The except block catching Exception in the Anthropic provider currently raises a new LLMClientError without chaining the original exception; update the catch in the method handling streaming (the block referencing LLMClientError, LLMError and self.provider_name) to re-raise the new LLMClientError using "from e" (i.e., raise LLMClientError(LLMError(...)) from e) so the original traceback is preserved for debugging.
337-364: Add exception chaining to all exception handlers.

All three exception handlers re-raise without chaining, which loses the original traceback context.

♻️ Proposed fix
```diff
 except urllib.error.HTTPError as e:
     error_body = e.read().decode('utf-8')
     try:
         error_data = json.loads(error_body)
         error_msg = error_data.get(
             'error', {}
         ).get('message', str(e))
     except json.JSONDecodeError:
         error_msg = error_body or str(e)
     raise LLMClientError(LLMError(
         message=error_msg,
         code=str(e.code),
         provider=self.provider_name,
         retryable=e.code in (429, 500, 502, 503, 504)
-    ))
+    )) from e
 except urllib.error.URLError as e:
     raise LLMClientError(LLMError(
         message=f"Connection error: {e.reason}",
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from e
 except socket.timeout:
     raise LLMClientError(LLMError(
         message="Request timed out.",
         code='timeout',
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from None
```

Note: For `socket.timeout`, use `from None` since the original exception doesn't add useful context (the message already explains it's a timeout).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/anthropic.py` around lines 337 - 364, The exception handlers for urllib.error.HTTPError, urllib.error.URLError, and socket.timeout in the anthropic provider currently re-raise LLMClientError without chaining the original exception and thus lose traceback; update the three raises so the HTTPError and URLError handlers use "raise LLMClientError(... ) from e" (referencing urllib.error.HTTPError and urllib.error.URLError and the LLMClientError/LLMError construction and self.provider_name) and change the socket.timeout handler to "raise LLMClientError(... ) from None" per the note so the original exception context is preserved or suppressed appropriately.web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py (1)
326-332: Add a `tool_use` streaming scenario.
This generator never emits `('tool_use', ...)`, so the new branch that should surface a second `thinking` event is still untested. One scenario that injects a tool-use tuple between chunks would lock down the SSE protocol added in this PR.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py` around lines 326 - 332, The mock_stream_gen generator never emits a ('tool_use', ...) tuple so the new branch that should produce a second "thinking" event is not exercised; update mock_stream_gen to yield the same text chunks but inject a ('tool_use', <tool payload>) tuple between chunk yields (e.g., after the first or second chunk) before yielding the final (self.mock_response, []) tuple so the test for the SSE tool_use streaming path and the extra thinking event is exercised; refer to mock_stream_gen and the existing yields to locate where to insert the ('tool_use', ...) emission.
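A sketch of the requested scenario, assuming the stream protocol described in this PR (plain strings for text deltas, a `('tool_use', ...)` tuple for tool events, then a final `(response, history)` pair); all payloads below are invented placeholders:

```python
# Hypothetical mock mirroring the review's request: text chunks with a
# ('tool_use', ...) tuple injected between them, then the final
# (response, history) pair. All payloads are invented placeholders.
def mock_stream_gen():
    yield "Here is the "
    # Injected tool-use event: should surface a second "thinking" event
    yield ('tool_use', {'name': 'describe_table',
                        'input': {'table': 'users'}})
    yield "query you asked for:"
    yield ({'content': 'final response'}, [])  # final (response, history)


events = list(mock_stream_gen())
text_chunks = [e for e in events if isinstance(e, str)]
tool_events = [e for e in events
               if isinstance(e, tuple) and e[0] == 'tool_use']
```

A test built on this mock can then assert that exactly one extra "thinking" SSE event is emitted for the tool-use tuple, locking down the branch the comment describes.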
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@web/pgadmin/tools/sqleditor/__init__.py`:
- Around line 2875-2877: The handler currently discards the messages returned by
chat_with_database_stream() and only emits a generated conversation_id, losing
tool-call history; update the request flow in the sqleditor handler to persist
or round-trip the returned conversation history: capture the (response_text,
messages) tuple from chat_with_database_stream(), then either (a) store messages
keyed by conversation_id in a persistent/in-memory store (so future requests can
load conversation_history by conversation_id) or (b) include the messages in the
JSON response to the client and accept conversation_history on the next request;
apply the same change to the identical logic block referenced around lines
2919-2954 so conversation_history is preserved across turns (use the symbols
conversation_id, conversation_history, and chat_with_database_stream to locate
and modify the code).
---
Duplicate comments:
In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py`:
- Around line 185-242: The test expectations and the SQL-joining helper in
test_nlq_chat.py must mirror production normalization: change the join that
builds sql from sql_blocks to strip trailing semicolons (use
block.strip().rstrip(';') when joining into the variable sql) and update
expected_sql strings for the cases 'SQL Extraction - Single SQL block', 'SQL
Extraction - pgsql language tag', 'SQL Extraction - postgresql language tag' and
the multiline case so they no longer include a trailing semicolon (e.g. 'SELECT
* FROM users' instead of 'SELECT * FROM users;'); apply the same change to the
other helper occurrence referenced around lines 253-261.
---
Nitpick comments:
In `@web/pgadmin/llm/providers/anthropic.py`:
- Around line 308-314: The except block catching Exception in the Anthropic
provider currently raises a new LLMClientError without chaining the original
exception; update the catch in the method handling streaming (the block
referencing LLMClientError, LLMError and self.provider_name) to re-raise the new
LLMClientError using "from e" (i.e., raise LLMClientError(LLMError(...)) from e)
so the original traceback is preserved for debugging.
- Around line 337-364: The exception handlers for urllib.error.HTTPError,
urllib.error.URLError, and socket.timeout in the anthropic provider currently
re-raise LLMClientError without chaining the original exception and thus lose
traceback; update the three raises so the HTTPError and URLError handlers use
"raise LLMClientError(... ) from e" (referencing urllib.error.HTTPError and
urllib.error.URLError and the LLMClientError/LLMError construction and
self.provider_name) and change the socket.timeout handler to "raise
LLMClientError(... ) from None" per the note so the original exception context
is preserved or suppressed appropriately.
In `@web/pgadmin/llm/providers/docker.py`:
- Around line 51-64: The _validate_loopback_url function currently raises
ValueError which can leak to callers; convert that into the module's
LLMClientError before it escapes by catching ValueError in the DockerClient
initialization (DockerClient.__init__) or by having get_llm_client() wrap the
call that triggers _validate_loopback_url in a try/except that catches
ValueError and re-raises LLMClientError with a clear message; ensure you
reference _validate_loopback_url, DockerClient, get_llm_client, and
LLMClientError so callers always receive LLMClientError for validation failures.
- Around line 418-422: The except block that currently raises
LLMClientError(LLMError(...)) should preserve the original exception traceback
by using exception chaining; modify the raise to use "raise
LLMClientError(LLMError(message=f'Streaming request failed: {str(e)}',
provider=self.provider_name)) from e" so the original exception e is attached
for better debugging.
- Around line 515-521: Replace the redundant check "if 'usage' in data and
data['usage']" with a single call using data.get('usage') to retrieve the usage
dict (e.g., u = data.get('usage')), then proceed to construct the Usage(...)
only when u is truthy; update the block around the variables data, u, and Usage
to use this simplified check so behavior remains identical but the code is more
concise.
- Around line 454-474: The except blocks that raise LLMClientError wrapping an
LLMError (the urllib.error.URLError and socket.timeout handlers and the earlier
exception handler that uses variable e) must preserve exception chaining so the
original traceback is kept; update each raise to use exception chaining (e.g.,
raise LLMClientError(LLMError(...)) from e for handlers that capture exception
as e, and use an explicit "from None" only where you intentionally want to
suppress context), targeting the raise statements that construct
LLMClientError/LLMError in the docker provider method where
urllib.error.URLError, socket.timeout, and the earlier exception variable e are
handled.
- Around line 581-594: When handling the streaming path after the empty-stream
check, mirror the MAX_TOKENS truncation handling from _parse_response by
detecting when stop_reason (or usage) indicates the model hit the token limit
(compare against MAX_TOKENS) and surface the same specific LLMClientError
message that includes token counts; update the block that currently yields
LLMResponse (referencing content, tool_calls, stop_reason, model_name, usage) to
either raise LLMClientError with the detailed "truncated by MAX_TOKENS" message
(like _parse_response) or augment the yielded response to include that
truncation detail so callers get identical feedback. Ensure you use the same
MAX_TOKENS constant and message format as in _parse_response and preserve
existing behavior for non-truncated responses.
In `@web/pgadmin/llm/providers/openai.py`:
- Around line 383-387: The exception handling in the OpenAI streaming code wraps
errors into LLMClientError/LLMError but drops the original traceback; update the
except blocks (e.g. the handler that currently does "except Exception as e:
raise LLMClientError(LLMError(...))" which references self.provider_name and
constructs LLMError) to re-raise using "raise LLMClientError(LLMError(...)) from
e" so the original exception chain is preserved; apply the same change to all
similar streaming-method handlers in this file that catch Exception and raise
LLMClientError.
- Around line 550-555: The empty-response check currently raises a generic
LLMClientError; update it to mirror the non-streaming _parse_response logic by
inspecting the response's finish_reason (and any available model error metadata)
before raising so you emit the same detailed guidance (e.g., MAX_TOKENS /
context-limit hints) and include provider=self.provider_name and any diagnostic
fields; modify the block that raises LLMClientError (the conditional that checks
content and tool_calls) to build the LLMError message using the same
finish_reason-driven messages and details used in _parse_response so streaming
and non-streaming paths produce consistent errors.
- Around line 418-436: The exception handlers that raise LLMClientError/LLMError
should preserve original tracebacks by using exception chaining: in the
HTTP/URLError handlers add "raise LLMClientError(... ) from e" (they already
reference e), and change "except socket.timeout:" to "except socket.timeout as
e:" then raise the LLMClientError with "from e". Ensure this is applied to the
handlers that build LLMError (the ones referencing e.code, the
urllib.error.URLError handler, and the socket.timeout handler) so the original
exception is chained to the new LLMClientError.
In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py`:
- Around line 326-332: The mock_stream_gen generator never emits a ('tool_use',
...) tuple so the new branch that should produce a second "thinking" event is
not exercised; update mock_stream_gen to yield the same text chunks but inject a
('tool_use', <tool payload>) tuple between chunk yields (e.g., after the first
or second chunk) before yielding the final (self.mock_response, []) tuple so the
test for the SSE tool_use streaming path and the extra thinking event is
exercised; refer to mock_stream_gen and the existing yields to locate where to
insert the ('tool_use', ...) emission.
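As an illustration of the `data.get('usage')` simplification requested for the Docker provider above, a minimal sketch (the `Usage` dataclass and token-field names are stand-ins for the module's real types):

```python
from dataclasses import dataclass


@dataclass
class Usage:
    input_tokens: int
    output_tokens: int


def parse_usage(data):
    # One lookup via .get() replaces "if 'usage' in data and data['usage']"
    u = data.get('usage')
    if u:  # covers both a missing key and a falsy value
        return Usage(
            input_tokens=u.get('prompt_tokens', 0),
            output_tokens=u.get('completion_tokens', 0),
        )
    return None
```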
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 99402f45-993b-4c75-a066-600d0fd81ef5
📒 Files selected for processing (5)
- web/pgadmin/llm/providers/anthropic.py
- web/pgadmin/llm/providers/docker.py
- web/pgadmin/llm/providers/openai.py
- web/pgadmin/tools/sqleditor/__init__.py
- web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py
Force-pushed from 04b846c to 82a9bc6
Actionable comments posted: 6
♻️ Duplicate comments (1)
web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py (1)
259-261: ⚠️ Potential issue | 🟡 Minor
Keep the test extractor in sync with the endpoint.
The helper here still joins raw blocks, so the multi-block case reintroduces the double-semicolon output and the pipeline is already failing on it. Mirror the production cleanup before joining.
🧪 Suggested fix
```diff
-    sql = ';\n\n'.join(
-        block.strip() for block in sql_blocks
-    ) if sql_blocks else None
+    sql = ';\n\n'.join(
+        block.strip().rstrip(';') for block in sql_blocks
+    ) if sql_blocks else None
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py` around lines 259 - 261, The test helper constructs sql from sql_blocks but doesn't apply the production cleanup, causing double-semicolons; update the logic that builds sql (the sql variable from sql_blocks) to mirror the endpoint's cleanup: trim each block, remove any trailing semicolons/empty blocks (e.g., strip() then rstrip(';') and skip empties), then join the cleaned blocks with ';\n\n' (or call the shared cleanup helper if one exists) so the multi-block output matches production.
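As a concrete reference, the shared normalization could be sketched like this (the helper name `join_sql_blocks` is illustrative, not the endpoint's actual function):

```python
def join_sql_blocks(sql_blocks):
    # Strip whitespace and trailing semicolons per block, then join with
    # ';\n\n' so multi-block output never yields a double semicolon.
    return ';\n\n'.join(
        block.strip().rstrip(';') for block in sql_blocks
    ) if sql_blocks else None


sql = join_sql_blocks(['SELECT * FROM users;\n', '\nSELECT 1;\n'])
```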
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@web/pgadmin/llm/chat.py`:
- Around line 166-179: The current generator emits ambiguous 2-tuples for both
tool events and final completions which causes misclassification (e.g., in
nlq_chat_stream); change the event shapes so they are distinct: emit a
clearly-typed tool event like ('tool_use', tool_events_list) and a distinct
completion event like ('complete', final_text, updated_messages) (or use a
simple Event namedtuple/dataclass) wherever the generator yields tool events and
the final result (references: the generator function in web/pgadmin/llm/chat.py
that yields tool-use tuples and the nlq_chat_stream consumer); update consumers
to match the new 3-tuple completion shape and handle the explicit 'tool_use' and
'complete' tags accordingly.
In `@web/pgadmin/tools/sqleditor/__init__.py`:
- Around line 2896-2903: The call to chat_with_database_stream currently omits
the conversation history so follow-up NLQ turns lose context; update the call to
pass the request's history into the conversation_history parameter (e.g.,
conversation_history=request.history or the existing history variable in scope)
alongside user_message, sid, did, and system_prompt so chat_with_database_stream
receives prior messages for proper context.
- Around line 2924-2931: The regex used in the re.findall call that populates
sql_blocks is case-sensitive and misses fenced blocks with uppercase fence tags;
update the re.findall invocation in __init__.py (the call that assigns
sql_blocks) to add the re.IGNORECASE flag alongside re.DOTALL so the pattern
(?:sql|pgsql|postgresql)\s*\n(.*?)``` matches tags like SQL/POSTGRESQL as well,
then leave the subsequent sql construction (the sql variable) unchanged.
In `@web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx`:
- Around line 874-880: The current streaming branch inside the component uses
streamingTextRef.current and setMessages to replace the assistant message by
appending '\n\n(Generation stopped)' to the partial markdown, which can end up
inside an open fenced code block; instead preserve the partial response exactly
(do not append the stop notice to streamingTextRef.current) and call setMessages
twice (or add two entries) so you first update/commit the assistant message with
content: streamingTextRef.current and type MESSAGE_TYPES.ASSISTANT (filtering
out thinkingId and streamId as before), then append a separate assistant-status
message whose content is gettext('(Generation stopped)') (or use a distinct type
if you prefer) so the stop marker is rendered as a standalone message; apply the
same pattern to the analogous block handling stream end around
streamId/thinkingId (the other occurrence at the 905-910 region).
- Around line 552-556: The MarkdownContent is being forced into a span
(component="span") while renderMarkdownText(content) can output block-level
HTML, causing invalid DOM; change MarkdownContent in NLQChatPanel.jsx to render
as a block-level container (remove or set component to a block element) so block
nodes (paragraphs, lists, tables) are not injected into a span, and render the
streaming cursor separately (keep the Box wrapper and the cursor rendering logic
outside/adjacent to MarkdownContent) so streaming markdown displays correctly.
- Around line 1035-1043: Snapshot streamingIdRef.current into a local (e.g.,
const streamingId = streamingIdRef.current) before calling setMessages so the
functional updater filters using that captured id (alongside thinkingId) instead
of reading streamingIdRef.current later; then update setMessages to filter out
that local streamingId and add the error message (MESSAGE_TYPES.ERROR), and only
after setMessages clear streamingTextRef.current and set streamingIdRef.current
= null to keep behavior consistent with the abort/complete paths.
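The distinct event shapes proposed for `web/pgadmin/llm/chat.py` above could look like the following sketch (payloads are invented; the real generator also threads tool results back into the conversation):

```python
def chat_stream():
    # Text deltas stay plain strings; other events carry an explicit tag.
    yield "Looking at the schema... "
    yield ('tool_use', [{'name': 'list_tables'}])
    yield "Done."
    # The completion is a distinct 3-tuple, so consumers such as
    # nlq_chat_stream can never confuse it with a tool event.
    yield ('complete', 'SELECT 1',
           [{'role': 'assistant', 'content': 'SELECT 1'}])


def consume(stream):
    text, final = [], None
    for event in stream:
        if isinstance(event, str):
            text.append(event)
        elif event[0] == 'tool_use':
            pass  # a consumer would emit a "thinking" SSE event here
        elif event[0] == 'complete':
            _, final_text, history = event
            final = (final_text, history)
    return ''.join(text), final
```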
---
Duplicate comments:
In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py`:
- Around line 259-261: The test helper constructs sql from sql_blocks but
doesn't apply the production cleanup, causing double-semicolons; update the
logic that builds sql (the sql variable from sql_blocks) to mirror the
endpoint's cleanup: trim each block, remove any trailing semicolons/empty blocks
(e.g., strip() then rstrip(';') and skip empties), then join the cleaned blocks
with ';\n\n' (or call the shared cleanup helper if one exists) so the
multi-block output matches production.
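The case-insensitive fence extraction suggested for `__init__.py` above, with the pattern taken from the review text (the surrounding helper is a sketch):

```python
import re

# re.IGNORECASE makes uppercase fence tags such as ```SQL match too;
# re.DOTALL lets the capture group span multiple lines.
_SQL_FENCE = re.compile(
    r'```(?:sql|pgsql|postgresql)\s*\n(.*?)```',
    re.DOTALL | re.IGNORECASE,
)


def extract_sql_blocks(text):
    return _SQL_FENCE.findall(text)


blocks = extract_sql_blocks(
    "Intro\n```SQL\nSELECT 1;\n```\ntext\n```pgsql\nSELECT 2;\n```\n"
)
```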
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 03a8611a-98cf-454d-94d9-14ef26b9bbe7
📒 Files selected for processing (14)
- docs/en_US/release_notes_9_14.rst
- web/pgadmin/llm/chat.py
- web/pgadmin/llm/client.py
- web/pgadmin/llm/prompts/nlq.py
- web/pgadmin/llm/providers/anthropic.py
- web/pgadmin/llm/providers/docker.py
- web/pgadmin/llm/providers/ollama.py
- web/pgadmin/llm/providers/openai.py
- web/pgadmin/static/js/Theme/dark.js
- web/pgadmin/static/js/Theme/high_contrast.js
- web/pgadmin/static/js/Theme/light.js
- web/pgadmin/tools/sqleditor/__init__.py
- web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx
- web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py
✅ Files skipped from review due to trivial changes (1)
- docs/en_US/release_notes_9_14.rst
🚧 Files skipped from review as they are similar to previous changes (3)
- web/pgadmin/static/js/Theme/high_contrast.js
- web/pgadmin/static/js/Theme/light.js
- web/pgadmin/llm/client.py
Resolved review threads (3) on web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx (one marked Outdated).
Force-pushed from cbbc025 to 2307535
Actionable comments posted: 6
♻️ Duplicate comments (1)
web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py (1)
185-208: ⚠️ Potential issue | 🟠 Major
Align these fixtures with the production SQL normalizer.
`web/pgadmin/tools/sqleditor/__init__.py` now joins fenced SQL blocks with `block.strip().rstrip(';')`, but this helper still uses `block.strip()`. That means the multi-block cases here assert a different `sql` string than the endpoint emits, and the single-block expectations still keep a trailing semicolon that runtime no longer returns.
🔧 Representative fix
```diff
-    expected_sql='SELECT * FROM users;'
+    expected_sql='SELECT * FROM users'
@@
     sql = ';\n\n'.join(
-        block.strip() for block in sql_blocks
+        block.strip().rstrip(';') for block in sql_blocks
     ) if sql_blocks else None
```

Update the other single-block expectations the same way.
Also applies to: 253-261
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py` around lines 185 - 208, Update the test fixtures in test_nlq_chat.py so expected_sql matches the production normalizer (which calls block.strip().rstrip(';')): for the 'SQL Extraction - Single SQL block' and other single-block cases remove the trailing semicolon from expected_sql, and for multi-block cases like 'SQL Extraction - Multiple SQL blocks' remove trailing semicolons from each fenced block so the joined expectation is "block1_without_trailing_semicolon\n\nblock2_without_trailing_semicolon". Apply the same change to the other similar fixtures (e.g., the pgsql/postgresql cases and the duplicated fixtures mentioned) so all expected_sql values reflect block.strip().rstrip(';') normalization.
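For reference, a fixture expectation matching the production normalization could be sketched like this (test names and structure are placeholders for the real scenario table):

```python
# Hypothetical fixture entries showing expected_sql values after the
# block.strip().rstrip(';') normalization; names are placeholders.
scenarios = [
    dict(name='SQL Extraction - Single SQL block',
         blocks=['SELECT * FROM users;\n'],
         expected_sql='SELECT * FROM users'),
    dict(name='SQL Extraction - Multiple SQL blocks',
         blocks=['SELECT 1;\n', 'SELECT 2;\n'],
         expected_sql='SELECT 1;\n\nSELECT 2'),
]

for s in scenarios:
    joined = ';\n\n'.join(
        b.strip().rstrip(';') for b in s['blocks']
    ) if s['blocks'] else None
    s['actual'] = joined
```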
🧹 Nitpick comments (2)
web/pgadmin/llm/providers/anthropic.py (2)
320-324: Improve exception chaining for better debugging.
When re-raising exceptions, use `from e` to preserve the exception chain and provide better tracebacks for debugging.
♻️ Proposed fix
```diff
 except Exception as e:
     raise LLMClientError(LLMError(
-        message=f"Streaming request failed: {str(e)}",
+        message=f"Streaming request failed: {e}",
         provider=self.provider_name
-    ))
+    )) from e
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/anthropic.py` around lines 320 - 324, The except block in the streaming path currently raises a new LLMClientError/LLMError without exception chaining, losing the original traceback; update the raise in the except Exception as e block inside anthropic.py to re-raise the LLMClientError using "from e" so the original exception is preserved (i.e., raise LLMClientError(LLMError(...)) from e), referencing the existing LLMClientError, LLMError and self.provider_name symbols.
356-374: Add exception chaining for better error traceability.
Multiple exception handlers re-raise without chaining, losing the original traceback context.
♻️ Proposed fix
```diff
 raise LLMClientError(LLMError(
     message=error_msg,
     code=str(e.code),
     provider=self.provider_name,
     retryable=e.code in (429, 500, 502, 503, 504)
-))
+)) from e
 except urllib.error.URLError as e:
     raise LLMClientError(LLMError(
         message=f"Connection error: {e.reason}",
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from e
 except socket.timeout:
     raise LLMClientError(LLMError(
         message="Request timed out.",
         code='timeout',
         provider=self.provider_name,
         retryable=True
-    ))
+    )) from None
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/anthropic.py` around lines 356 - 374, The exception handlers that raise LLMClientError currently drop the original traceback; update each except to chain the original exception using "raise ... from e" (bind socket.timeout as e in "except socket.timeout as e:") so the original exception context is preserved; specifically modify the handlers that construct LLMError and raise LLMClientError (the urllib.error.URLError block, the socket.timeout block, and any nearby blocks that use the variable e) to use exception chaining, keeping provider=self.provider_name and retryable flags unchanged.
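The chaining pattern these comments ask for can be demonstrated with minimal stand-ins for the provider's error types (the real `LLMError`/`LLMClientError` live in `web/pgadmin/llm`):

```python
class LLMError:
    def __init__(self, message, code=None, provider=None, retryable=False):
        self.message = message
        self.code = code
        self.provider = provider
        self.retryable = retryable


class LLMClientError(Exception):
    def __init__(self, error):
        super().__init__(error.message)
        self.error = error


def stream_request():
    try:
        raise ConnectionError("upstream failure")
    except ConnectionError as e:
        # "from e" attaches the original exception as __cause__,
        # preserving its traceback for debugging.
        raise LLMClientError(LLMError(
            message=f"Streaming request failed: {e}",
            provider="anthropic",
        )) from e


def timed_out_request():
    try:
        raise TimeoutError()
    except TimeoutError:
        # "from None" suppresses the context when it adds nothing.
        raise LLMClientError(LLMError(
            message="Request timed out.", code='timeout', retryable=True,
        )) from None
```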
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@web/pgadmin/llm/providers/anthropic.py`:
- Around line 336-341: The request constructor references an undefined global
API_URL causing a NameError; update the code that builds the request (the
urllib.request.Request call inside the method that sends Anthropic requests) to
use the instance attribute self._api_url (the attribute set in __init__ of the
Anthropic provider class) instead of API_URL, and scan for any other occurrences
of API_URL in this class to replace with self._api_url so the instance
configuration is used.
- Around line 330-334: The headers dict in the method that builds requests (the
block creating headers with 'Content-Type', 'x-api-key', and
'anthropic-version') always injects x-api-key which breaks custom endpoints;
update that method so it only adds the 'x-api-key' header when self._api_key is
present (truthy), mirroring the behavior in _make_request(), so custom endpoints
without auth are not sent an API key.
In `@web/pgadmin/llm/providers/ollama.py`:
- Around line 421-449: The code currently sets final metadata only when
data.get('done') is true but always yields a final LLMResponse; change this so
the final LLMResponse is only built and yielded when a terminal frame was seen
(i.e., final_data is set or data.get('done') is True). In practice, wrap the
block that computes stop_reason, model_name/input/output tokens and yields
LLMResponse (the yield of LLMResponse with content, tool_calls, stop_reason,
model, usage, raw_response) in a conditional that verifies final_data/done is
present; if no terminal done frame was received, do not emit that “successful”
final response (instead return/stop without yielding it).
- Around line 332-342: OllamaClient.__init__ currently accepts any URL scheme
which can lead to unsafe local/resource access when urllib.request.urlopen is
called; add scheme validation in OllamaClient.__init__ that parses the provided
api_url (self._api_url) and only allows safe schemes (e.g., "http" and "https")
following the same pattern used in DockerClient, and raise a clear ValueError if
the scheme is missing or not allowed; ensure this check runs before any use of
urllib.request.urlopen in methods that call the API (e.g., where url is
constructed for POST in the request flow).
In `@web/pgadmin/llm/providers/openai.py`:
- Around line 405-415: The _process_stream() function is using an undefined
API_URL and always sets an Authorization header, which causes NameError and
prevents custom endpoints without an API key from working; update
_process_stream() to use the same request URL variable the class uses elsewhere
(the instance's base URL/endpoint field used by _make_request(), e.g.,
self._base_url or the same self attribute that _make_request() composes) and
build headers the same way as _make_request() by only adding 'Authorization:
Bearer {self._api_key}' when self._api_key is present/non-empty, otherwise omit
the header so custom endpoints without keys are supported.
In `@web/pgadmin/tools/sqleditor/__init__.py`:
- Around line 2935-2972: The completion path currently emits only response_text
and a synthetic new_conversation_id without persisting the provided messages
from chat_with_database_stream, so implement persisting or round-tripping there:
when new_conversation_id is assigned (the block setting new_conversation_id)
update the server-side conversation store keyed by that ID (or create one if
missing) with the current messages list passed into this handler, or
alternatively include a serialized messages payload in the dict passed to
_nlq_sse_event under a key like 'history' so the client can round-trip it;
locate the logic around chat_with_database_stream, conversation_id, messages and
the yield _nlq_sse_event call and add the persistence or payload insertion
accordingly.
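One way to round-trip the history, sketched with standard SSE framing (the real `_nlq_sse_event` may frame its payload differently; field values are placeholders):

```python
import json


def sse_event(payload):
    # Standard SSE framing: a "data:" line terminated by a blank line.
    return f"data: {json.dumps(payload)}\n\n"


event = sse_event({
    'type': 'complete',
    'content': 'SELECT 1',
    'conversation_id': 'abc123',  # invented placeholder
    'history': [{'role': 'user', 'content': 'show me one row'}],
})
```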
---
Duplicate comments:
In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py`:
- Around line 185-208: Update the test fixtures in test_nlq_chat.py so
expected_sql matches the production normalizer (which calls
block.strip().rstrip(';')): for the 'SQL Extraction - Single SQL block' and
other single-block cases remove the trailing semicolon from expected_sql, and
for multi-block cases like 'SQL Extraction - Multiple SQL blocks' remove
trailing semicolons from each fenced block so the joined expectation is
"block1_without_trailing_semicolon\n\nblock2_without_trailing_semicolon". Apply
the same change to the other similar fixtures (e.g., the pgsql/postgresql cases
and the duplicated fixtures mentioned) so all expected_sql values reflect
block.strip().rstrip(';') normalization.
---
Nitpick comments:
In `@web/pgadmin/llm/providers/anthropic.py`:
- Around line 320-324: The except block in the streaming path currently raises a
new LLMClientError/LLMError without exception chaining, losing the original
traceback; update the raise in the except Exception as e block inside
anthropic.py to re-raise the LLMClientError using "from e" so the original
exception is preserved (i.e., raise LLMClientError(LLMError(...)) from e),
referencing the existing LLMClientError, LLMError and self.provider_name
symbols.
- Around line 356-374: The exception handlers that raise LLMClientError
currently drop the original traceback; update each except to chain the original
exception using "raise ... from e" (bind socket.timeout as e in "except
socket.timeout as e:") so the original exception context is preserved;
specifically modify the handlers that construct LLMError and raise
LLMClientError (the urllib.error.URLError block, the socket.timeout block, and
any nearby blocks that use the variable e) to use exception chaining, keeping
provider=self.provider_name and retryable flags unchanged.
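The scheme check suggested for `OllamaClient.__init__` above, as a minimal sketch assuming the same http/https allowlist as `DockerClient` (function and constant names are illustrative):

```python
from urllib.parse import urlparse

_ALLOWED_SCHEMES = ('http', 'https')


def validate_api_url(api_url):
    # Reject unexpected schemes (file://, ftp://, ...) before the URL
    # ever reaches urllib.request.urlopen.
    scheme = urlparse(api_url).scheme.lower()
    if scheme not in _ALLOWED_SCHEMES:
        label = scheme or '(missing)'
        raise ValueError(
            f"Unsupported URL scheme '{label}'; "
            "only http and https are allowed."
        )
    return api_url
```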
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8f4251b5-b9d7-4279-a5a7-eb7ffb65d1a8
📒 Files selected for processing (14)
- docs/en_US/release_notes_9_14.rst
- web/pgadmin/llm/chat.py
- web/pgadmin/llm/client.py
- web/pgadmin/llm/prompts/nlq.py
- web/pgadmin/llm/providers/anthropic.py
- web/pgadmin/llm/providers/docker.py
- web/pgadmin/llm/providers/ollama.py
- web/pgadmin/llm/providers/openai.py
- web/pgadmin/static/js/Theme/dark.js
- web/pgadmin/static/js/Theme/high_contrast.js
- web/pgadmin/static/js/Theme/light.js
- web/pgadmin/tools/sqleditor/__init__.py
- web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx
- web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py
🚧 Files skipped from review as they are similar to previous changes (4)
- web/pgadmin/llm/prompts/nlq.py
- web/pgadmin/static/js/Theme/light.js
- web/pgadmin/llm/client.py
- docs/en_US/release_notes_9_14.rst
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx (1)
799-824: ⚠️ Potential issue | 🟠 Major
Keep the timeout and abort controller alive until the SSE body is fully consumed.
These lines run as soon as headers arrive, so the 5-minute timeout no longer protects the streaming read loop and `handleStop()` loses the `AbortController` while `reader.read()` is still active. Clean them up in the terminal paths / `finally` instead.
Suggested fix
```diff
-      clearTimeout(timeoutId);
-      abortControllerRef.current = null;
-
       if (!response.ok) {
         const errorData = await response.json().catch(() => ({}));
         throw new Error(errorData.errormsg || `HTTP error! status: ${response.status}`);
       }
@@
     } catch (error) {
-      clearTimeout(timeoutId);
-      abortControllerRef.current = null;
       readerRef.current = null;
       const streamId = streamingIdRef.current;
@@
     } finally {
+      clearTimeout(timeoutId);
+      abortControllerRef.current = null;
       setIsLoading(false);
       setThinkingMessageId(null);
     }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx` around lines 799 - 824, The current code clears timeout and nulls abortControllerRef as soon as fetch returns headers, which lets handleStop lose the AbortController while the streaming reader (reader.read() loop) is still active; instead, keep controller and timeoutId alive until the SSE body is fully consumed and only call clearTimeout(timeoutId) and set abortControllerRef.current = null in the terminal paths (after the streaming read loop finishes or in a finally block) so handleStop can still abort the ongoing reader; update the logic around the fetch response handling and reader loop in NLQChatPanel.jsx to perform cleanup at the end of the stream and not immediately after fetch returns.
web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py (1)
129-139: ⚠️ Potential issue | 🟠 Major
Make the error scenarios assert the error payload unconditionally.
A non-JSON or otherwise unexpected error response still passes here because the assertions only run inside the guarded `if`. Once `self.expected_error` is true, this test should fail as soon as the endpoint stops returning the expected JSON error shape.
Suggested fix
```diff
 if self.expected_error:
-    # For error cases, we expect JSON response
-    if response.status_code == 200 and \
-            response.content_type == 'application/json':
-        data = json.loads(response.data)
-        self.assertFalse(data.get('success', True))
-        if hasattr(self, 'error_contains'):
-            self.assertIn(
-                self.error_contains,
-                data.get('errormsg', '')
-            )
+    self.assertEqual(response.mimetype, 'application/json')
+    data = json.loads(response.data)
+    self.assertFalse(data.get('success', True))
+    if hasattr(self, 'error_contains'):
+        self.assertIn(
+            self.error_contains,
+            data.get('errormsg', '')
+        )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py` around lines 129 - 139, When self.expected_error is true, don't conditionally skip assertions based on response.content_type/status; instead assert that the response is JSON and has the expected error shape unconditionally. In the test block guarded by self.expected_error in test_nlq_chat.py, replace the inner "if response.status_code == 200 and response.content_type == 'application/json':" guard with explicit assertions (e.g., assert response.content_type == 'application/json' and assert response.status_code == 200 or another expected error code), then json.loads(response.data) and assert data.get('success') is False and, if hasattr(self, 'error_contains'), assert self.error_contains is in data.get('errormsg', '') so any non-JSON or malformed error response will cause the test to fail.
♻️ Duplicate comments (6)
web/pgadmin/llm/providers/anthropic.py (1)
330-341: ⚠️ Potential issue | 🔴 Critical
Use `self._api_url` and mirror `_make_request()`'s auth handling here.
`API_URL` is undefined, so every Anthropic streaming request fails before it is sent. This branch also always adds `x-api-key`, which breaks the custom endpoints that the non-streaming path already supports without auth.
🐛 Proposed fix
headers = { 'Content-Type': 'application/json', - 'x-api-key': self._api_key, 'anthropic-version': API_VERSION } + if self._api_key: + headers['x-api-key'] = self._api_key request = urllib.request.Request( - API_URL, + self._api_url, data=json.dumps(payload).encode('utf-8'), headers=headers, method='POST' )Run this to compare the sync and streaming request builders:
```shell
#!/bin/bash
sed -n '200,215p;330,341p' web/pgadmin/llm/providers/anthropic.py
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/anthropic.py` around lines 330 - 341, The streaming request builder is using an undefined API_URL and always injects 'x-api-key', causing failures and breaking custom endpoints; update the block constructing the streaming request (the Request creation near the top of the streaming path) to use self._api_url instead of API_URL and mirror the authentication/header logic used in _make_request(): build headers the same way (only include 'x-api-key' when appropriate for default Anthropic endpoints and preserve custom endpoint behavior), and ensure 'anthropic-version' and 'Content-Type' are set consistently with _make_request() so streaming and non-streaming requests behave identically.

web/pgadmin/tools/sqleditor/__init__.py (1)
2935-2938: ⚠️ Potential issue | 🟠 Major

Don't drop the updated hidden history on completion.

`chat_with_database_stream()` returns the updated `messages` in its `'complete'` tuple, but this handler only keeps `item[1]` and then emits a synthetic `conversation_id`. Without persisting or round-tripping `item[2]`, follow-up turns lose tool calls/results and the returned ID cannot actually resume the full conversation state.

Also applies to: 2960-2971
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/__init__.py` around lines 2935 - 2938, The completion branch currently extracts only response_text from the ('complete', response_text, messages) tuple and emits a synthetic conversation_id, which drops the updated hidden history (item[2]) and prevents resuming tool state; update the handler in chat_with_database_stream's consumer to capture item[2] (messages) and persist or round-trip it with the emitted completion (e.g., include messages in the persisted conversation record or return it alongside conversation_id), ensuring the updated messages/hidden history are stored or returned so follow-up turns can resume tool calls/results instead of losing state.

web/pgadmin/llm/providers/openai.py (1)
405-415: ⚠️ Potential issue | 🔴 Critical

Use `self._api_url` and keep auth handling consistent with `_make_request()`.

`API_URL` is undefined, so every OpenAI streaming request fails immediately. This branch also always sends `Authorization`, which breaks the custom endpoints that `chat()` already supports without an API key.

🐛 Proposed fix

```diff
         headers = {
             'Content-Type': 'application/json',
-            'Authorization': f'Bearer {self._api_key}'
         }
+        if self._api_key:
+            headers['Authorization'] = f'Bearer {self._api_key}'
         request = urllib.request.Request(
-            API_URL,
+            self._api_url,
             data=json.dumps(payload).encode('utf-8'),
             headers=headers,
             method='POST'
         )
```

Run this to compare the sync and streaming request builders:
```shell
#!/bin/bash
sed -n '208,222p;405,415p' web/pgadmin/llm/providers/openai.py
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/openai.py` around lines 405 - 415, The streaming request is using an undefined API_URL and always adds an Authorization header, breaking custom endpoints; update the Request builder to use self._api_url (not API_URL) and mirror the auth handling used in _make_request(): only include the Authorization header when self._api_key is present (or follow whatever header logic _make_request() uses), and ensure Content-Type remains application/json; also reuse any additional headers or auth-related logic from _make_request() so chat() and the streaming path behave consistently.

web/pgadmin/llm/providers/ollama.py (2)
39-48: ⚠️ Potential issue | 🟠 Major

Validate `api_url` before storing it.

Every request path later feeds `self._api_url` into `urllib.request.urlopen()`, so `file:` and other unexpected schemes are still accepted here. Restrict this to http/https up front.

Suggested fix

```diff
 import json
 import urllib.request
 import urllib.error
+from urllib.parse import urlparse
 from collections.abc import Generator
 from typing import Optional, Union
 import uuid
@@
         """
         self._api_url = api_url.rstrip('/')
+        parsed = urlparse(self._api_url)
+        if parsed.scheme not in ('http', 'https'):
+            raise ValueError(
+                'Invalid Ollama API URL scheme. Only http/https are allowed.'
+            )
         self._model = model or DEFAULT_MODEL
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/ollama.py` around lines 39 - 48, In Ollama client __init__, validate the provided api_url before storing: parse api_url (e.g., using urllib.parse.urlparse), ensure scheme is either "http" or "https" and raise a ValueError if not, then normalize (rstrip('/')) and assign to self._api_url; leave model handling (self._model = model or DEFAULT_MODEL) unchanged.
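The scheme check suggested above is small enough to exercise standalone. A minimal sketch, standard library only; the helper name `validate_api_url` is illustrative, not the actual pgAdmin code:

```python
from urllib.parse import urlparse

def validate_api_url(api_url):
    # Hypothetical helper mirroring the suggested __init__ guard:
    # normalize the URL, then reject any scheme other than http/https
    # before it can ever reach urllib.request.urlopen().
    normalized = api_url.rstrip('/')
    if urlparse(normalized).scheme not in ('http', 'https'):
        raise ValueError(
            'Invalid Ollama API URL scheme. Only http/https are allowed.'
        )
    return normalized
```

With this guard, a `file:///etc/passwd` URL fails at client construction time rather than at request time.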
421-449: ⚠️ Potential issue | 🟠 Major

Only emit the final `LLMResponse` after a terminal `done` frame.

If the NDJSON stream drops early, `final_data` stays `None` but this still yields a "successful" response with truncated content. Fail fast instead of synthesizing completion.

Suggested fix

```diff
         if data.get('done'):
             final_data = data
             done_reason = data.get('done_reason', '')
             model_name = data.get('model', self._model)
             input_tokens = data.get('prompt_eval_count', 0)
             output_tokens = data.get('eval_count', 0)

+    if final_data is None:
+        raise LLMClientError(LLMError(
+            message='Ollama stream ended before a terminal done frame',
+            provider=self.provider_name,
+            retryable=True
+        ))
+
     # Build final response
     if tool_calls:
         stop_reason = StopReason.TOOL_USE
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/llm/providers/ollama.py` around lines 421 - 449, The code currently yields an LLMResponse even when final_data is None (stream dropped); change the logic in the generator so you only construct and yield the LLMResponse when a terminal frame is observed (i.e., when data.get('done') sets final_data and done_reason); if final_data is still None after the read loop, raise an error or return without emitting a synthesized completion instead of yielding using default values—update the block that builds stop_reason and yields LLMResponse to first check final_data (and/or done_reason) and fail fast if the stream ended prematurely, referencing final_data, data.get('done'), done_reason, content_parts, tool_calls and LLMResponse.

web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py (1)
259-261: ⚠️ Potential issue | 🟠 Major

Keep the test extractor aligned with the endpoint's semicolon normalization.

The backend strips trailing semicolons before joining fenced SQL blocks, but this helper still joins `block.strip()` verbatim. The multi-block case will produce `;;` here and drift from the real extraction path.

Suggested fix

```diff
-        sql = ';\n\n'.join(
-            block.strip() for block in sql_blocks
-        ) if sql_blocks else None
+        sql = ';\n\n'.join(
-            block.strip().rstrip(';') for block in sql_blocks
+        ) if sql_blocks else None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py` around lines 259 - 261, The test builds the joined SQL with "';\\n\\n'.join(block.strip() for block in sql_blocks)" but the backend first strips trailing semicolons from each fenced block, so update the join to remove trailing semicolons from each block before trimming and joining (i.e., for each item in sql_blocks, remove trailing ';' then strip) so the test extractor matches the endpoint's semicolon normalization when producing the "sql" value from sql_blocks.
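The normalization the backend and the test helper should share can be sketched as a single function — a hypothetical stand-in for the real extraction code, assuming the `';\n\n'` join described above:

```python
def join_sql_blocks(sql_blocks):
    # Strip whitespace and trailing semicolons from each fenced block
    # before joining, so multi-block responses never produce ';;'.
    if not sql_blocks:
        return None
    return ';\n\n'.join(
        block.strip().rstrip(';') for block in sql_blocks
    )
```

Note that the join re-adds a single semicolon between blocks, so only the final statement ends without one.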
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@web/pgadmin/llm/chat.py`:
- Around line 157-166: The declared return type of chat_with_database_stream is
inaccurate: update its annotation to reflect the three yielded event shapes —
text chunks (str), completion events ('complete', response.content: str,
messages: list[Message]), and tool-use events ('tool_use', list[str]) — e.g.
Generator[Union[str, Tuple[str, str, List[Message]], Tuple[str, List[str]]],
None, None]; import or reference Tuple and List from typing as needed and ensure
the tuple element types match the actual yields in chat_with_database_stream.
In `@web/pgadmin/llm/providers/openai.py`:
- Around line 378-385: In chat_stream(), the payload always sets 'temperature'
which diverges from chat()'s compatibility behavior; update chat_stream() (in
the OpenAI provider class, method chat_stream) to only add the 'temperature' key
to the payload when temperature is non-default (e.g., temperature != 0.0 or when
explicitly provided), mirroring chat()'s conditional inclusion so endpoints that
don't accept temperature won't receive it.
In `@web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx`:
- Around line 997-1012: The handler currently places raw JSON/explanation into
content which prevents ChatMessage from falling back to message.sql; update the
'sql'/'complete' branch so when there is event.sql but no fenced code blocks you
extract an explanation from event.content (attempt JSON.parse and use
parsed.explanation, or use raw event.content on parse failure), then set content
= '' and include explanation on the MESSAGE_TYPES.SQL object (keep sql:
event.sql) so ChatMessage can render the legacy SQL preview/actions; reference
the event.sql/content variables and the MESSAGE_TYPES.SQL message object to
implement this change.
---
Outside diff comments:
In `@web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx`:
- Around line 799-824: The current code clears timeout and nulls
abortControllerRef as soon as fetch returns headers, which lets handleStop lose
the AbortController while the streaming reader (reader.read() loop) is still
active; instead, keep controller and timeoutId alive until the SSE body is fully
consumed and only call clearTimeout(timeoutId) and set
abortControllerRef.current = null in the terminal paths (after the streaming
read loop finishes or in a finally block) so handleStop can still abort the
ongoing reader; update the logic around the fetch response handling and reader
loop in NLQChatPanel.jsx to perform cleanup at the end of the stream and not
immediately after fetch returns.
In `@web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py`:
- Around line 129-139: When self.expected_error is true, don't conditionally
skip assertions based on response.content_type/status; instead assert that the
response is JSON and has the expected error shape unconditionally. In the test
block guarded by self.expected_error in test_nlq_chat.py, replace the inner "if
response.status_code == 200 and response.content_type == 'application/json':"
guard with explicit assertions (e.g., assert response.content_type ==
'application/json' and assert response.status_code == 200 or another expected
error code), then json.loads(response.data) and assert data.get('success') is
False and, if hasattr(self, 'error_contains'), assert self.error_contains is in
data.get('errormsg', '') so any non-JSON or malformed error response will cause
the test to fail.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: cb85a712-2d91-4868-96c6-185e0724dba0
📒 Files selected for processing (14)
- docs/en_US/release_notes_9_14.rst
- web/pgadmin/llm/chat.py
- web/pgadmin/llm/client.py
- web/pgadmin/llm/prompts/nlq.py
- web/pgadmin/llm/providers/anthropic.py
- web/pgadmin/llm/providers/docker.py
- web/pgadmin/llm/providers/ollama.py
- web/pgadmin/llm/providers/openai.py
- web/pgadmin/static/js/Theme/dark.js
- web/pgadmin/static/js/Theme/high_contrast.js
- web/pgadmin/static/js/Theme/light.js
- web/pgadmin/tools/sqleditor/__init__.py
- web/pgadmin/tools/sqleditor/static/js/components/sections/NLQChatPanel.jsx
- web/pgadmin/tools/sqleditor/tests/test_nlq_chat.py
🚧 Files skipped from review as they are similar to previous changes (3)
- web/pgadmin/static/js/Theme/light.js
- web/pgadmin/static/js/Theme/dark.js
- web/pgadmin/static/js/Theme/high_contrast.js
…y in the AI Assistant. Fixes pgadmin-org#9734
- Anthropic: preserve separators between text blocks in streaming to match _parse_response() behavior.
- Docker: validate that the API URL points to a loopback address to constrain the request surface.
- Docker/OpenAI: raise LLMClientError on empty streams instead of yielding blank LLMResponse objects, matching non-streaming behavior.
- SQL extraction: strip trailing semicolons before joining blocks to avoid double semicolons in output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
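The loopback restriction described in the second bullet can be sketched like this; `is_loopback_url` is a hypothetical helper, not the actual Docker provider code:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_loopback_url(api_url):
    # Resolve the hostname and accept only loopback addresses, so the
    # provider cannot be pointed at arbitrary network targets.
    host = urlparse(api_url).hostname or ''
    try:
        resolved = socket.gethostbyname(host)
        return ipaddress.ip_address(resolved).is_loopback
    except (socket.gaierror, ValueError):
        return False
```

Because the check resolves first, a hostname that maps to 127.0.0.1 also passes; a stricter variant could pin the resolved address for the actual request.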
…ing.
- Use distinct 3-tuple ('complete', text, messages) for completion events
to avoid ambiguity with ('tool_use', [...]) 2-tuples in chat streaming.
- Pass conversation history from request into chat_with_database_stream()
so follow-up NLQ turns retain context.
- Add re.IGNORECASE to SQL fence regex for case-insensitive matching.
- Render MarkdownContent as block element instead of span to avoid
invalid DOM when response contains paragraphs, lists, or tables.
- Keep stop notice as a separate message instead of appending to partial
markdown, preventing it from being swallowed by open code fences.
- Snapshot streamingIdRef before setMessages in error handler to avoid
race condition where ref is cleared before React executes the updater.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
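A consumer that relies on the distinct tuple shapes described above might look like this; the event names mirror the SSE types in this PR, but the dispatch function itself is hypothetical:

```python
def dispatch_stream_event(item):
    # Strings are incremental text; 3-tuples are completion events;
    # 2-tuples are tool-use notifications. Kind plus tuple length now
    # unambiguously distinguishes 'complete' from 'tool_use'.
    if isinstance(item, str):
        return ('text_delta', item)
    if item[0] == 'complete' and len(item) == 3:
        _, text, messages = item
        return ('complete', text, messages)
    if item[0] == 'tool_use' and len(item) == 2:
        return ('thinking', item[1])
    raise ValueError(f'unknown stream event: {item!r}')
```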
- Fix critical NameError: use self._api_url instead of undefined API_URL in anthropic and openai streaming _process_stream() methods.
- Match sync path auth handling: conditionally set API key headers in streaming paths for both anthropic and openai providers.
- Remove unconditional temperature from openai streaming payload to match sync path compatibility approach.
- Add URL scheme validation to OllamaClient.__init__ to prevent unsafe local/resource access via non-http schemes.
- Guard ollama streaming finalizer: raise error when stream drops without a done frame and no content was received.
- Update chat.py type hint and docstring for 3-tuple completion event.
- Serialize and return filtered conversation history in the complete SSE event so the client can round-trip it on follow-up turns.
- Store and send conversation history from NLQChatPanel, clear on conversation reset.
- Fix JSON-fallback SQL render path: clear content when SQL was extracted without fenced blocks so ChatMessage uses sql-only renderer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adding block scoping to the error case introduced an unmatched brace that prevented the switch statement from closing properly, causing an eslint parse error.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace compaction module imports with inline history deserialization
and filtering since compaction.py is on a different branch.
- Add rstrip(';') to SQL extraction test to match production code,
fixing double-semicolon assertion failure.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The rstrip(';') applied to each block before joining means single
blocks and the last block in multi-block joins no longer have
trailing semicolons. Update expected values to match.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
c880445 to 933a69e
Truncated content from a dropped connection should not be treated as a complete response, even if partial text was streamed. Always raise when final_data is None, matching CodeRabbit's recommendation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
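The rule this commit enforces — never synthesize completion from a dropped NDJSON stream — can be sketched with a minimal parser. The field names follow Ollama's documented NDJSON chunks; the function itself is illustrative:

```python
import json

def consume_ndjson_stream(lines):
    # Accumulate 'response' text chunks and remember the terminal frame;
    # a stream that ends without {'done': true} is treated as an error
    # even if partial text was received.
    content, final = [], None
    for line in lines:
        data = json.loads(line)
        chunk = data.get('response', '')
        if chunk:
            content.append(chunk)
        if data.get('done'):
            final = data
    if final is None:
        raise RuntimeError('stream ended before a terminal done frame')
    return ''.join(content), final.get('done_reason', '')
```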
Fix #9734
Summary by CodeRabbit
New Features
Bug Fixes
Style
Documentation