Skip to content

CTRL+T does not expand or display model thinking for BYOK models #3354

@aosama

Description

@aosama

Describe the bug

When using a BYOK (Bring Your Own Key) model with GitHub Copilot CLI, pressing CTRL+T in the terminal does not display or toggle the model's thinking/reasoning pane. Instead, nothing happens—no output, no state change, and no visible or textual indication that the model is "thinking." This occurs only with BYOK models; with standard models, CTRL+T works as expected.

Affected version

  • Copilot CLI: 1.0.48

Steps to reproduce the behavior

  1. Configure Copilot CLI to use a BYOK (Bring Your Own Key) model — e.g., GLM5.1 from Ollama Cloud.
  2. Start a Copilot CLI session in your terminal.
  3. Run a prompt or wait for the model to enter a "thinking" or reasoning state.
  4. Press CTRL+T.
  5. Observe that there is no indication or output from Copilot CLI and the reasoning/thinking pane does not expand or appear.

Expected behavior

When using any supported model (including BYOK), pressing CTRL+T should expand or display the model's "thinking" or reasoning pane, showing what the model is currently considering or processing (same as with built-in models).

BYOK configuration details

I am using the following environment variables to configure the BYOK connection (via a wrapper script at ~/.local/bin/copilot-ollama-glm):

export COPILOT_PROVIDER_TYPE=openai
export COPILOT_PROVIDER_BASE_URL=https://ollama.com/v1/
export COPILOT_PROVIDER_WIRE_API=responses
export COPILOT_MODEL=glm-5.1:cloud
export COPILOT_PROVIDER_MAX_PROMPT_TOKENS=163840
export COPILOT_PROVIDER_MAX_OUTPUT_TOKENS=32768
export COPILOT_PROVIDER_API_KEY=<Ollama Cloud API key>

Ollama Cloud API compatibility

Ollama Cloud exposes both OpenAI-compatible and Anthropic-compatible API endpoints:

API Endpoint Purpose
OpenAI Chat Completions https://ollama.com/v1/chat/completions Primary API
OpenAI Responses https://ollama.com/v1/responses Newer Responses API (supports reasoning/thinking tokens)
Anthropic Messages https://ollama.com/v1/messages Secondary compatibility layer (for Claude Code integration)

Both specs work with GLM-5.1 on Ollama Cloud. However, the OpenAI Responses API is the correct choice for Copilot CLI BYOK because:

  1. COPILOT_PROVIDER_WIRE_API=responses is the wire format that carries reasoning/thinking data — this is exactly what CTRL+T toggles.
  2. The Anthropic-compatible endpoint is designed for tools like Claude Code, not Copilot CLI.
  3. Copilot CLI's BYOK docs explicitly support openai provider type with responses wire API.

Key observation: The BYOK provider is configured as openai provider type using the OpenAI Responses API wire format (COPILOT_PROVIDER_WIRE_API=responses), not the Anthropic Messages API. This may be relevant — the CTRL+T thinking toggle may only be wired up for native Copilot models and may not handle the OpenAI Responses API's reasoning/thinking fields for BYOK models.

Additional note: GLM-5.1 ignores prompts sent with role: "developer" — prompts must use role: "system" instead. The pi-ollama-cloud integration sets supportsDeveloperRole: false on all models to handle this.

Additional context

Metadata

Metadata

Assignees

No one assigned

    Labels

    area:input-keyboardKeyboard shortcuts, keybindings, copy/paste, clipboard, mouse, and text inputarea:modelsModel selection, availability, switching, rate limits, and model-specific behavior

    Type

    No fields configured for Bug.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions