feat(providers): add "local/" routing prefix for OpenAI-compatible local servers by np6126 · Pull Request #3044 · ultraworkers/claw-code

np6126 · 2026-05-17T08:26:55Z

Summary

A common deployment runs an OpenAI-compatible inference server locally (Ollama, LM Studio, vLLM, llama.cpp) and points claw-code's OpenAI provider client at it via OPENAI_BASE_URL.

The obvious model id formats actively misroute today:

qwen/qwen3:14b or qwen3-coder → metadata_for_model maps the qwen family to DashScope (DASHSCOPE_API_KEY, dashscope.aliyuncs.com). A user serving Qwen3 locally on Ollama is silently routed to Alibaba's cloud.
kimi/kimi-k2.5 → same DashScope routing.
grok-* → routes to xAI.

The existing openai/ prefix works as a workaround, but it conflates "this is my local Qwen3" with "the OpenAI API itself". It also forces a choice: a user who wants the existing OpenRouter behaviour (slug preserved on the wire for openai/...) cannot also use openai/ to address a local server.

This PR adds local/ as an explicit routing prefix that says "this is a local OpenAI-compatible server, route as OpenAI client, strip the prefix on the wire". Routing semantics are identical to openai/: OpenAi provider, OPENAI_API_KEY (typically an unused placeholder for local servers), OPENAI_BASE_URL.

Diff

// providers/mod.rs
-if canonical.starts_with("openai/") || canonical.starts_with("gpt-") {
+if canonical.starts_with("openai/")
+    || canonical.starts_with("local/")
+    || canonical.starts_with("gpt-")
+{
     return Some(ProviderMetadata { provider: ProviderKind::OpenAi, ... });
 }

// providers/openai_compat.rs (wire_model_for_base_url)
-if matches!(lowered_prefix.as_str(), "xai" | "grok" | "qwen" | "kimi") {
+if matches!(lowered_prefix.as_str(), "local" | "xai" | "grok" | "qwen" | "kimi") {
     return Cow::Borrowed(&model[pos + 1..]);
 }

Two places only — both pure additions.
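To make the routing half of the diff concrete, here is a minimal standalone sketch of the decision it touches. It is not the real metadata_for_model: `provider_for`, the reduced `ProviderKind` enum, and the family checks are simplified stand-ins assumed for illustration, and "grok-3" in the usage note is only a placeholder id.

```rust
/// Reduced stand-in for the provider enum; the real `ProviderKind` has more to it.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum ProviderKind {
    OpenAi,
    DashScope,
    Xai,
}

/// Hypothetical, simplified version of the routing decision.
/// Only the prefix/family matching is modelled; auth env and base URL are omitted.
fn provider_for(model: &str) -> Option<ProviderKind> {
    let canonical = model.trim().to_ascii_lowercase();

    // Explicit routing prefixes: "local/" now sits alongside "openai/" and "gpt-".
    if canonical.starts_with("openai/")
        || canonical.starts_with("local/")
        || canonical.starts_with("gpt-")
    {
        return Some(ProviderKind::OpenAi);
    }

    // Family heuristics (assumed shape): these are what send a bare
    // "qwen/..." or "kimi/..." id to DashScope and "grok-..." to xAI.
    if canonical.starts_with("qwen") || canonical.starts_with("kimi") {
        return Some(ProviderKind::DashScope);
    }
    if canonical.starts_with("grok") {
        return Some(ProviderKind::Xai);
    }

    None
}
```

Under these assumptions, `provider_for("local/qwen3:14b")` selects `OpenAi`, while `provider_for("qwen/qwen3:14b")` still selects `DashScope` and `provider_for("grok-3")` selects `Xai`, which is the misrouting the prefix is meant to sidestep.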

No change to strip_routing_prefix — it's #[allow(dead_code)] and only referenced from its own unit tests; adding local there would be cosmetic.

No change to the OpenAI gateway slug preservation logic used by OpenRouter and similar gateways with a non-default OPENAI_BASE_URL — openai/... behaviour stays exactly as upstream.

Example

export OPENAI_BASE_URL=http://localhost:11434/v1
export OPENAI_API_KEY=ollama-placeholder
claw --model local/qwen3:14b "..."

Wire request: POST $OPENAI_BASE_URL/chat/completions with model=qwen3:14b. No misrouting to DashScope.
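The prefix stripping behind that wire request can be sketched in isolation as follows. This is a reduced illustration and not the real wire_model_for_base_url: the function name `wire_model` is hypothetical, and the gateway slug-preservation path for openai/... is deliberately left out, so every other id is passed through unchanged here.

```rust
use std::borrow::Cow;

/// Hypothetical, simplified version of the prefix stripping.
/// Returns `Cow` only to mirror the return type visible in the diff.
fn wire_model(model: &str) -> Cow<'_, str> {
    if let Some(pos) = model.find('/') {
        // Compare the prefix case-insensitively, as the diff's `lowered_prefix` suggests.
        let lowered_prefix = model[..pos].to_ascii_lowercase();
        if matches!(
            lowered_prefix.as_str(),
            "local" | "xai" | "grok" | "qwen" | "kimi"
        ) {
            // Routing-only prefix: send the bare model id to the server.
            return Cow::Borrowed(&model[pos + 1..]);
        }
    }
    // Anything else is passed through untouched in this sketch.
    Cow::Borrowed(model)
}
```

With this sketch, `wire_model("local/qwen3:14b")` yields `qwen3:14b`, matching the model value in the request described above, and an id with no recognised prefix such as `qwen3-coder` is returned as-is.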

Test plan

cargo check --workspace clean
Existing provider tests: cargo test -p api
Manual: route local/qwen3:14b against a local Ollama, verify the request goes to the local server with bare model id on the wire

Companion discussion

A separate issue (#3045) covers two related open questions: <think>...</think> block filtering in streamed output, and whether the openai/-slug preservation logic should remain non-configurable once local/ is available. Posted separately because both involve tradeoffs upstream may want to weigh in on before code lands.

…cal servers

A common use case is running an OpenAI-compatible inference server locally (Ollama, LM Studio, vLLM, llama.cpp) and routing to it through the OpenAI provider client. The existing "openai/" prefix would work, but mentally conflates "local Ollama / vLLM" with "the OpenAI API itself", and the more obvious model id formats actively misroute:

- "qwen/qwen3:14b" or "qwen3-coder" → metadata_for_model maps the "qwen" family to DashScope (DASHSCOPE_API_KEY, dashscope.aliyuncs.com). A user serving Qwen3 locally on Ollama gets routed to Alibaba's cloud instead of their local server.
- "kimi/kimi-k2.5" → same DashScope routing.
- "grok-..." → routes to xAI.

None of these can express "this is a local copy of that family, route as OpenAI-compatible to my local server" without forcibly setting OPENAI_BASE_URL plus the "openai/" prefix, which is unintuitive.

The "local/" prefix solves it: operators express intent ("this is a local inference server, route as OpenAI client") without naming collision. Routing semantics are identical to "openai/": OpenAi provider, OPENAI_API_KEY auth env (typically an unused placeholder for local servers), OPENAI_BASE_URL.

Two places updated:

- metadata_for_model: recognise "local/" alongside "openai/"
- wire_model_for_base_url: strip "local/" prefix on the wire

No change to strip_routing_prefix (#[allow(dead_code)], tests only) and no change to the OpenAI gateway slug preservation logic (used by OpenRouter and similar gateways with non-default OPENAI_BASE_URL); that behaviour for "openai/" stays exactly as upstream.

…am-drop

The two patches we ship in bootc/patches/ each contained an experimental mix of changes that aren't all needed in this setup. Splitting them into intent-aligned files makes it possible to drop them individually as upstream merges happen, without having to surgically extract pieces.

- claw-fix-stream-newlines.patch: now contains only the trailing-newline restoration in MarkdownStreamState::push. Matches the change in ultraworkers/claw-code#3043 1:1, so the file drops out the moment that PR lands and CLAW_CODE_REF is bumped.
- claw-fix-openai-prefix-strip.patch: now contains only the "local/" routing-prefix additions in metadata_for_model and wire_model_for_base_url. The dead-code edit to strip_routing_prefix and the OpenRouter slug-preservation removal are gone; both were experimental and unneeded for this setup (we use "local/" prefix, not "openai/"). Matches ultraworkers/claw-code#3044 1:1, drops out when that PR lands.
- claw-add-think-block-filter.patch (new): isolates the <think>...</think> block filtering for thinking models (Qwen3 et al). Applied on top of the newlines patch since both touch MarkdownStreamState::push. Stays as a local patch until ultraworkers/claw-code#3045 resolves; the filter remains usable as a standalone patch against post-#3043 upstream because its context lines reference the newlines-applied state.

Containerfile applies the three patches in order: newlines → think → local-prefix.

Verified: all three apply cleanly to the pinned CLAW_CODE_REF and the resulting tree compiles (cargo check --workspace).

shakoorshkh · 2026-05-17T09:04:24Z

Model id formats like qwen/qwen3:14b silently misroute to DashScope when the user is running a local Ollama server. No error — wrong destination.

Adds a new prefix convention (local/) that users must learn. If OPENAI_BASE_URL is not set, local/ routes to OpenAI with a stripped model id, which will likely 404.

Explicit prefix over inference. Users targeting local servers now carry that intent in the model string rather than relying on routing heuristics to guess correctly.

np6126 mentioned this pull request May 17, 2026

Two open behaviour questions for local-server / streaming UX (think-block filter, openai/-slug preservation) #3045

Open

np6126 wants to merge 1 commit into ultraworkers:main from np6126:feat/local-routing-prefix
