feat: add Z.AI (Zhipu AI) provider support #74
vinit13792 wants to merge 3 commits into repowise-dev:main
- Add litellm to interactive provider selection menu
- Support LITELLM_BASE_URL for local proxy deployments (no API key required)
- Auto-add openai/ prefix when using api_base for proper LiteLLM routing
- Add dummy API key for local proxies (OpenAI SDK requirement)
- Add validation and tests for litellm provider configuration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
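The prefixing and dummy-key behavior described in this commit could look roughly like the sketch below. It is not the code from the PR: `resolve_litellm_settings`, `LiteLLMSettings`, and the `DUMMY_API_KEY` value are hypothetical names chosen for illustration, and the rule of only checking for an existing `openai/` prefix is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical placeholder value. Its only purpose is to satisfy clients that
# refuse to start without some api_key string; a local proxy ignores it.
DUMMY_API_KEY = "sk-local-proxy"


@dataclass(frozen=True)
class LiteLLMSettings:
    model: str
    api_base: Optional[str]
    api_key: Optional[str]


def resolve_litellm_settings(
    model: str,
    api_base: Optional[str] = None,
    api_key: Optional[str] = None,
) -> LiteLLMSettings:
    """Normalize settings for a LiteLLM call (illustrative sketch only)."""
    if not api_base:
        # No custom endpoint: leave the model name and key untouched.
        return LiteLLMSettings(model=model, api_base=None, api_key=api_key)

    # With a custom api_base, route through the OpenAI-compatible path by
    # adding the "openai/" prefix unless it is already present (assumed rule).
    routed = model if model.startswith("openai/") else f"openai/{model}"

    # Local proxies may not need a key, so fall back to the placeholder.
    return LiteLLMSettings(
        model=routed,
        api_base=api_base,
        api_key=api_key or DUMMY_API_KEY,
    )
```

Keeping the normalization in one pure function makes the validation cases mentioned in the commit easy to cover without starting a proxy.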
… false positives

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add first-class support for Z.AI with OpenAI-compatible API.

- New ZAIProvider with thinking disabled by default for GLM-5 family
- Plan selection: 'coding' (subscription) or 'general' (pay-as-you-go)
- Environment variables: ZAI_API_KEY, ZAI_PLAN, ZAI_BASE_URL, ZAI_THINKING
- Rate limit defaults and auto-detection in CLI helpers

Closes repowise-dev#68
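As a rough illustration of how the four environment variables might be resolved into provider settings, here is a minimal sketch. It is not the `ZAIProvider` code from this PR. The names `load_zai_settings` and `ZAISettings`, the placeholder URLs, the default plan, and the accepted truthy strings are all assumptions. Only the variable names, the two plan values, and "thinking disabled by default" come from the commit message.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Placeholder endpoints, NOT the real Z.AI URLs (hypothetical values).
PLAN_BASE_URLS = {
    "coding": "https://zai.example.invalid/coding",
    "general": "https://zai.example.invalid/general",
}
DEFAULT_PLAN = "coding"  # assumption: the source does not state a default


@dataclass(frozen=True)
class ZAISettings:
    api_key: str
    plan: str
    base_url: str
    thinking: bool


def _parse_bool(raw: Optional[str], default: bool) -> bool:
    """Interpret common truthy strings; unset or empty means the default."""
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_zai_settings(env: Optional[Mapping[str, str]] = None) -> ZAISettings:
    """Build settings from ZAI_* variables (illustrative sketch only)."""
    env = os.environ if env is None else env

    api_key = env.get("ZAI_API_KEY", "").strip()
    if not api_key:
        raise ValueError("ZAI_API_KEY is required")

    plan = env.get("ZAI_PLAN", DEFAULT_PLAN).strip().lower()
    if plan not in PLAN_BASE_URLS:
        raise ValueError(f"ZAI_PLAN must be one of {sorted(PLAN_BASE_URLS)}")

    # An explicit ZAI_BASE_URL overrides the per-plan endpoint.
    base_url = env.get("ZAI_BASE_URL", "").strip() or PLAN_BASE_URLS[plan]

    # Thinking stays off unless ZAI_THINKING explicitly enables it.
    thinking = _parse_bool(env.get("ZAI_THINKING"), default=False)

    return ZAISettings(api_key=api_key, plan=plan, base_url=base_url, thinking=thinking)
```

Passing the environment as a mapping keeps the function testable without mutating `os.environ`.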
Thanks for picking this up. I filed #68 and have been testing against the Z.AI API directly -- a few observations.

Rate limits are unverified. The 60 RPM / 150K TPM defaults are copied from the litellm entry. I'm currently working with Z.AI to get actual per-plan limits -- their rate limiting behavior under concurrent load is one of the open questions blocking my own PR attempt. These defaults may be fine as a placeholder, but worth a …

Thinking toggle is Z.AI-specific. @RaghavChamadiya mentioned wanting a generic mechanism. From my testing across providers, every one handles this differently (Z.AI uses …

I'm still dialing in some Z.AI-specific behavior (rate limits under concurrency, thinking toggle edge cases) and will share data as it comes in.
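For readers unfamiliar with what a combined RPM / TPM default implies on the client side, the sketch below shows one way such limits might be enforced with a sliding window. It is an illustration, not repowise's rate limiter: the class name and its API are hypothetical, the 60 / 150,000 defaults simply mirror the placeholder numbers discussed above, and it does not model server-side behavior under concurrent load.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class SlidingWindowLimiter:
    """Track requests and tokens over a rolling window (illustrative sketch)."""

    def __init__(
        self,
        rpm: int = 60,
        tpm: int = 150_000,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rpm = rpm
        self.tpm = tpm
        self.window = window
        self._clock = clock
        # Each entry is (timestamp, tokens) for one admitted request.
        self._events: Deque[Tuple[float, int]] = deque()
        self._tokens = 0

    def _evict(self, now: float) -> None:
        # Drop entries that have aged out of the window.
        while self._events and now - self._events[0][0] >= self.window:
            _, tokens = self._events.popleft()
            self._tokens -= tokens

    def try_acquire(self, tokens: int) -> bool:
        """Admit the request if both budgets allow it; never blocks."""
        now = self._clock()
        self._evict(now)
        if len(self._events) >= self.rpm:
            return False
        if self._tokens + tokens > self.tpm:
            return False
        self._events.append((now, tokens))
        self._tokens += tokens
        return True
```

A caller would typically retry after a short sleep when `try_acquire` returns `False`. Note that a single request larger than `tpm` can never be admitted, and the class is not thread-safe as written.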
Quick update since my last comment -- I heard back from Z.AI support with specifics on concurrency limits per tier and have submitted a follow-up PR (#80) that implements tier-aware rate limiting. Key findings from Z.AI support:
PR #80 includes …

Happy to split out just the tier changes if the maintainers prefer it as a stacked PR on top of this one instead.
Summary
- Plan selection: `coding` (subscription) or `general` (pay-as-you-go)
- Environment variables: `ZAI_API_KEY`, `ZAI_PLAN`, `ZAI_BASE_URL`, `ZAI_THINKING`

Usage
Test Plan
Closes #68