feat(assistant): three-layer anti-abuse armor for AI endpoints: frontend constraint + Upstash rate limiting + upstream error fallback #295
Merged
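The free tier of GLM-4.6V-Flash has very low concurrency (≈5), so a single user with a few open tabs can take down
AI for the whole site. This PR adds layered protection: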
**L1 Frontend hard constraint (thread.tsx)**
- Input box maxLength={4000}: blocks token bombs at the source
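**L2 Server-side rate limit (lib/rate-limit.ts + /api/chat + /api/suggestions)**
- Add Upstash Redis (serverless Redis over HTTP)
- Per-IP sliding window:
  - Text only: 10 req / 60s
  - With images: 5 req / 60s (image token cost is 5~10×)
  - Daily cap: 100 req / 24h / IP
- Both APIs share the same per-IP quota pool (both hit the LLM)
- 429 responses carry the standard Retry-After / X-RateLimit-* headers
- When the Upstash env vars are not configured, degrade to "allow + warn": local development needs zero
  configuration, and a missing production config does not take the endpoints down entirely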
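
To make the tiers above concrete, here is a minimal in-memory sketch of a per-IP sliding-window check. It is not the PR's implementation, which uses Upstash Redis so that counters are shared across serverless instances. The names `SlidingWindowLog`, `Verdict` and `rateLimitHeaders`, and the `limitChat(ip, hasImage, now)` signature, are assumptions made for illustration.

```typescript
// In-memory stand-in for the Upstash-backed limiter (illustration only).
type Tier = { limit: number; windowMs: number };
type Verdict = { success: boolean; limit: number; remaining: number; retryAfterSec: number };

// Limits taken from the PR description.
const TEXT: Tier = { limit: 10, windowMs: 60000 };
const IMAGE: Tier = { limit: 5, windowMs: 60000 };
const DAILY: Tier = { limit: 100, windowMs: 24 * 60 * 60 * 1000 };

class SlidingWindowLog {
  private hits = new Map<string, number[]>();

  check(key: string, tier: Tier, now: number): Verdict {
    // Keep only timestamps that are still inside the window ending at `now`.
    const kept = (this.hits.get(key) ?? []).filter((t) => now - t < tier.windowMs);
    this.hits.set(key, kept);
    if (kept.length >= tier.limit) {
      // The oldest hit leaves the window at oldest + windowMs.
      const oldest = kept[0] ?? now;
      const retryAfterSec = Math.ceil((oldest + tier.windowMs - now) / 1000);
      return { success: false, limit: tier.limit, remaining: 0, retryAfterSec };
    }
    kept.push(now);
    return { success: true, limit: tier.limit, remaining: tier.limit - kept.length, retryAfterSec: 0 };
  }
}

const limiter = new SlidingWindowLog();

// Hypothetical signature: the PR's limitChat takes the request itself.
// Burst tier is checked first so rejected bursts do not consume the daily budget.
function limitChat(ip: string, hasImage: boolean, now: number): Verdict {
  const burst = limiter.check(`${hasImage ? "image" : "text"}:${ip}`, hasImage ? IMAGE : TEXT, now);
  if (!burst.success) return burst;
  const daily = limiter.check(`daily:${ip}`, DAILY, now);
  return daily.success ? burst : daily;
}

// Headers a 429 response could carry, derived from a failed verdict.
function rateLimitHeaders(v: Verdict): Record<string, string> {
  return {
    "Retry-After": String(v.retryAfterSec),
    "X-RateLimit-Limit": String(v.limit),
    "X-RateLimit-Remaining": String(v.remaining),
  };
}
```

Because both endpoints would call the same function with the same IP key, they draw from one pool, which matches the shared-quota behavior described above.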
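**L3 GLM upstream error fallback (/api/chat + DocsAssistant.tsx)**
- mapUpstreamError recognizes Zhipu business codes:
  - 1302 / HTTP 429 → "The AI service is overloaded, queuing"
  - 1113 → "Free quota used up; switch to your own key or come back tomorrow"
  - 1001/1002/1003 → "Key configuration error; the administrator has been notified"
- deriveAssistantError now passes the server's friendly message (Chinese in the UI) through first, so it is no
  longer overwritten by the default English "provider is rate limiting"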
**Deployment notes**
Vercel env needs UPSTASH_REDIS_REST_URL / UPSTASH_REDIS_REST_TOKEN added;
the recommended route is Vercel Integrations → Upstash one-click binding.
Refs #285
Pull request overview
This PR introduces a three-layer anti-abuse “armor” for AI endpoints to protect low-concurrency/free-tier LLM usage: a frontend input cap, server-side distributed rate limiting (Upstash), and upstream (GLM) error mapping for friendlier client messages.
Changes:
- Add Upstash-based per-IP rate limiting utilities and apply them to `/api/chat` and `/api/suggestions`.
- Add upstream error mapping in chat API and prioritize server-provided friendly messages in the Docs assistant UI.
- Add a frontend hard `maxLength` limit to the assistant message composer and document new env vars.
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| package.json | Adds @upstash/ratelimit + @upstash/redis dependencies. |
| pnpm-lock.yaml | Locks new Upstash-related dependencies. |
| lib/rate-limit.ts | New shared rate limit utility (limitChat, rateLimitResponse). |
| app/api/chat/route.ts | Applies rate limiting and adds GLM upstream error mapping. |
| app/api/suggestions/route.ts | Applies rate limiting to suggestions endpoint. |
| app/components/DocsAssistant.tsx | Prefers server-friendly messages for 429/5xx errors. |
| app/components/assistant-ui/thread.tsx | Adds maxLength={4000} to message input. |
| .env.sample | Documents required Upstash env vars. |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
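Copilot raised 6 comments and CodeQL raised 2 regex alerts; all are fixed:

**lib/rate-limit.ts**
- Corrected the API used in the usage example in the file header doc (CR #1)
- getClientIp hardened against spoofing (CR #2, **security fix**): prefer x-real-ip (CDNs such as Vercel write a trusted value there); when falling back to XFF, take the **last** entry rather than the first, so a client cannot bypass the rate limit by forging `x-forwarded-for: fakeip`
- The missing-Upstash warn now uses a module-scoped flag and is printed only once for the lifetime of the instance; it no longer depends on NODE_ENV, since dev needs to see the hint too (CR #3)

**app/api/chat/route.ts**
- The POST entry pre-reads the body to determine hasImage; when true, the strict 5 req/60s limit applies. A failed pre-read does not block the request, preserving the existing fault tolerance (CR #4)
- New messagesHaveImage helper: recognizes type=image / image_url / file+image media
- mapUpstreamError no longer concatenates err.stack into the matched text: a `:429:` line number in the stack would falsely trigger the rate-limited classification (CR #5, **real bug**)
- JSON.stringify is wrapped in try/catch with a String(err) fallback, so a circular reference does not throw again (CR #6)
- Every `.*` in the business-code regexes is replaced with `[^\s]{0,10}?`, bounding backtracking depth to prevent ReDoS (CodeQL polynomial regex alert)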
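
The IP extraction and error-mapping fixes are easy to get subtly wrong, so here is a hedged sketch of both. It is not the PR's code. The `HeaderBag` interface, the `"unknown"` fallback, the `{ status }` field on upstream errors, the `code` strings and the English messages are all assumptions for illustration; only the header precedence, the business codes and the bounded regex come from the notes above.

```typescript
// Minimal header interface so the sketch does not depend on a runtime's Headers type.
interface HeaderBag { get(name: string): string | null }

// Prefer x-real-ip; otherwise take the LAST x-forwarded-for hop, which the
// nearest proxy appended, rather than the first, which the client can forge.
function getClientIp(headers: HeaderBag): string {
  const realIp = headers.get("x-real-ip")?.trim();
  if (realIp) return realIp;
  const hops = (headers.get("x-forwarded-for") ?? "").split(",").map((s) => s.trim()).filter(Boolean);
  return hops[hops.length - 1] ?? "unknown"; // "unknown" fallback is an assumption
}

type FriendlyError = { error: string; code: string };

// Flatten an unknown error to text WITHOUT err.stack: a ":429:" line number
// in a stack trace must not be mistaken for a rate-limit signal.
function errorText(err: unknown): string {
  if (err instanceof Error) return err.message;
  if (typeof err === "string") return err;
  try {
    const json: string | undefined = JSON.stringify(err);
    return json ?? String(err);
  } catch {
    return String(err); // e.g. circular references
  }
}

// Bounded gap between "code" and the number keeps regex backtracking limited.
function hasBusinessCode(text: string, codes: string[]): boolean {
  return new RegExp(`code[^\\s]{0,10}?(?:${codes.join("|")})\\b`).test(text);
}

// Assumed shape: some client libraries expose the HTTP status as a numeric `status` field.
function httpStatus(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null && "status" in err) {
    const status = (err as { status: unknown }).status;
    return typeof status === "number" ? status : undefined;
  }
  return undefined;
}

function mapUpstreamError(err: unknown): FriendlyError | null {
  const text = errorText(err);
  if (httpStatus(err) === 429 || hasBusinessCode(text, ["1302"])) {
    return { error: "The AI service is overloaded, queuing", code: "rate_limited" };
  }
  if (hasBusinessCode(text, ["1113"])) {
    return { error: "Free quota used up; switch to your own key or come back tomorrow", code: "quota_exhausted" };
  }
  if (hasBusinessCode(text, ["1001", "1002", "1003"])) {
    return { error: "Key configuration error; the administrator has been notified", code: "auth_error" };
  }
  return null; // not a recognized upstream failure
}
```

Taking the last XFF hop is only safe when a trusted proxy sits directly in front of the app; behind additional proxies the trusted position may differ.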
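Vercel's Upstash integration generates different variable names depending on the prefix setting:
- no prefix → KV_REST_API_URL / _TOKEN
- prefix=X → X_KV_REST_API_URL / _TOKEN (the prefix is prepended, not substituted)

Copying manually from the Upstash console gives UPSTASH_REDIS_REST_URL / _TOKEN instead.

A new firstEnv() helper probes the names in priority order and uses the first one it finds:
1. UPSTASH_REDIS_REST_URL (manual configuration)
2. UPSTASH_REDIS_REST_KV_REST_API_URL (Vercel + custom prefix)
3. KV_REST_API_URL (Vercel + no prefix)

.env.sample documents all three naming schemes.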
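
A possible shape for that probe, as a sketch: the signature of `firstEnv` below (taking the env object and a candidate list) and the `URL_CANDIDATES` name are assumptions, since the commit message only gives the helper's name and the priority order. Passing the env object explicitly keeps the sketch testable; real code would presumably pass `process.env`.

```typescript
// Return the first candidate variable that is set to a non-blank value.
function firstEnv(env: Record<string, string | undefined>, names: string[]): string | undefined {
  for (const name of names) {
    const value = env[name]?.trim();
    if (value) return value;
  }
  return undefined;
}

// Priority order from the commit message: manual config first, then Vercel-generated names.
const URL_CANDIDATES = [
  "UPSTASH_REDIS_REST_URL",
  "UPSTASH_REDIS_REST_KV_REST_API_URL",
  "KV_REST_API_URL",
];
```

Treating blank values as unset is a design choice of this sketch, so that an empty placeholder copied from .env.sample does not shadow a working Vercel-generated variable.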
This was referenced Apr 16, 2026
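Motivation
The free model GLM-4.6V-Flash has a concurrency cap of ≈ 5, so a single user with a few open tabs can take down AI for the whole site (Zhipu's official docs do not clearly disclose the RPM or daily quota, but GLM-4-Flash has a concurrency cap of 5 and 4.6V is most likely in the same class). Anti-abuse protection goes in before the image input feature is built.
Changes
L1 Frontend hard constraint
- app/components/assistant-ui/thread.tsx: maxLength={4000}, native HTML, blocks token bombs at the source
L2 Server-side rate limit
- lib/rate-limit.ts (new): @upstash/ratelimit + @upstash/redis; Retry-After / X-RateLimit-* headers
- app/api/chat/route.ts and app/api/suggestions/route.ts: limitChat(req); a hit returns 429 immediately
L3 GLM upstream error fallback
- app/api/chat/route.ts: adds mapUpstreamError() to recognize Zhipu business codes and return a structured { error, code }
- app/components/DocsAssistant.tsx: deriveAssistantError
Two Vercel env variables need to be added: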
```
UPSTASH_REDIS_REST_URL=
UPSTASH_REDIS_REST_TOKEN=
```
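Recommended: Vercel Project → Integrations → Upstash → one-click binding; the free tier's 10K commands/day is enough.
If unconfigured, a warn is logged but requests are not blocked: protection is lost, but nothing crashes.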
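
A sketch of that fail-open behavior, under stated assumptions: `limitOrAllow`, the `Limiter` type and the warning text are hypothetical, and the limiter is synchronous here for brevity, whereas a Redis-backed one would be async. The module-scoped flag mirrors the "warn once per instance" behavior described in the review fixes.

```typescript
type LimitResult = { success: boolean };
type Limiter = (key: string) => LimitResult;

// Module-scoped: the warning is emitted at most once per instance lifetime.
let warnedMissingUpstash = false;

// `limiter` is null when no Upstash credentials were found at startup.
function limitOrAllow(limiter: Limiter | null, key: string, warn: (msg: string) => void): LimitResult {
  if (limiter === null) {
    if (!warnedMissingUpstash) {
      warnedMissingUpstash = true;
      warn("[rate-limit] Upstash env vars missing; requests are NOT rate limited");
    }
    return { success: true }; // fail open: never block because of missing config
  }
  return limiter(key);
}
```

The trade-off is explicit: a misconfigured production deployment stays up but unprotected, so the single warning line is the only signal that the limiter is off.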
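Testing
Follow-up
The image input feature (task #7) will get its own PR after this one merges, and will reuse the
hasImage rate-limit dimension introduced here.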
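
For reference, the hasImage dimension could be derived from the request body roughly as follows. This is a sketch, not the PR's messagesHaveImage: the message shape (a `content` array of parts) and the `mediaType` field name are assumptions, and only the three recognized kinds (image, image_url, file with image media) come from the review notes.

```typescript
// Assumed part shape; field names vary between SDK versions.
type Part = { type?: unknown; mediaType?: unknown };

function isImagePart(part: Part): boolean {
  if (part.type === "image" || part.type === "image_url") return true;
  // A generic file part counts only when its media type is an image.
  return part.type === "file" && typeof part.mediaType === "string" && part.mediaType.startsWith("image/");
}

// Defensive on purpose: the body is untrusted JSON, so anything malformed is "no image".
function messagesHaveImage(messages: unknown): boolean {
  if (!Array.isArray(messages)) return false;
  return messages.some((m) => {
    const parts = (m as { content?: unknown } | null)?.content;
    return Array.isArray(parts) && parts.some((p) => typeof p === "object" && p !== null && isImagePart(p as Part));
  });
}
```

Returning false on malformed input matches the stated behavior that a failed pre-read must not block the request; the cost is that such a request is counted against the looser text-only limit.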