update by limweb · Pull Request #199 · dwgx/WindsurfAPI

limweb · 2026-06-15T09:31:58Z

What changed

Why

Testing

Checklist

Code style matches existing files
No new npm dependencies (project is zero-dep)
If touching the LS binary protocol, document the field-number source in the PR description
If touching the dashboard UI, use App.confirm / App.prompt, not native browser alert/confirm

- Mark 9 enum-only models as deprecated (gpt-4o-mini, deepseek-v3/r1, grok-3-mini, qwen-3, etc.): upstream LS no longer routes them
- /v1/models hides deprecated models
- FREE_TIER_BASE updated to gemini-2.5-flash only
- Connect-RPC tested on VPS: InitializePanelState OK but StartCascade returns empty, so LS only partially supports the connect protocol. Default stays grpc.
- PROBE_CANARIES reduced to gemini-only: Claude canary probes were burning Trial account quotas (2-3 req/hr per model)
- Version bump to 2.0.0

…ross dashboard
- add tooltip keys: probeTitle, copyTitle, refreshTitle, editModels
- add time unit keys: day, hour, minute, second
- add tier filter label and empty state messages (noRequestData, noAccountRequestData, noData)
- add batch import and proxy configuration feedback messages
- add login result section keys (successTitle, addedToPool, fields.*)
- expand cascadeReuse.desc with detailed explanation of cache behavior and

…robe/credits operations
- check-i18n.js: add I18n.t() call extraction and validation, check both string literals and variable expressions, deduplicate and verify all keys exist in locale files
- i18n: add button keys

updateCapability() ran inferTier() unconditionally after every request, which overwrites the authoritative tier set by GetUserStatus / refreshCredits. When a Pro or 14-day Trial account called a non-premium model such as gemini-2.5-flash or gpt-4o-mini, inferTier() saw only that ok=true and returned 'free', so the dashboard showed the account as FREE right after a successful request.

probeAccount() already worked around this by restoring status.tierName at the end, but the chat handler path (handlers/chat.js) had no such restore, so every real API call silently demoted the tier.

Skip the inferTier() fallback when an authoritative source is present (userStatusLastFetched > 0, or tierManual=true). The fallback still runs for brand-new accounts that have never been probed.
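The guard described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the account shape, the `capabilities` map, the `premium` flag, and the simplified `inferTier()` body are assumptions made for the example; only the names `updateCapability`, `inferTier`, `userStatusLastFetched`, and `tierManual` come from the commit message.

```javascript
// Hypothetical, simplified stand-in for inferTier(): it guesses a tier from
// request outcomes alone, so a successful non-premium call looks like 'free'.
function inferTier(capabilities) {
  const premiumOk = Object.values(capabilities).some((cap) => cap.ok && cap.premium);
  return premiumOk ? 'pro' : 'free';
}

// True when the tier was set by an authoritative source
// (a GetUserStatus fetch or a manual override).
function hasAuthoritativeTier(account) {
  return account.userStatusLastFetched > 0 || account.tierManual === true;
}

function updateCapability(account, model, ok, premium = false) {
  account.capabilities[model] = { ok, premium };
  // Only fall back to inference for accounts that were never probed.
  if (!hasAuthoritativeTier(account)) {
    account.tier = inferTier(account.capabilities);
  }
  return account;
}

// A Pro account with a fetched user status keeps its tier...
const pro = { tier: 'pro', userStatusLastFetched: 1750000000000, tierManual: false, capabilities: {} };
updateCapability(pro, 'gemini-2.5-flash', true);

// ...while a brand-new account still gets the inferred tier.
const fresh = { tier: 'unknown', userStatusLastFetched: 0, tierManual: false, capabilities: {} };
updateCapability(fresh, 'gemini-2.5-flash', true);

console.log(pro.tier, fresh.tier); // prints "pro free"
```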

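Fixes a tier-downgrade bug that affected real user experience. updateCapability called inferTier after every chat, which demoted Pro/Trial accounts to free; accounts that have a userStatus now skip the inference. Thanks to @aict666 for the precise diagnosis.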

- fix(dwgx#47): remove if(emulateTools) guards. Tool calls were silently dropped when the condition was false; they are now always preserved. Fixes Claude Code tool calls appearing as raw text.
- feat: real structured output: system message + user hint + json_schema injection + response fence stripping for response_format support
- feat: zero-dep PDF text extraction (src/pdf.js): FlateDecode inflate + Tj/TJ operator parsing. Handles Anthropic type:"document" blocks. Scanned PDFs get a clear error message.
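As an illustration of the "response fence stripping" step for structured output, here is a minimal sketch. The function name `stripJsonFence` and its exact behavior are assumptions for this example; the commit message only states that fences are stripped from responses.

```javascript
// Three backticks, built at runtime so this example stays readable when quoted.
const FENCE = '`'.repeat(3);

// Hypothetical helper: if the model wrapped its JSON in a Markdown code fence,
// return only the fenced body; otherwise return the trimmed text unchanged.
function stripJsonFence(text) {
  const trimmed = String(text).trim();
  const fenced =
    trimmed.length >= FENCE.length * 2 &&
    trimmed.startsWith(FENCE) &&
    trimmed.endsWith(FENCE);
  if (!fenced) return trimmed;
  // Drop the opening fence line (which may carry a language tag such as "json").
  const firstNewline = trimmed.indexOf('\n');
  if (firstNewline === -1) return trimmed;
  return trimmed.slice(firstNewline + 1, trimmed.length - FENCE.length).trim();
}

const raw = FENCE + 'json\n{"ok": true}\n' + FENCE;
console.log(JSON.parse(stripJsonFence(raw)).ok); // prints true
```

Unfenced input passes through untouched, so the helper is safe to apply to every response when response_format is requested.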

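0a00 tracked down the root cause: 31 trial accounts had 100% quota remaining, yet all of them hit a 26-29 minute per-(account, model) sliding-window rate limit on claude-opus-4-7-max. Upstream rate-limits high reasoning-effort variants such as -max / -xhigh separately, on a dimension independent of the daily/weekly quota.

Add pickRateLimitFallback, which derives the next-lower tier of the same base model from the model-name suffix (low → medium → high → xhigh → max; 1m → bare; -thinking is skipped to avoid silently downgrading user-visible behavior). Non-stream and stream 429/503 error responses now carry fallback_model and remediation fields that tell the client which sibling it can switch to. Tests 860 → 871 (+11).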
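The suffix ladder can be sketched as below. This is an assumption-laden illustration rather than the project's implementation: it assumes effort variants are encoded as a trailing `-low` … `-max` suffix and the 1m variant as a trailing `-1m`, and it returns null when no lower sibling exists.

```javascript
// Effort ladder from the commit message, lowest to highest.
const EFFORT_LADDER = ['low', 'medium', 'high', 'xhigh', 'max'];

// Sketch: derive the next-lower sibling of the same base model, or null.
function pickRateLimitFallback(model) {
  // -thinking variants are skipped so user-visible behavior is never
  // silently downgraded.
  if (model.endsWith('-thinking')) return null;

  // 1m → bare: drop the (assumed) "-1m" suffix.
  if (model.endsWith('-1m')) return model.slice(0, -'-1m'.length);

  // Start at index 1: the lowest rung has nothing below it.
  for (let i = 1; i < EFFORT_LADDER.length; i++) {
    const suffix = '-' + EFFORT_LADDER[i];
    if (model.endsWith(suffix)) {
      return model.slice(0, -suffix.length) + '-' + EFFORT_LADDER[i - 1];
    }
  }
  return null;
}

console.log(pickRateLimitFallback('claude-opus-4-7-max')); // prints "claude-opus-4-7-xhigh"
```

Matching on the leading hyphen keeps `-xhigh` from being mistaken for `-high`.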

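…rphan cleanup

dwgx#126/dwgx#128: automatically retry with the fallback model when a rate limit is hit. handleChatCompletions is split into an outer wrapper: if the request is non-stream, the error is rate_limit, a fallback_model is present, and the env switch is not off, it rewrites body.model and resends once. The response's body.model is restored to the original name, with served_model and fallback_reason added as side fields. The stream path is not wired in (chunks may already have been emitted). Disable with WINDSURFAPI_VARIANT_FALLBACK_ON_RATE_LIMIT=0.

dwgx#127: the dashboard one-click update left orphan LS processes behind. Fixed at two layers: self-update calls stopLanguageServer for a graceful shutdown before exiting, and early in startup cleanupOrphanLanguageServers runs a ps scan and, as a backstop, kills langserver_linux_x64 processes that are not in its own pool. Disable with WINDSURFAPI_SKIP_LS_CLEANUP=1. Tests 871 → 882 (+11).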
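A rough sketch of the outer-wrapper retry, under stated assumptions: the function name `withRateLimitFallback`, the `inner` callback, and the error shape `{ error: { type, fallback_model } }` are hypothetical stand-ins; only the env variable name, the non-stream restriction, and the served_model / fallback_reason fields come from the commit message.

```javascript
// Default ON in this version; WINDSURFAPI_VARIANT_FALLBACK_ON_RATE_LIMIT=0 disables it.
function shouldAutoFallback(env) {
  return env.WINDSURFAPI_VARIANT_FALLBACK_ON_RATE_LIMIT !== '0';
}

// Sketch: run the inner handler once, and retry once on the fallback model
// if a non-stream request was rate limited.
async function withRateLimitFallback(body, inner, env = {}) {
  const first = await inner(body);
  const err = first.error;
  const retryable = Boolean(
    !body.stream && err && err.type === 'rate_limit' && err.fallback_model && shouldAutoFallback(env)
  );
  if (!retryable) return first;

  const second = await inner({ ...body, model: err.fallback_model });
  if (second.error) return second;
  // Restore the requested name; report what actually served the request.
  return { ...second, model: body.model, served_model: err.fallback_model, fallback_reason: 'rate_limit' };
}

// Hypothetical inner handler: the -max variant is rate limited, the sibling works.
async function fakeInner(body) {
  if (body.model === 'demo-model-max') {
    return { error: { type: 'rate_limit', fallback_model: 'demo-model-xhigh' } };
  }
  return { model: body.model, choices: [{ message: { role: 'assistant', content: 'ok' } }] };
}

withRateLimitFallback({ model: 'demo-model-max', stream: false }, fakeInner)
  .then((res) => console.log(res.model, res.served_model, res.fallback_reason));
// prints "demo-model-max demo-model-xhigh rate_limit"
```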

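Side effect of the default-ON auto-fallback in v2.0.85: the cascade reuse fingerprint includes modelKey, so changing the model stores the new cascade under the new model → the client's next request with the original model misses reuse → clients that depend on cascade reuse (those that do not resend history) see the model lose its memory. The shouldAutoFallback env gate is flipped to default OFF; it is enabled only explicitly with WINDSURFAPI_VARIANT_FALLBACK_ON_RATE_LIMIT=1. The fallback_model + remediation fields (v2.0.84) are kept. v2.0.87+ plans to add cascade pool alias writes so that fallback does not break reuse; default ON will be reconsidered then. Tests 882/0.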

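The default OFF in the v2.0.86 hotfix was a temporary step back; this release delivers the real fix that v2.0.86 promised.
A. conversation-pool checkin accepts string | string[] and writes multiple fp slots at once
B. chat.js inner reads context.__aliasModelKey, computes an alternate fingerprintAfter, and adds it to the checkin list
C. on fallback, the outer wrapper sets __aliasModelKey: originalModel so that the cascade produced by xhigh is registered under both the max and xhigh fingerprints
D. shouldAutoFallback default goes back to ON (env=0 still disables it)
Effect: client on max hits the rate limit → fallback runs on xhigh → cascade is double-indexed → the client's next turn checks out with max and hits → the model continues the history without losing memory. Tests 882 → 889 (+7, including a realistic auto-fallback scenario test).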
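Step A, the double-indexed checkin, might look like the sketch below. The class name, the entry shape with `cascadeId`, and the fingerprint strings are hypothetical; the checkout behavior (removing every slot that points at the same cascade) is also an assumption made so the example is self-consistent.

```javascript
// Minimal sketch of a pool whose checkin accepts string | string[].
class ConversationPool {
  constructor() {
    this.slots = new Map(); // fingerprint -> entry
  }

  // Write the same entry under every fingerprint given.
  checkin(fingerprints, entry) {
    const list = Array.isArray(fingerprints) ? fingerprints : [fingerprints];
    for (const fp of new Set(list)) {
      this.slots.set(fp, entry);
    }
  }

  // Assumed behavior: a checkout consumes the cascade, so all slots that
  // point at it are removed, not just the one that matched.
  checkout(fingerprint) {
    const entry = this.slots.get(fingerprint);
    if (!entry) return null;
    for (const [fp, other] of this.slots) {
      if (other.cascadeId === entry.cascadeId) this.slots.delete(fp);
    }
    return entry;
  }
}

const pool = new ConversationPool();
// The cascade ran on xhigh after a fallback, but is also indexed under max.
pool.checkin(['fp:demo-xhigh', 'fp:demo-max'], { cascadeId: 'c1' });

console.log(pool.checkout('fp:demo-max').cascadeId); // prints "c1"
console.log(pool.checkout('fp:demo-xhigh'));         // prints null
```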

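H-1: the outer wrapper computes originalRoutingKey = resolveModel(mergeReasoningEffortIntoModel(...)) and uses it as __aliasModelKey instead of the raw body.model (the v2.0.87 fix was ineffective for clients such as codex CLI that send reasoning_effort separately)
H-2: invalidateFor scans twice: the first pass collects cascadeIds, the second deletes any sibling slot pointing at them (alias slots no longer dangle when LS restarts)
H-3: the outer wrapper computes originalCkey and passes it through so that the inner cacheSet also writes the alias key (within the rate-limit window, the next identical prompt hits the cache instead of burning quota)
H-4: new stopLanguageServerAndWait waits for the child to exit (SIGKILL fallback); the self-update callback awaits it (avoids the race where an orphan LS still holds the port)
M-1: stats.stores is incremented once per checkin, not per slot; adds aliasWrites
M-5: served_model/fallback_reason move into usage.cascade_breakdown (avoids ValidationError in pydantic-strict clients)
L-1: cleanup matches argv0 by strict equality, not substring (avoids sending SIGTERM to grep and similar processes)
Tests 889 → 898 (+9: real-fix tests for H-1, H-2, H-4, M-1)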
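To illustrate L-1, here is a sketch of why strict argv0 matching matters. It assumes the scan parses output in the shape produced by `ps -eo pid=,args=` and compares the basename of argv0; the paths, PIDs, and the function name `findOrphanPids` are made up for the example.

```javascript
const LS_BINARY = 'langserver_linux_x64';

// Sketch: return PIDs whose argv0 is exactly the LS binary and that are not
// in our own pool. A substring test would also match "grep langserver_linux_x64".
function findOrphanPids(psOutput, ownPids) {
  const orphans = [];
  for (const line of psOutput.split('\n')) {
    const match = line.trim().match(/^(\d+)\s+(\S+)/);
    if (!match) continue;
    const pid = Number(match[1]);
    const argv0Name = match[2].split('/').pop(); // basename of argv0
    if (argv0Name === LS_BINARY && !ownPids.has(pid)) {
      orphans.push(pid);
    }
  }
  return orphans;
}

const sample = [
  '  101 /opt/demo/langserver_linux_x64 --port 42100',
  '  102 grep langserver_linux_x64',
  '  103 /opt/demo/langserver_linux_x64 --port 42101',
].join('\n');

console.log(findOrphanPids(sample, new Set([103]))); // prints [ 101 ]
```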

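Codex second review of the four HIGH fixes in v2.0.88. H-2/3/4 are accurate. H-1 leaves a latent hole: the alias fpAfter uses fpOpts.toolPreamble, which is computed for the fallback dialect, so a cross-provider fallback would make the next fpBefore recompute a different toolPreamble → fingerprint mismatch → reuse miss → the model loses its memory. Today pickRateLimitFallback always walks a same-provider ladder, so this does not trigger, but any catalog extension would break it silently. Add a hard _isSameProviderFallback guard to lock in the invariant. Tests 898 → 904 (+6)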

lnqdev reported ERR_TOKEN_FETCH_FAILED again on v2.0.89. An e2e run of 3 accounts × 4 OTT hosts = 12/12 combinations all returned 401 invalid_token. The front half of the Auth1 + PostAuth chain returns 200 OK with a sessionToken, but every sessionToken is rejected on every OTT host. The three fallback layers from v2.0.61 / v2.0.75 / v2.0.79 cannot help: the upstream GetOneTimeAuthToken endpoint itself is dead.

Reverse-engineering windsurf-assistant v17.42.20 (still active as of 2026-04-27) shows that it goes Devin-only: Auth1 → PostAuth → sessionToken used directly as the IDE auth credential, skipping OTT + RegisterUser. In testing, the Cascade gRPC backend (server.codeium.com / server.self-serve.windsurf.com) accepts a bare sessionToken as metadata.apiKey: 4/4 returned 200 OK with planName=Trial.

Fix: the windsurfLoginViaAuth1 chain collapses from Auth1 → PostAuth → OTT → registerWithCodeium → apiKey (sk-ws-01-) to Auth1 → PostAuth → apiKey = sessionToken. 60 lines deleted, 10 added, a net reduction of 20 lines. Firebase signInWithPassword is now blocked by Google App Check and does not work server-side either, so the dispatcher actually falls through to the Auth1 path and email/password login is fully restored.

Tests 904 → 910 (+6 source-level invariants in v2090-ott-bypass.test.js); full suite 0 fail / 0 regressions. To upgrade: docker compose pull && docker compose up -d --force-recreate

…l-audit v2.0.88-90
- client.js: cascadeHistoryBudget default 200k→400k, add truncation note for trimmed history so model doesn't ask user to repeat
- handlers/chat.js: add IP-rate-limit circuit breaker for non-stream and stream paths, record policy blocked + rate limited events
- handlers/messages.js: defensive startMessage() in finish() prevents event ordering violation when message stops before it starts
- dashboard/stats.js: track policyBlockedCount and rateLimitedCount, persist to stats.json for dashboard visibility

…ling
- Frontend saveGlobalProxy/editAccountProxy now checks API error response before showing success toast (fixes silent failure on ERR_PROXY_PRIVATE_HOST)
- parseProxyUrl normalizes whitespace and supports space-separated format like "socks5 127.0.0.1 1089" in addition to canonical URL form
- setGlobalProxy/setAccountProxy auto-trim proxy host values
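The whitespace normalization in parseProxyUrl could be sketched like this. The returned `{ type, host, port }` shape and the null-on-failure convention are assumptions for the example, and credentials in the URL are deliberately left out to keep it short.

```javascript
// Sketch: accept "scheme://host:port" as well as "scheme host port",
// tolerating stray whitespace around and between the parts.
function parseProxyUrl(input) {
  const text = String(input).trim().replace(/\s+/g, ' ');
  if (!text) return null;

  // Three space-separated parts are rewritten into canonical URL form.
  const parts = text.split(' ');
  const candidate = parts.length === 3 ? `${parts[0]}://${parts[1]}:${parts[2]}` : text;

  const match = candidate.match(/^([a-z][a-z0-9+.-]*):\/\/([^\s:\/@]+):(\d{1,5})\/?$/i);
  if (!match) return null;
  return { type: match[1].toLowerCase(), host: match[2], port: Number(match[3]) };
}

console.log(parseProxyUrl('socks5 127.0.0.1 1089'));
// prints { type: 'socks5', host: '127.0.0.1', port: 1089 }
console.log(parseProxyUrl('  http://p:8080  '));
// prints { type: 'http', host: 'p', port: 8080 }
```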

The kimi_k2 vLLM section-token dialect stopped working for kimi-k2 and kimi-k2-thinking — Cascade returns ~16 token empty responses instead of tool calls. Switching to openai_json_xml which works for all Moonshot models. Verified on VPS: kimi-k2 with openai_json_xml correctly emits <tool_call> XML tool calls.

Verified on VPS (2026-05-07): kimi-k2 through Cascade returns idle_empty (12-24 tokens, empty content) even for plain chat without tools. This is an upstream Windsurf model outage, not a proxy bug. The proxy cannot fix Cascade silently returning empty for a model.

streamResponse was referencing bare `context` for the IP-level rate-limit circuit breaker, but `context` was never passed to the function. Now passes `context` through the `deps` parameter and accesses it as `deps.context`, matching the pattern used by other shared state (cachePolicy, tools, etc.). Fixes: all Claude models crashing with "Handler error: ReferenceError: context is not defined at chat.js:3223"

Windsurf PostAuth endpoint now requires the auth1 token as both a header (X-Devin-Auth1-Token) and in the request body. Previously only sent in body, causing "missing required header: X-Devin-Auth1-Token" error on the legacy server.self-serve.windsurf.com endpoint. Also made postAuthDualPath accept optional extraHeaders parameter for future-proofing.

Cherry-picked from PR dwgx#142 by @suhaihui-git — adds isCacheEnabled() guard to cacheGet/cacheSet and exposes enabled field in cacheStats(). Defaults to enabled (backward-compatible).
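A minimal sketch of such a guard follows. The in-memory Map, the `env` parameter, and the set of values treated as "off" are assumptions for illustration; the commit message only names isCacheEnabled(), cacheGet/cacheSet, the enabled field in cacheStats(), and the RESPONSE_CACHE_ENABLED variable.

```javascript
const store = new Map();

// Enabled unless RESPONSE_CACHE_ENABLED is explicitly set to an "off" value
// (the accepted spellings here are an assumption).
function isCacheEnabled(env = process.env) {
  const raw = env.RESPONSE_CACHE_ENABLED;
  if (raw === undefined || raw === '') return true;
  return !['0', 'false', 'off', 'no'].includes(String(raw).toLowerCase());
}

function cacheGet(key, env) {
  if (!isCacheEnabled(env)) return null; // behave like a miss
  return store.has(key) ? store.get(key) : null;
}

function cacheSet(key, value, env) {
  if (!isCacheEnabled(env)) return; // silently skip writes
  store.set(key, value);
}

function cacheStats(env) {
  return { enabled: isCacheEnabled(env), size: store.size };
}

cacheSet('k', 'v', {});                                      // enabled by default
console.log(cacheGet('k', {}));                              // prints "v"
console.log(cacheGet('k', { RESPONSE_CACHE_ENABLED: '0' })); // prints null
console.log(cacheStats({ RESPONSE_CACHE_ENABLED: '0' }));    // prints { enabled: false, size: 1 }
```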

…#144 by @Await-d)
Upstream PostAuth now expects:
- Empty application/proto body instead of JSON bridge
- X-Devin-Auth1-Token header + Referer
- Raw response parsing for devin-session-token
Co-authored-by: Await-d

…o body + dwgx#137 proxy parse + cache switch
- P0: ReferenceError context is not defined in streamResponse (dwgx#135)
- PostAuth empty proto body + X-Devin-Auth1-Token + Referer (dwgx#134 via @Await-d PR dwgx#144)
- parseProxyUrl whitespace + frontend error check (dwgx#137)
- RESPONSE_CACHE_ENABLED env (PR dwgx#142 by @suhaihui-git)
- IP rate-limit circuit breaker (#132)
- CLAUDE.md for agent rules

…600KB
- NLU retry now defaults ON for zhipu/glm/moonshot/kimi models (they narrate instead of calling tools on first pass). Still requires narrative detection to trigger. Set WINDSURFAPI_NLU_RETRY=0 to disable. (dwgx#147, dwgx#125, dwgx#120)
- CASCADE_MAX_HISTORY_BYTES default 400KB -> 600KB to reduce silent context amputation in long tool-call conversations (dwgx#133)

- kimi-k2 idle_empty now returns clear 502 with model suggestions instead of silently passing null/empty content to caller (dwgx#125/dwgx#120)
- expanded neutralization patterns for Windsurf content policy filter (prompt-injection, jailbreak, bypass/safety override keywords) (dwgx#130)
- Devin session token + brand name normalization already in place

When Cascade returns idle_empty for kimi-k2 (null/empty content), return clear error with suggested alternative models instead of silently passing empty output to the caller.

Moved upstream_model_unavailable check to right before final message assembly where allText/allThinking values are definitive. Returns clear 502 error when kimi-k2 produces null/empty content.
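The empty-response check at message assembly can be pictured with the sketch below. Everything except the names allText, allThinking, and upstream_model_unavailable is assumed: the function name, the response shapes, and the suggested model list are illustrative only.

```javascript
// Hypothetical list of alternatives to suggest; not taken from the project.
const SUGGESTED_MODELS = ['gemini-2.5-flash'];

// Sketch: decide at the very end, once allText/allThinking are final,
// whether the upstream produced anything at all.
function assembleFinalMessage(model, allText, allThinking) {
  const hasText = typeof allText === 'string' && allText.trim() !== '';
  const hasThinking = typeof allThinking === 'string' && allThinking.trim() !== '';

  if (!hasText && !hasThinking) {
    return {
      status: 502,
      body: {
        error: {
          type: 'upstream_model_unavailable',
          message: `${model} returned empty content upstream`,
          suggested_models: SUGGESTED_MODELS,
        },
      },
    };
  }
  return {
    status: 200,
    body: { model, choices: [{ message: { role: 'assistant', content: allText || '' } }] },
  };
}

console.log(assembleFinalMessage('kimi-k2', null, '').status);    // prints 502
console.log(assembleFinalMessage('kimi-k2', 'hello', '').status); // prints 200
```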

- kimi-k2 idle_empty returns 502 with model suggestions
- GLM/Kimi NLU retry auto-enabled
- history budget 400KB→600KB
- content policy bypass hardening
- v2.0.91 fixes included: dwgx#135 context crash, dwgx#134 Auth1, dwgx#137 proxy parse, #132 circuit breaker

POST /auth/login now accepts an optional proxy field for both single and batch account creation. The proxy is parsed and bound to the newly created account as its per-account proxy.
Single: {"token":"...", "proxy":"socks5://1.2.3.4:1080"}
Batch: {"accounts":[{"token":"t1","proxy":"http://p:8080"},...]}
Verified on VPS: account created with proxy bound in proxy.json.

- VPS credentials and SSH_ASKPASS trick
- issue reply rules: test first, no version numbers, short Chinese
- updated defaults: NLU retry auto ON, history 600KB, maxWait 600s
- release process and common issue triage

dwgx and others added 30 commits April 24, 2026 05:48

docs: clarify Cascade routing comment for deprecated models

b2a4238

Update README.md

6b0a2d2

docs: remove xyiqq credit line

e5111e8

fix: probe only gemini canaries — Claude burns Trial quota too fast

e36cae3

feat: dark immersive landing page with protocol conversion animation

9692137

Merge branch 'dwgx:master' into upd/english-translation

0109a56

i18n: remove I18n.t() call from cascadeReuse.desc, use data-i18n attribute only

0b940bb

fix: adjust cascadeReuse section layout, remove extra closing div and add surface background to toggle container

30cf2de

i18n: remove I18n.t() call from identityPrompt.templateHint, use data-i18n attribute only

696a791

fix: prevent cascade reuse replay from duplicating context

8b82045

Merge pull request dwgx#45 from baily-zhang/fix-cascade-reuse-offsets

af8d1ad

fix: prevent cascade reuse from replaying old context

Merge pull request dwgx#43 from smeinecke/upd/english-translation

e5c135c

Upd/english translation

docs: add baily-zhang, aict666, smeinecke to contributors

b64ed1d

revert: restore previous landing page design

bb2c081

feat: inject model identity context for hvoy.ai verification scoring

5b69fb9

fix: destructure displayModel from opts in cascadeChat

fdf8672

fix: pass displayModel as model param to cascadeChat

77a3bd2

feat: neutralize Cascade identity in responses, replace with actual model name

ed004b5

feat: model identity & quality test script for hvoy.ai prep

1088c7d

fix: structured output + thinking auto-upgrade + model signature headers for hvoy.ai

be75140

fix(dwgx#49): show LS stderr as warn + diagnostic hints on exit code 1

899eb91

fix: quota labels 日/周/提示词 (day/week/prompt) + 6 contributors added to dashboard credits

d3e241d

fix: clean quota bars — short D/W/P labels + tooltip details, no clutter

056c362

fix: wantJson scope + i18n quota short labels with tooltip

208d4ee

dwgx and others added 29 commits May 3, 2026 23:08

docs(readme): correct free-account model list (GLM/Kimi/Qwen entitled too) + add tool-calling reliability matrix + opus-4-7-max weekly quota FAQ

703a064

docs(readme.en): mirror Chinese README: free-tier model list + tool-calling matrix + opus-4-7-max FAQ

931fde2

fix(models): add dotted-form aliases (claude-haiku-4.5 / claude-sonnet-4.5 / claude-opus-4.5): README + recent replies use dotted form, catalog has dashed only

de32c59

feat: add RESPONSE_CACHE_ENABLED env to disable response cache

ee244d9

chore: allow CLAUDE.md in repo for agent rules

57f6136

chore: gitignore tmp-testing/

8da5875

fix: kimi-k2 upstream outage returns 502 with model suggestions

90b6a6e

fix: kimi-k2 empty response detection at message assembly

6e60a5b

release: 2.0.92

888bc44

fix: remove unused import + warn on invalid proxy format in /auth/login

9af6bd0

docs: update CLAUDE.md with session knowledge

8a0c598

limweb closed this Jun 15, 2026

limweb wants to merge 330 commits into dwgx:master from limweb:master

7 participants
