update #199

Closed
limweb wants to merge 330 commits into dwgx:master from limweb:master

limweb commented Jun 15, 2026

What changed

Why

Testing

Checklist

  • Code style matches existing files
  • No new npm dependencies (project is zero-dep)
  • If touching the LS binary protocol, document the field-number source in the PR description
  • Dashboard UI uses App.confirm / App.prompt, not the browser's native alert/confirm

dwgx and others added 30 commits April 24, 2026 05:48
- Mark 9 enum-only models as deprecated (gpt-4o-mini, deepseek-v3/r1,
  grok-3-mini, qwen-3, etc.) — upstream LS no longer routes them
- /v1/models hides deprecated models
- FREE_TIER_BASE updated to gemini-2.5-flash only
- Connect-RPC tested on VPS: InitializePanelState OK but StartCascade
  returns empty — LS only partially supports connect protocol. Default
  stays grpc.
- PROBE_CANARIES reduced to gemini-only — Claude canary probes were
  burning Trial account quotas (2-3 req/hr per model)
- Version bump to 2.0.0
…ross dashboard

- add tooltip keys: probeTitle, copyTitle, refreshTitle, editModels
- add time unit keys: day, hour, minute, second
- add tier filter label and empty state messages (noRequestData, noAccountRequestData, noData)
- add batch import and proxy configuration feedback messages
- add login result section keys (successTitle, addedToPool, fields.*)
- expand cascadeReuse.desc with detailed explanation of cache behavior and
…robe/credits operations

- check-i18n.js: add I18n.t() call extraction and validation, check both string literals and variable expressions, deduplicate and verify all keys exist in locale files
- i18n: add button keys
updateCapability() ran inferTier() unconditionally after every request,
which overwrites the authoritative tier set by GetUserStatus /
refreshCredits. When a Pro or 14-day Trial account called a non-premium
model such as gemini-2.5-flash or gpt-4o-mini, inferTier() saw only that
ok=true and returned 'free', so the dashboard showed the account as FREE
right after a successful request.

probeAccount() already worked around this by restoring status.tierName
at the end, but the chat handler path (handlers/chat.js) had no such
restore, so every real API call silently demoted the tier.

Skip the inferTier() fallback when an authoritative source is present
(userStatusLastFetched > 0, or tierManual=true). The fallback still runs
for brand-new accounts that have never been probed.
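Fixes a tier-downgrade bug that affected real users. updateCapability called inferTier after every chat, which demoted Pro/Trial accounts to free; accounts that have a userStatus now skip the inference. Thanks to @aict666 for pinpointing the cause.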
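The guard described in this fix can be sketched roughly as follows. This is a minimal illustration, not the project's code: the field names userStatusLastFetched and tierManual come from the commit message, while the account shape, the `Sketch` function names, and the simplified inference rule are assumptions.

```javascript
// Assumed account shape: { tier, userStatusLastFetched, tierManual }.
// An authoritative tier exists once GetUserStatus has run or the tier was set by hand.
function hasAuthoritativeTier(account) {
  return account.userStatusLastFetched > 0 || account.tierManual === true;
}

// Simplified stand-in for inferTier(): guesses a tier from a single request outcome.
function inferTierSketch(modelIsPremium, ok) {
  if (!ok) return 'unknown';
  return modelIsPremium ? 'pro' : 'free';
}

// Only fall back to inference when nothing authoritative is known.
function updateCapabilitySketch(account, modelIsPremium, ok) {
  if (hasAuthoritativeTier(account)) return account.tier;
  account.tier = inferTierSketch(modelIsPremium, ok);
  return account.tier;
}
```

With this shape, a Pro account that calls a non-premium model keeps its tier, while a never-probed account still gets an inferred one.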
fix: prevent cascade reuse from replaying old context
- fix(dwgx#47): remove if(emulateTools) guards — tool calls were silently
  dropped when condition was false. Now always preserved. Fixes Claude
  Code tool calls appearing as raw text.

- feat: real structured output — system message + user hint + json_schema
  injection + response fence stripping for response_format support

- feat: zero-dep PDF text extraction (src/pdf.js) — FlateDecode inflate
  + Tj/TJ operator parsing. Handles Anthropic type:"document" blocks.
  Scanned PDFs get clear error message.
dwgx and others added 29 commits May 3, 2026 23:08
… too) + add tool-calling reliability matrix + opus-4-7-max weekly quota FAQ
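0a00 found the root cause: 31 trial accounts showed quota at 100%, yet all of
them hit a 26-29 minute per-(account, model) sliding-window rate limit on
claude-opus-4-7-max. Upstream rate-limits high reasoning-effort variants such
as -max / -xhigh separately, on a dimension independent of the daily/weekly
quota.

Add pickRateLimitFallback, which derives the next-lower variant of the same
base model from the model-name suffix (low → medium → high → xhigh → max;
1m → bare; -thinking is skipped to avoid silently downgrading user-visible
behavior).

Non-stream and stream 429/503 error responses now carry fallback_model and
remediation fields that tell the client which sibling model it can switch to.

Tests: 860 → 871 (+11).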
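A rough sketch of the suffix ladder described above, under stated assumptions: the ladder order is taken from the commit message, but the function name, the exact suffix spellings (`-1m`, `-thinking`), and the use of null for "no fallback" are hypothetical.

```javascript
// Reasoning-effort ladder from the commit message, lowest to highest.
const EFFORT_LADDER = ['low', 'medium', 'high', 'xhigh', 'max'];

// Returns the sibling one step lower on the ladder, or null when none applies.
function pickRateLimitFallbackSketch(model) {
  // Assumption: -thinking variants never fall back, to avoid a silent behavior change.
  if (model.endsWith('-thinking')) return null;
  // Assumption: a "-1m" variant falls back to the bare model name.
  if (model.endsWith('-1m')) return model.slice(0, -'-1m'.length);
  const dash = model.lastIndexOf('-');
  if (dash < 0) return null;
  const step = EFFORT_LADDER.indexOf(model.slice(dash + 1));
  if (step <= 0) return null; // not on the ladder, or already the lowest step
  return `${model.slice(0, dash)}-${EFFORT_LADDER[step - 1]}`;
}
```

Returning null rather than the input keeps "no sibling available" distinguishable from a real fallback when the error response is built.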
…t-4.5 / claude-opus-4.5) — README + recent replies use dotted form, catalog has dashed only
…rphan cleanup

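dwgx#126/dwgx#128: automatically retry with the fallback model on rate limit
  handleChatCompletions is split into an outer wrapper: when the request is
  non-stream, the error is rate_limit, a fallback_model exists, and the env
  switch is not off, it rewrites body.model and resends once. The response's
  body.model is restored to the original name, with served_model and
  fallback_reason added as side fields. The stream path is not covered
  (chunks may already have been emitted).
  Disable with WINDSURFAPI_VARIANT_FALLBACK_ON_RATE_LIMIT=0.

dwgx#127: dashboard one-click update leaves orphan LS processes
  Two-layer fix: self-update calls stopLanguageServer for a graceful shutdown
  before exiting; early in startup, cleanupOrphanLanguageServers scans ps as a
  backstop and kills langserver_linux_x64 processes that are not in its own
  pool.
  Disable with WINDSURFAPI_SKIP_LS_CLEANUP=1.

Tests: 871 → 882 (+11).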
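Side effect of v2.0.85's default-ON auto-fallback: the cascade reuse
fingerprint includes modelKey, so changing the model stores the new cascade
under the new model. The client's next request with the original model then
misses reuse, and clients that rely on cascade reuse (those that do not resend
history) see the model lose its memory.

The shouldAutoFallback env gate is inverted: default OFF, enabled explicitly
only with WINDSURFAPI_VARIANT_FALLBACK_ON_RATE_LIMIT=1.
The fallback_model and remediation fields (v2.0.84) are kept.

v2.0.87+ plans to add cascade pool alias writes so that fallback no longer
breaks reuse; default ON will be reconsidered then.

Tests: 882 pass / 0 fail.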
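The v2.0.86 hotfix's default OFF was a temporary step back; this release
delivers the real fix promised in v2.0.86.

A. conversation-pool checkin accepts string | string[] and writes multiple
   fingerprint slots at once
B. chat.js inner handler reads context.__aliasModelKey, computes an alternate
   fingerprintAfter, and adds it to the checkin list
C. On fallback, the outer wrapper sets __aliasModelKey: originalModel so the
   cascade produced by xhigh is registered under both the max and xhigh
   fingerprints
D. shouldAutoFallback default goes back to ON (env=0 still disables it)

Effect: client hits the rate limit on max → fallback runs on xhigh → cascade
is indexed twice → the client's next turn checks out with max and hits → the
model continues the history without losing memory.

Tests: 882 → 889 (+7, including a realistic auto-fallback scenario test).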
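The dual-index idea in steps A to C can be illustrated with a small sketch. It assumes a pool keyed by a fingerprint string; the class name, the fingerprint function, and the cascade id are hypothetical, and the real fingerprint covers more than model key and history.

```javascript
// Hypothetical fingerprint: model key plus conversation history.
function fingerprintSketch(modelKey, history) {
  return JSON.stringify([modelKey, history]);
}

class ConversationPoolSketch {
  constructor() {
    this.slots = new Map(); // fingerprint -> cascadeId
  }

  // Accepts one fingerprint or a list, and writes the cascade under each of them.
  checkin(fingerprints, cascadeId) {
    const list = Array.isArray(fingerprints) ? fingerprints : [fingerprints];
    for (const fp of list) this.slots.set(fp, cascadeId);
  }

  checkout(fp) {
    return this.slots.has(fp) ? this.slots.get(fp) : null;
  }
}

// The request asked for max, was served by xhigh, and is stored under both keys.
const pool = new ConversationPoolSketch();
const history = ['user: hi', 'assistant: hello'];
const servedFp = fingerprintSketch('claude-opus-4-7-xhigh', history);
const aliasFp = fingerprintSketch('claude-opus-4-7-max', history);
pool.checkin([servedFp, aliasFp], 'cascade-1');
```

A follow-up turn that still names the max model then finds the cascade through the alias slot instead of missing reuse.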
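H-1 outer wrapper computes originalRoutingKey =
   resolveModel(mergeReasoningEffortIntoModel(...)) and uses it as
   __aliasModelKey instead of raw body.model
   (the v2.0.87 fix did not work for clients that split out reasoning_effort,
   such as codex CLI)

H-2 invalidateFor scans twice: the first pass collects the cascadeId, the
   second deletes any sibling slot that points to it (alias slots no longer
   dangle when the LS restarts)

H-3 outer wrapper computes originalCkey and passes it through so the inner
   cacheSet writes the alias key (within the rate-limit window, the next
   identical prompt hits the cache and no longer burns quota)

H-4 new stopLanguageServerAndWait waits for the child to exit (SIGKILL
   fallback); the self-update callback awaits it (avoids the race where an
   orphan LS holds the port)

M-1 stats.stores counts once per checkin rather than per slot; adds aliasWrites
M-5 served_model/fallback_reason move to usage.cascade_breakdown
   (avoids ValidationError in pydantic strict clients)
L-1 cleanup requires argv0 strict equality, not a substring match (avoids
   sending SIGTERM to grep and similar processes)

Tests: 889 → 898 (+9, real-fix tests for H-1, H-2, H-4, M-1)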
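The two-pass invalidation in H-2 might look roughly like this. It is a sketch under assumptions: slots are modeled as a Map from fingerprint to an entry with a cascadeId field, and the function name and return value are hypothetical.

```javascript
// slots: Map<fingerprint, { cascadeId }>. Returns how many slots were removed.
function invalidateForSketch(slots, fingerprint) {
  // Pass 1: find which cascade the given slot points at.
  const entry = slots.get(fingerprint);
  if (!entry) return 0;
  const targetId = entry.cascadeId;

  // Pass 2: remove every slot (primary or alias) that points at the same cascade.
  const doomed = [];
  for (const [key, value] of slots) {
    if (value.cascadeId === targetId) doomed.push(key);
  }
  for (const key of doomed) slots.delete(key);
  return doomed.length;
}
```

Deleting only the slot that was looked up would leave the alias slot pointing at a cascade that no longer exists, which is the dangling case the commit describes.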
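codex did a second review of the four HIGH fixes in v2.0.88 themselves.
H-2/3/4 are correct. H-1 leaves a latent hole: the alias fpAfter uses
fpOpts.toolPreamble, which is computed with the fallback dialect, so a
cross-provider fallback would make the toolPreamble recomputed for the next
fpBefore inconsistent → fingerprint mismatch → reuse miss → the model loses
its memory.

Today pickRateLimitFallback always stays on a same-provider ladder, so this is
not triggered, but any extension of the catalog would break it silently. Add a
hard _isSameProviderFallback guard to lock in the invariant.

Tests: 898 → 904 (+6)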
lnqdev reported ERR_TOKEN_FETCH_FAILED again on v2.0.89. An end-to-end test of
3 accounts × 4 OTT hosts gave 12/12 responses of 401 invalid_token. The front
half of the chain (Auth1 + PostAuth) returns 200 OK with a sessionToken, but
every sessionToken is rejected by every OTT host. The three fallback layers
from v2.0.61 / v2.0.75 / v2.0.79 cannot help: the upstream
GetOneTimeAuthToken endpoint itself is dead.

Reverse engineering windsurf-assistant v17.42.20 (still active as of
2026-04-27) shows it uses a Devin-only flow: Auth1 → PostAuth → sessionToken
used directly as the IDE auth credential, skipping OTT + RegisterUser. In
testing, the Cascade gRPC backend (server.codeium.com /
server.self-serve.windsurf.com) accepts the bare sessionToken as
metadata.apiKey: 4/4 returned 200 OK with planName=Trial.

Fix: the windsurfLoginViaAuth1 chain collapses from Auth1 → PostAuth → OTT →
registerWithCodeium → apiKey (sk-ws-01-) to Auth1 → PostAuth → apiKey =
sessionToken. Deletes 60 lines, adds 10, net reduction of 20 lines.

Firebase signInWithPassword is now blocked by Google App Check and no longer
works server-side either, so the dispatcher in practice falls through to the
Auth1 path, which means email/password login is fully restored.

Tests: 904 → 910 (+6 source-level invariants in v2090-ott-bypass.test.js).
Full suite: 0 failures / 0 regressions.

Upgrade: docker compose pull && docker compose up -d --force-recreate
…l-audit v2.0.88-90

- client.js: cascadeHistoryBudget default 200k→400k, add truncation note
  for trimmed history so model doesn't ask user to repeat
- handlers/chat.js: add IP-rate-limit circuit breaker for non-stream and
  stream paths, record policy blocked + rate limited events
- handlers/messages.js: defensive startMessage() in finish() prevents
  event ordering violation when message stops before it starts
- dashboard/stats.js: track policyBlockedCount and rateLimitedCount,
  persist to stats.json for dashboard visibility
…ling

- Frontend saveGlobalProxy/editAccountProxy now checks API error response
  before showing success toast (fixes silent failure on ERR_PROXY_PRIVATE_HOST)
- parseProxyUrl normalizes whitespace and supports space-separated format
  like "socks5 127.0.0.1 1089" in addition to canonical URL form
- setGlobalProxy/setAccountProxy auto-trim proxy host values
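As an illustration of the whitespace handling described above, here is a minimal sketch of a parser that accepts both the canonical URL form and the space-separated form. The function name, the returned shape, and the regular expression are assumptions; credentials in the URL (user:pass@) are deliberately not handled.

```javascript
// Returns { type, host, port } or null when the input is not recognized.
function parseProxyUrlSketch(input) {
  // Trim and collapse runs of whitespace before looking at the shape.
  const text = String(input).trim().replace(/\s+/g, ' ');
  const parts = text.split(' ');
  // "socks5 127.0.0.1 1089" becomes "socks5://127.0.0.1:1089".
  const urlText = parts.length === 3 ? `${parts[0]}://${parts[1]}:${parts[2]}` : text;
  const match = /^([a-z][a-z0-9+.-]*):\/\/([^\s:\/@]+):(\d{1,5})\/?$/i.exec(urlText);
  if (!match) return null;
  const port = Number(match[3]);
  if (port < 1 || port > 65535) return null;
  return { type: match[1].toLowerCase(), host: match[2], port };
}
```

Normalizing to the URL form first means only one pattern has to be validated, whichever way the user typed it.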
The kimi_k2 vLLM section-token dialect stopped working for kimi-k2
and kimi-k2-thinking — Cascade returns ~16 token empty responses
instead of tool calls. Switching to openai_json_xml which works
for all Moonshot models.

Verified on VPS: kimi-k2 with openai_json_xml correctly emits
<tool_call> XML tool calls.
Verified on VPS (2026-05-07): kimi-k2 through Cascade returns idle_empty
(12-24 tokens, empty content) even for plain chat without tools. This is
an upstream Windsurf model outage, not a proxy bug. The proxy cannot fix
Cascade silently returning empty for a model.
streamResponse was referencing bare `context` for the IP-level rate-limit
circuit breaker, but `context` was never passed to the function. Now passes
`context` through the `deps` parameter and accesses it as `deps.context`,
matching the pattern used by other shared state (cachePolicy, tools, etc.).

Fixes: all Claude models crashing with "Handler error: ReferenceError:
context is not defined at chat.js:3223"
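The pattern behind this fix, reaching shared state through an explicit `deps` parameter instead of a free variable, can be sketched as below. The chunk shape and the counter name are hypothetical; only the `deps.context` access mirrors the commit message.

```javascript
// Shared state arrives through deps, so nothing relies on an outer-scope `context`.
function streamResponseSketch(chunks, deps) {
  const pieces = [];
  for (const chunk of chunks) {
    if (chunk.rateLimited) {
      // Hypothetical counter standing in for the IP-level circuit-breaker state.
      deps.context.rateLimitedCount += 1;
      break;
    }
    pieces.push(chunk.text);
  }
  return pieces.join('');
}
```

Had the body referenced a bare `context` that was never declared in scope, the first rate-limited chunk would raise the ReferenceError quoted above.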
Windsurf PostAuth endpoint now requires the auth1 token as both a header
(X-Devin-Auth1-Token) and in the request body. Previously only sent in
body, causing "missing required header: X-Devin-Auth1-Token" error on
the legacy server.self-serve.windsurf.com endpoint.

Also made postAuthDualPath accept optional extraHeaders parameter for
future-proofing.
Cherry-picked from PR dwgx#142 by @suhaihui-git — adds isCacheEnabled() guard
to cacheGet/cacheSet and exposes enabled field in cacheStats(). Defaults
to enabled (backward-compatible).
…#144 by @Await-d)

Upstream PostAuth now expects:
- Empty application/proto body instead of JSON bridge
- X-Devin-Auth1-Token header + Referer
- Raw response parsing for devin-session-token

Co-authored-by: Await-d
…o body + dwgx#137 proxy parse + cache switch

- P0: ReferenceError context is not defined in streamResponse (dwgx#135)
- PostAuth empty proto body + X-Devin-Auth1-Token + Referer (dwgx#134 via @Await-d PR dwgx#144)
- parseProxyUrl whitespace + frontend error check (dwgx#137)
- RESPONSE_CACHE_ENABLED env (PR dwgx#142 by @suhaihui-git)
- IP rate-limit circuit breaker (#132)
- CLAUDE.md for agent rules
…600KB

- NLU retry now defaults ON for zhipu/glm/moonshot/kimi models (they
  narrate instead of calling tools on first pass). Still requires
  narrative detection to trigger. Set WINDSURFAPI_NLU_RETRY=0 to disable.
  (dwgx#147, dwgx#125, dwgx#120)
- CASCADE_MAX_HISTORY_BYTES default 400KB -> 600KB to reduce silent
  context amputation in long tool-call conversations (dwgx#133)
- kimi-k2 idle_empty now returns clear 502 with model suggestions
  instead of silently passing null/empty content to caller (dwgx#125/dwgx#120)
- expanded neutralization patterns for Windsurf content policy filter
  (prompt-injection, jailbreak, bypass/safety override keywords) (dwgx#130)
- Devin session token + brand name normalization already in place
When Cascade returns idle_empty for kimi-k2 (null/empty content),
return clear error with suggested alternative models instead of
silently passing empty output to the caller.
Moved upstream_model_unavailable check to right before final message
assembly where allText/allThinking values are definitive. Returns clear
502 error when kimi-k2 produces null/empty content.
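A sketch of the late empty-content check, assuming it runs once allText and allThinking are final. The error type upstream_model_unavailable and the 502 status come from the commit messages; the function name, the response shape, and the suggested_models field are assumptions.

```javascript
// Returns null when there is usable output, otherwise a 502 error description.
function checkUpstreamEmptySketch(model, allText, allThinking, suggestions) {
  const hasText = typeof allText === 'string' && allText.trim() !== '';
  const hasThinking = typeof allThinking === 'string' && allThinking.trim() !== '';
  if (hasText || hasThinking) return null;
  return {
    status: 502,
    body: {
      error: {
        type: 'upstream_model_unavailable',
        message: `${model} returned empty content upstream`,
        suggested_models: suggestions, // hypothetical field name
      },
    },
  };
}
```

Running the check just before final message assembly avoids a false positive while text is still being accumulated.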
- kimi-k2 idle_empty returns 502 with model suggestions
- GLM/Kimi NLU retry auto-enabled
- history budget 400KB→600KB
- content policy bypass hardening
- v2.0.91 fixes included: dwgx#135 context crash, dwgx#134 Auth1, dwgx#137 proxy parse, #132 circuit breaker
POST /auth/login now accepts optional proxy field for both single
and batch account creation. Proxy is parsed and bound to the newly
created account as per-account proxy.

Single:  {"token":"...", "proxy":"socks5://1.2.3.4:1080"}
Batch:   {"accounts":[{"token":"t1","proxy":"http://p:8080"},...]}

Verified on VPS — account created with proxy bound in proxy.json.
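One way to treat the single and batch request bodies uniformly is to normalize both into a list before creating accounts. This is a hedged sketch: the token, proxy, and accounts fields are from the examples above, while the function name and the null default are assumptions.

```javascript
// Turns either request shape into a list of { token, proxy } entries.
function normalizeLoginBodySketch(body) {
  const entries = Array.isArray(body.accounts) ? body.accounts : [body];
  return entries.map((entry) => ({
    token: entry.token,
    // proxy is optional; null means "no per-account proxy".
    proxy: typeof entry.proxy === 'string' && entry.proxy.trim() !== '' ? entry.proxy.trim() : null,
  }));
}
```

Each normalized entry can then go through the same proxy parsing and account creation path, regardless of which form the client sent.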
- VPS credentials and SSH_ASKPASS trick
- issue reply rules: test first, no version numbers, short Chinese
- updated defaults: NLU retry auto ON, history 600KB, maxWait 600s
- release process and common issue triage
limweb closed this Jun 15, 2026
7 participants