feat: provider-aware context window auto-detection #38
Add contextWindowTokens to ProviderCapabilities with per-provider defaults (Anthropic 200K, OpenAI 128K, Codex 400K). Change AceClawConfig default to 0 (auto-detect). Daemon resolves effective context window from provider capabilities when config doesn't set it explicitly.
📝 Walkthrough
The PR implements provider-aware context window auto-detection.
Sequence Diagram(s)
sequenceDiagram
participant Config as AceClawConfig
participant Daemon as AceClawDaemon
participant LLM as LLMClient
participant Caps as ProviderCapabilities
participant Budget as SystemPromptBudget
participant Router as Router
Daemon->>Config: Load configuration<br/>(contextWindowTokens=0 by default)
Daemon->>LLM: Create LLM client
LLM->>Caps: Get provider capabilities
Caps-->>LLM: Return capabilities with<br/>contextWindowTokens
Daemon->>Daemon: Resolve effective contextWindow:<br/>config value > 0?<br/>YES → use config<br/>NO → use provider capability
Daemon->>Budget: Initialize with resolved<br/>contextWindow
Daemon->>Router: Set provider info with<br/>resolved contextWindow
Daemon->>Daemon: Compose system prompt<br/>+ inject skill descriptions
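The resolution step in the diagram ("config value > 0?") can be sketched in Java as follows. This is only a minimal illustration: `ContextWindowResolver` and `resolve` are hypothetical names, not identifiers from the PR, and the daemon's actual code may be structured differently.

```java
// Sketch only: ContextWindowResolver and resolve() are hypothetical names,
// not taken from the PR.
final class ContextWindowResolver {

    private ContextWindowResolver() {}

    /**
     * Returns the explicit config value when it is positive; otherwise
     * falls back to the window advertised by the provider capabilities.
     */
    static int resolve(int configuredTokens, int providerTokens) {
        return configuredTokens > 0 ? configuredTokens : providerTokens;
    }

    public static void main(String[] args) {
        // Config left at the default of 0: the provider value is used.
        System.out.println(resolve(0, 400_000));
        // Explicit config wins over the provider value.
        System.out.println(resolve(64_000, 400_000));
    }
}
```

Treating every non-positive value as "not set" keeps the rule to a single comparison, which matches the "value > 0 wins" wording in the PR description.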
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
@greptile review
Actionable comments posted: 1
🧹 Nitpick comments (1)
aceclaw-daemon/src/main/java/dev/aceclaw/daemon/AceClawConfig.java (1)
223-228: Consider updating the Javadoc example.
The Javadoc mentions "e.g. 200,000 for Claude" but the default is now 0 (auto-detect). Consider updating to clarify that 0 means auto-detect from provider capabilities.
📝 Suggested documentation update
  /**
-  * Returns the context window size in tokens (e.g. 200,000 for Claude).
+  * Returns the context window size in tokens, or 0 if auto-detecting from provider capabilities.
   */
  public int contextWindowTokens() {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@aceclaw-daemon/src/main/java/dev/aceclaw/daemon/AceClawConfig.java` around lines 223 - 228, Update the Javadoc for AceClawConfig.contextWindowTokens() to clarify that the default value 0 means "auto-detect from provider capabilities" and adjust the example: keep a concrete example like "e.g., 200,000 for Claude" but explicitly state that non-zero values override auto-detection and 0 triggers provider-based detection; reference the contextWindowTokens field and the contextWindowTokens() method in the comment for clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@aceclaw-core/src/main/java/dev/aceclaw/core/llm/ProviderCapabilities.java`:
- Around line 23-37: Update the inline comments for the ProviderCapabilities
constants to remove ambiguity: clarify that CODEX refers specifically to
OpenAI's Codex/Responses models with a 400K context (not GitHub Copilot Chat) by
changing the comment on the CODEX constant to explicitly name "OpenAI Codex /
Responses API (400K)"; and update the OPENAI_COMPAT comment for the
OPENAI_COMPAT constant to state that 128K is a nominal/default value for
OpenAI-compatible providers and that some vendors (e.g., Ollama) expose
configurable windows (roughly 4K–256K) so actual context may vary.
  /** Anthropic Claude: full feature support, 200K context. */
  public static final ProviderCapabilities ANTHROPIC =
-         new ProviderCapabilities(true, true, true, 0);
+         new ProviderCapabilities(true, true, true, 0, 200_000);

- /** OpenAI (GPT-4o, o1, etc.): image support, no extended thinking or prompt caching. */
+ /** OpenAI (GPT-4o, o1, etc.): image support, no extended thinking or prompt caching, 128K context. */
  public static final ProviderCapabilities OPENAI =
-         new ProviderCapabilities(false, false, true, 0);
+         new ProviderCapabilities(false, false, true, 0, 128_000);

- /** Generic OpenAI-compatible providers (Groq, Together, Ollama, etc.): minimal feature set. */
+ /** Generic OpenAI-compatible providers (Groq, Together, Ollama, etc.): minimal feature set, 128K context. */
  public static final ProviderCapabilities OPENAI_COMPAT =
-         new ProviderCapabilities(false, false, false, 0);
+         new ProviderCapabilities(false, false, false, 0, 128_000);

- /** GitHub Copilot Codex models (Responses API): no extended thinking, no caching, no images. */
+ /** GitHub Copilot Codex models (Responses API): no extended thinking, no caching, no images, 400K context. */
  public static final ProviderCapabilities CODEX =
-         new ProviderCapabilities(false, false, false, 0);
+         new ProviderCapabilities(false, false, false, 0, 400_000);
🧩 Analysis chain
🌐 Web query:
Claude 3.5 context window tokens 2025 Anthropic
💡 Result:
In 2025, Anthropic’s Claude 3.5 models (including Claude 3.5 Sonnet and Claude 3.5 Haiku) have a 200,000-token context window (“200K”). [1]
Anthropic’s 1,000,000-token (1M) context window in 2025 was a separate beta feature available for newer models (e.g., Claude Sonnet 4) via a beta header (e.g., context-1m-2025-08-07), not Claude 3.5. [2]
Sources:
[1] Anthropic Docs – Models overview (context window table) [1]
[2] Anthropic Docs – Context windows / 1M token beta header [2]
🌐 Web query:
GPT-4o context window tokens 2025 OpenAI
💡 Result:
OpenAI’s GPT-4o has a 128,000-token context window (and 16,384 max output tokens per request) in the OpenAI API docs. [1]
Related 4o-family variants (different context sizes):
- GPT-4o Audio (preview): 128,000 context tokens. [7]
- GPT-4o Realtime (preview): 32,000 context tokens. [8]
- GPT-4o Transcribe: 16,000 context tokens. [4]
- GPT-4o mini: 128,000 context tokens. [5]
🌐 Web query:
GitHub Copilot Codex context window 2025
💡 Result:
- GitHub Copilot Chat (GitHub.com / Mobile / CLI / supported IDEs): 64k-token context window (when using OpenAI GPT‑4o) per GitHub’s Dec 6, 2024 changelog update (this was the Copilot Chat context-window spec going into 2025). VS Code Insiders could use up to 128k tokens (the max supported by GPT‑4o). [1]
- OpenAI Codex (Codex agent / Codex CLI / IDE extension): GitHub’s “OpenAI Codex” integration is a Copilot-powered coding agent experience (public preview). [2] The GPT‑5.1‑Codex model used for Codex is documented by OpenAI as having a 400,000-token context window. [3]
Sources: [1] [2] [3]
🌐 Web query:
Groq Together Ollama context window OpenAI compatible 2025
💡 Result:
OpenAI-compatible endpoints (base_url)
- Groq (GroqCloud): https://api.groq.com/openai/v1 (console.groq.com)
- Together AI: https://api.together.xyz/v1 (docs.together.ai)
- Ollama (local): OpenAI-compatible endpoints include /v1/chat/completions (served from your local Ollama host, e.g. http://localhost:11434) (docs.ollama.com)
Context window (what you can expect in/around 2025)
Groq (varies by model; many popular large-context models are 128K):
- Qwen-2.5-32B: 128K context (groq.com)
- DeepSeek-r1-distill-qwen-32B: 128K context (groq.com)
- openai/gpt-oss-120B and openai/gpt-oss-20B: 128K context (groq.com)
Together AI (published per-model; serverless table shows many long-context options):
- openai/gpt-oss-120b, openai/gpt-oss-20b: 128K context (docs.together.ai)
- Example of longer options in their serverless catalog (if you need >128K): Moonshot Kimi models listed at 262,144 context (docs.together.ai)
Ollama (depends on the model and your configured/runtime context allocation):
- Ollama documents default context allocations that scale with VRAM (e.g., <24GiB → 4K; 24–48GiB → 32K; ≥48GiB → 256K) and supports setting context via OLLAMA_CONTEXT_LENGTH=... when serving. (docs.ollama.com)
- OpenAI-compat endpoints are supported (e.g. /v1/chat/completions). (docs.ollama.com)
Citations:
- 1: https://console.groq.com/docs/openai?utm_source=openai
- 2: https://docs.together.ai/docs/openai-api-compatibility?utm_source=openai
- 3: https://docs.ollama.com/api/openai-compatibility?utm_source=openai
- 4: https://groq.com/blog/groqcloud-now-offers-qwen-2-5-32b-and-deepseek-r1-distill-qwen-32b?utm_source=openai
- 5: https://groq.com/blog/groqcloud-now-offers-qwen-2-5-32b-and-deepseek-r1-distill-qwen-32b?utm_source=openai
- 6: https://groq.com/blog/day-zero-support-for-openai-open-models?utm_source=openai
- 7: https://docs.together.ai/docs/serverless-models?utm_source=openai
- 8: https://docs.together.ai/docs/serverless-models?utm_source=openai
- 9: https://docs.ollama.com/context-length?utm_source=openai
- 10: https://docs.ollama.com/api/openai-compatibility?utm_source=openai
Clarify CODEX context window claim and OPENAI_COMPAT variability.
Current values are largely accurate for 2025:
- ANTHROPIC 200K: ✅ Correct (Claude 3.5 standard context)
- OPENAI 128K: ✅ Correct (GPT-4o documented context)
- CODEX 400K: The OpenAI Codex model has 400K context, but the comment "GitHub Copilot Codex models (Responses API)" is ambiguous—GitHub Copilot Chat uses 64K (based on GPT-4o), while OpenAI's Codex model itself has 400K. Clarify which service this constant represents.
- OPENAI_COMPAT 128K: Reasonable baseline (Groq, Together AI offer 128K), but Ollama's context window is configurable (4K–256K range depending on hardware/settings). Consider documenting that 128K is a nominal default for this category.
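One way the clarified comments could read is sketched below. This is an illustration under stated assumptions, not the PR's code: the type is named `ProviderCapabilitiesSketch` to avoid implying it is the real class, the component names other than `contextWindowTokens` are guesses inferred from the Javadoc in the diff, and `reserved` is a placeholder for the fourth `int` argument, whose meaning the diff does not show.

```java
// Sketch only. Component names other than contextWindowTokens are guesses
// inferred from the Javadoc in the diff; `reserved` is a placeholder for the
// fourth (int) argument, whose meaning the diff does not show.
record ProviderCapabilitiesSketch(
        boolean extendedThinking,
        boolean promptCaching,
        boolean imageInput,
        int reserved,
        int contextWindowTokens) {

    /** Anthropic Claude: full feature support, 200K context. */
    static final ProviderCapabilitiesSketch ANTHROPIC =
            new ProviderCapabilitiesSketch(true, true, true, 0, 200_000);

    /** OpenAI (GPT-4o, o1, etc.): image support only, 128K context. */
    static final ProviderCapabilitiesSketch OPENAI =
            new ProviderCapabilitiesSketch(false, false, true, 0, 128_000);

    /**
     * Generic OpenAI-compatible providers (Groq, Together, Ollama, etc.).
     * 128K is a nominal default; the actual window varies by vendor and,
     * for Ollama, by local configuration.
     */
    static final ProviderCapabilitiesSketch OPENAI_COMPAT =
            new ProviderCapabilitiesSketch(false, false, false, 0, 128_000);

    /** OpenAI Codex / Responses API: 400K context. */
    static final ProviderCapabilitiesSketch CODEX =
            new ProviderCapabilitiesSketch(false, false, false, 0, 400_000);

    public static void main(String[] args) {
        // Print the window each constant advertises.
        System.out.println("ANTHROPIC: " + ANTHROPIC.contextWindowTokens());
        System.out.println("CODEX: " + CODEX.contextWindowTokens());
    }
}
```

Because an explicit config value still overrides these defaults, users on providers whose real window differs from the nominal 128K (for example a small local Ollama setup) can set the value themselves.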
Greptile Summary
This PR implements provider-aware context window auto-detection.
The implementation is clean, well-tested, and backward compatible (explicit config still wins). The auto-detection logic is straightforward and correctly propagated throughout the daemon initialization.
Confidence Score: 5/5
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
Start[Daemon Startup] --> LoadConfig[Load AceClawConfig]
LoadConfig --> CheckConfig{config value > 0?}
CheckConfig -->|Yes| UseConfig[Use explicit config]
CheckConfig -->|No DEFAULT=0| CreateClient[Create LlmClient]
CreateClient --> GetCap[Get capabilities]
GetCap --> ExtractCtx[Extract contextWindowTokens]
ExtractCtx --> UseProvider[Use provider default]
UseConfig --> SetCtx[contextWindow variable]
UseProvider --> SetCtx
SetCtx --> LogSrc[Log source: config or auto-detected]
LogSrc --> UseBudget[SystemPromptBudget]
UseBudget --> UseCompact[CompactionConfig]
UseCompact --> UseRouter[router.setProviderInfo]
UseRouter --> Ready[Daemon Ready]
Last reviewed commit: 704d07f
// 5b. Inject skill descriptions into system prompt so the LLM knows
// what each skill does and when to invoke it proactively.
if (!skillRegistry.isEmpty()) {
    String skillDescriptions = skillRegistry.formatDescriptions();
    if (!skillDescriptions.isEmpty()) {
        systemPrompt = systemPrompt + "\n\n" + skillDescriptions;
    }
}
Unrelated change: the skill description injection (lines 247-254) is not mentioned in the PR description and should go in a separate commit.
Summary
- Add `contextWindowTokens` field to `ProviderCapabilities` with per-provider defaults (Anthropic 200K, OpenAI 128K, CODEX 400K, OPENAI_COMPAT 128K)
- Change `AceClawConfig.DEFAULT_CONTEXT_WINDOW` from 200K to 0 (auto-detect)
- Daemon resolves the effective context window from `llmClient.capabilities()` when config doesn't set it explicitly (value > 0 wins)
Test plan
- `ProviderCapabilitiesTest`: verifies all 4 constants + custom construction
- `./dev.sh copilot` now shows `context xx/400K` instead of `xx/200K`
Closes #30