chore: bump planner threshold 3→4 (follow-up to #467) #468
Initial drop to 3 in #467 went too far the other way: every single "refactor X" / "extract Y" hit threshold 3 and triggered a planner LLM call before any actual work, even on trivial prompts (REFACTORING regex matches "extract" too, so single-line extractions also got caught).

4 is the calibrated middle ground:

- Threshold 5: needed two explicit signals → planner never fired
- Threshold 3: any +3 signal alone triggered → too noisy
- Threshold 4: single-signal +3 stays plain ReAct, +3 with ANY second signal (long description, second action, multiple files, testing, …) triggers planning

Users who want planning on a borderline single-signal prompt now use the /plan slash command (also from #467), which is exactly the escape hatch that makes this trade-off acceptable.

Updated test: refactoring_singleSignalDoesNotPlan_atDefaultThreshold (was refactoring_highScore; the shouldPlan assertion is flipped to false). New test refactoring_plusSecondSignal_plans pins the "+3 with anything else triggers" rule so future threshold tweaks have to update an explicit assertion rather than silently drift.

All test suites green: aceclaw-core, aceclaw-cli, aceclaw-daemon.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
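The calibration described above can be sketched as a small scoring check. This is a hypothetical illustration, not the code in `ComplexityEstimator`: the class name, the `Signal` enum, and the +1 weight for each secondary signal are assumptions. The description only states that a refactoring signal scores +3, that a score of 3 triggered planning at threshold 3 (so the comparison is taken to be `>=`), and that any second signal is enough to reach threshold 4.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch of the threshold rule; names and secondary weights are assumed.
final class PlannerThresholdSketch {

    // Only the +3 for REFACTORING comes from the PR description.
    // The +1 weights are placeholders for "any second signal".
    enum Signal {
        REFACTORING(3),
        LONG_DESCRIPTION(1),
        SECOND_ACTION(1),
        MULTIPLE_FILES(1),
        TESTING(1);

        final int weight;

        Signal(int weight) {
            this.weight = weight;
        }
    }

    // Sums the weights of all signals detected in a prompt.
    static int score(Set<Signal> signals) {
        int total = 0;
        for (Signal s : signals) {
            total += s.weight;
        }
        return total;
    }

    // Plans when the score reaches the threshold, or when the user forced
    // planning (modelled here as a flag standing in for the /plan command).
    static boolean shouldPlan(Set<Signal> signals, int threshold, boolean forcedByPlanCommand) {
        return forcedByPlanCommand || score(signals) >= threshold;
    }

    public static void main(String[] args) {
        Set<Signal> refactorOnly = EnumSet.of(Signal.REFACTORING);
        Set<Signal> refactorPlusTests = EnumSet.of(Signal.REFACTORING, Signal.TESTING);

        System.out.println(shouldPlan(refactorOnly, 3, false));      // true: the noisy behaviour at threshold 3
        System.out.println(shouldPlan(refactorOnly, 4, false));      // false: stays plain ReAct
        System.out.println(shouldPlan(refactorPlusTests, 4, false)); // true: second signal reaches 4
        System.out.println(shouldPlan(refactorOnly, 4, true));       // true: forced planning bypasses the score
    }
}
```

Under these assumed weights, the two tests named in the description correspond to the second and third calls in `main`.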
Review Summary by Qodo
Bump planner threshold 3→4 to reduce false-positive planning triggers
Walkthroughs
Description
• Bump planner complexity threshold from 3 to 4
• Single +3 signals no longer trigger planning automatically
• Adding any second signal to a +3 prompt enables planning
• Updated tests to pin threshold behavior and prevent silent regressions
Diagram
flowchart LR
A["Single +3 signal<br/>e.g. refactor X"] -->|"Threshold 3"| B["Plan triggered<br/>too noisy"]
A -->|"Threshold 4"| C["Plain ReAct<br/>no LLM call"]
D["Single +3 signal<br/>+ second signal<br/>e.g. refactor + tests"] -->|"Threshold 4"| E["Plan triggered<br/>appropriate"]
F["/plan command"] -->|"Force planning"| E
File Changes
1. aceclaw-core/src/main/java/dev/aceclaw/core/planner/ComplexityEstimator.java
Code Review by Qodo
1. PlannerThreshold Javadoc outdated
     * Default complexity score for triggering the planner. Bumped
     * from 5 → 4 (initially landed at 3, dialled back after review).
     *
     * <p>Threshold 5 required two explicit signals — most everyday
     * agentic prompts hit at most one, so the planner essentially
     * never fired. Threshold 3 went too far the other way: every
     * single "refactor X" / "extract Y" (REFACTORING regex matches
     * "extract" too) triggered a planner LLM call before any actual
     * work, even on trivial prompts.
     *
     * <p>Threshold 4 is the middle ground: single-signal +3 prompts
     * ("refactor X" alone) stay as plain ReAct, but adding ANY
     * second signal (a long description, a second action, multiple
     * files, testing, …) flips on planning. Users who explicitly
     * want the planner on a borderline prompt can use
     * {@code /plan <prompt>} as the escape hatch — that bypasses
     * this heuristic entirely.
     *
     * <p>See {@link ComplexityEstimator} for the score table.
     */
    private static final int DEFAULT_PLANNER_THRESHOLD = 4;
    private static final boolean DEFAULT_ADAPTIVE_REPLAN_ENABLED = true;
1. PlannerThreshold Javadoc outdated 🐞 Bug ⚙ Maintainability
AceClawConfig.plannerThreshold() Javadoc still claims the default is 5 even though this PR changes DEFAULT_PLANNER_THRESHOLD to 4, so public-facing documentation becomes incorrect and can cause wrong tuning expectations.
Agent Prompt
### Issue description
`AceClawConfig.DEFAULT_PLANNER_THRESHOLD` is now 4, but the Javadoc for `plannerThreshold()` still says the default is 5.
### Issue Context
This mismatch is user-facing (API docs) and can lead to incorrect assumptions when configuring or debugging planner behavior.
### Fix Focus Areas
- aceclaw-daemon/src/main/java/dev/aceclaw/daemon/AceClawConfig.java[1022-1036]
- aceclaw-daemon/src/main/java/dev/aceclaw/daemon/AceClawConfig.java[77-99]
### Suggested change
Update the `plannerThreshold()` Javadoc to state the correct default (4), ideally referencing the constant (e.g., `Defaults to {@value #DEFAULT_PLANNER_THRESHOLD}.`).
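A minimal sketch of the suggested Javadoc fix, assuming a simplified stand-in for `AceClawConfig`. The class name `AceClawConfigSketch`, the constructor, and the `plannerThresholdOverride` field are hypothetical; only `plannerThreshold()`, `DEFAULT_PLANNER_THRESHOLD`, and the value 4 come from the review. The standard Javadoc `{@value}` inline tag renders the constant's value, so the documented default follows the constant instead of being hard-coded in prose. The constant is declared package-private here; how the tag renders against the private constant in the real class should be checked against the generated javadoc.

```java
// Hypothetical stand-in for AceClawConfig; only the constant name, its value,
// and the accessor name are taken from the review comment.
final class AceClawConfigSketch {

    /** Default complexity score for triggering the planner. */
    static final int DEFAULT_PLANNER_THRESHOLD = 4;

    // Assumed representation of a user-supplied override; null means "use the default".
    private final Integer plannerThresholdOverride;

    AceClawConfigSketch(Integer plannerThresholdOverride) {
        this.plannerThresholdOverride = plannerThresholdOverride;
    }

    /**
     * Returns the complexity score at which the planner is triggered.
     * Defaults to {@value #DEFAULT_PLANNER_THRESHOLD}.
     */
    int plannerThreshold() {
        return plannerThresholdOverride != null
                ? plannerThresholdOverride
                : DEFAULT_PLANNER_THRESHOLD;
    }
}
```

With this shape, a later threshold change only touches the constant, and the accessor's documentation cannot drift the way the reviewed Javadoc did.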
Summary
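Calibration follow-up to #467. Initial drop to 3 was too noisy: every single `refactor X` / `extract Y` triggered a planner LLM call before any actual work (the REFACTORING regex matches `extract` too, so single-line extractions also got caught). 4 is the middle ground:
- Threshold 5: needed two explicit signals, so the planner never fired
- Threshold 3: any +3 signal alone triggered planning, which was too noisy
- Threshold 4: a single +3 signal stays plain ReAct; +3 with any second signal triggers planning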
Users who want planning on a borderline single-signal prompt use `/plan <prompt>` (also from #467); that's the escape hatch that makes this trade-off OK.
Test plan
🤖 Generated with Claude Code