feat(attribution): per-source LLM request breakdown on JSON-RPC + /status (PR B, #419)#421
Conversation
… (PR B, #419) Second slice of #419 (epic #418). Takes the attribution data PR A plumbed through the core agent types and: 1. Serializes it onto the JSON-RPC agent.prompt result. Every usage node now carries an additive `llmRequestsBySource` field (map of lowercase source name to count) alongside the existing scalar `llmRequests`. Invariant: `sum(values) == llmRequests`. 2. Lets the CLI accumulate per-source totals in ContextMonitor, fed from that new field. 3. Renders an inline breakdown on `/status`: LLM requests: 8 (main=5 planner=2 cont=1) Short labels via a local display map; empty suffix when the daemon didn't send the field (old-daemon compatibility) or all counts are zero. Protocol change is strictly additive — old CLIs ignore the new field and keep reading the scalar; new CLIs on old daemons get an empty breakdown and render the scalar alone. Notable behavioral fix caught along the way: PR A pointed out the MessageCompactor summary call was uncounted in llmRequestCount. Its sibling, the LLMTaskPlanner request on planned flows, had the same issue: `llmRequests` reflected only stepResults' counts, not the planner's one-shot. This PR threads plannerAttribution through that call site and merges it into the plan's serialized usage, so both the scalar and the per-source map match real API cost. Resumed plans (`plan.resume` flow) don't re-run the planner — their plan was already generated and checkpointed — so that path reports PlanExecutionResult.requestAttribution() directly. Tests: - ContextMonitorTest: * recordLlmRequestsBySourceAccumulatesPerSource — two turns' breakdowns sum per source. * recordLlmRequestsBySourceIgnoresNullEmptyAndInvalidValues — null, empty, blank keys, zero/negative values don't corrupt state. * totalLlmRequestsBySourceReturnsUnmodifiableSnapshot. - TerminalReplTest: * status_renderBreakdownOnlyWhenDaemonSuppliedPerSourceMap — no per-source map → scalar-only line, no trailing `()`. * status_rendersInlineBreakdownFromPerSourceMap — end-to-end: feed monitor a map, /status renders `main=5 planner=2 cont=1`. * formatLlmRequestsBreakdown_roundTripsKnownSources — ordered formatter output with stable key ordering. * formatLlmRequestsBreakdown_emptyOrZeroReturnsEmpty — all-zero collapses, null/empty return "". * formatLlmRequestsBreakdown_unknownSourcePassesThroughAsRawKey — forward-compat: daemon adding a new source doesn't silently drop. Refs #418, #419 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 41 minutes and 40 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
📝 WalkthroughWalkthroughThis PR adds per-source LLM request tracking across the CLI and daemon. ContextMonitor now records and exposes aggregated request counts by source using Changes
Sequence DiagramsequenceDiagram
participant Agent as StreamingAgentHandler<br/>(Daemon)
participant Planner as Planner
participant Attribution as RequestAttribution
participant CLI as TerminalRepl<br/>(CLI)
participant Monitor as ContextMonitor<br/>(Session)
Agent->>Attribution: Create plannerAttribution<br/>for planner's LLM call
Agent->>Planner: plan(prompt, toolDefs,<br/>plannerAttribution)
Planner->>Planner: Make LLM requests<br/>track in attribution
Planner-->>Agent: Return plan with<br/>attribution data
Agent->>Agent: Merge plan attribution<br/>with step results
Agent->>Agent: Emit usage with<br/>llmRequestsBySource<br/>from merged attribution
Agent-->>CLI: Send usage.llmRequestsBySource
CLI->>CLI: parseLlmRequestsBySource()<br/>extract per-source map
CLI->>Monitor: recordLlmRequestsBySource(map)
Monitor->>Monitor: Aggregate into<br/>LinkedHashMap<String,Long>
CLI->>Monitor: /status command
Monitor-->>CLI: totalLlmRequestsBySource()
CLI->>CLI: formatLlmRequestsBreakdown()<br/>render inline breakdown
CLI-->>Agent: Display "(main=X, planner=Y)"
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Suggested labels
Caution Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (2 errors, 1 warning)
✅ Passed checks (3 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Self-review cleanups on PR B (#421): - ContextMonitor: added proper imports for Collections, LinkedHashMap, and Map instead of fully qualifying them inline. The file already imports ArrayDeque / Locale, so the new types fit the existing style. - StreamingAgentHandler: dropped `java.util.Locale.ROOT` -> `Locale.ROOT` at line 608; Locale was already imported. (The existing ObjectNode FQCN at 1977 is intentional — the file's style is to not import ObjectNode, so I left my matching usage that way.) No behavior change; compiles clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9e9b0535ee
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| out.printf(" %sLLM requests:%s %d%s%n", MUTED, RESET, | ||
| contextMonitor.totalLlmRequests(), | ||
| formatLlmRequestsBreakdown(contextMonitor.totalLlmRequestsBySource())); |
There was a problem hiding this comment.
Hide breakdown when per-source map is only partially populated
/status always appends formatLlmRequestsBreakdown(totalLlmRequestsBySource()) whenever the map is non-empty, but totalLlmRequests is accumulated even for turns that had no llmRequestsBySource payload. In a mixed session (e.g., some turns from an older daemon, then reconnect to a newer daemon), this prints a partial breakdown next to the full scalar total, which misreports attribution (for example LLM requests: 8 (main=3) where 5 requests are unaccounted). Consider suppressing the suffix unless the per-source sum matches the scalar total, or adding an explicit unknown bucket.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
aceclaw-daemon/src/main/java/dev/aceclaw/daemon/StreamingAgentHandler.java (1)
668-674:⚠️ Potential issue | 🟠 MajorAccumulate
RequestAttributionin adaptive continuation to preserve per-source attribution.
runTurnWithAdaptiveContinuationaccumulatestotalLlmRequests(line 1082) but does not accumulateRequestAttribution. When the merged turn is created at lines 1143–1145, it omits therequestAttributionparameter, causing the per-source breakdown from individual segments to be lost. This means line 673's call towriteLlmRequestsBySource()will use an empty or unset attribution, and the/statusendpoint will not report per-source request counts for adaptive continuation.Accumulate
RequestAttributionalongside the scalar count, and pass the merged attribution when constructing the final turn:var totalRequestAttribution = RequestAttribution.builder(); // Inside the loop: totalRequestAttribution.merge(turn.requestAttribution()); // When creating merged turn: var mergedAttribution = totalRequestAttribution.build(); var mergedTurn = new dev.aceclaw.core.agent.Turn( mergedMessages, lastStopReason, usage, lastCompaction, maxIterationsReached, budgetExhausted, budgetExhaustionReason, totalLlmRequests, mergedAttribution);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@aceclaw-daemon/src/main/java/dev/aceclaw/daemon/StreamingAgentHandler.java` around lines 668 - 674, runTurnWithAdaptiveContinuation currently accumulates only totalLlmRequests but loses per-source RequestAttribution when creating the merged Turn; declare a RequestAttribution.Builder (e.g., totalRequestAttribution = RequestAttribution.builder()) before the loop, call totalRequestAttribution.merge(turn.requestAttribution()) inside the loop where totalLlmRequests is accumulated, then build the merged attribution (totalRequestAttribution.build()) and pass that as the requestAttribution argument when constructing the merged dev.aceclaw.core.agent.Turn (instead of omitting it) so writeLlmRequestsBySource(...) sees the correct per-source counts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@aceclaw-cli/src/main/java/dev/aceclaw/cli/ContextMonitor.java`:
- Around line 180-185: recordLlmRequestsBySource currently merges the incoming
map keys as-is which breaks the lowercase key contract; update
recordLlmRequestsBySource in ContextMonitor to normalize each source before
using it (e.g., trim() and toLowerCase(Locale.ROOT)) and then perform the
null/blank and count checks and merge into totalLlmRequestsBySource using the
normalizedKey; ensure you reference the same Long::sum merge and keep the
synchronized method semantics.
In `@aceclaw-cli/src/main/java/dev/aceclaw/cli/TerminalRepl.java`:
- Around line 1233-1243: The method formatLlmRequestsBreakdown currently prints
map keys directly (via LLM_REQUEST_SOURCE_DISPLAY.getOrDefault(...)), which can
include control/ANSI characters; update formatLlmRequestsBreakdown to sanitize
the resolved label before appending it to parts—trim it, replace or remove
ANSI/control sequences (e.g., strip characters matching control/escape patterns
like \u001B and other \p{Cntrl} chars), and fall back to a safe placeholder
(e.g., "unknown") if the sanitized label is empty; keep all other behavior the
same so counts remain unchanged.
---
Outside diff comments:
In `@aceclaw-daemon/src/main/java/dev/aceclaw/daemon/StreamingAgentHandler.java`:
- Around line 668-674: runTurnWithAdaptiveContinuation currently accumulates
only totalLlmRequests but loses per-source RequestAttribution when creating the
merged Turn; declare a RequestAttribution.Builder (e.g., totalRequestAttribution
= RequestAttribution.builder()) before the loop, call
totalRequestAttribution.merge(turn.requestAttribution()) inside the loop where
totalLlmRequests is accumulated, then build the merged attribution
(totalRequestAttribution.build()) and pass that as the requestAttribution
argument when constructing the merged dev.aceclaw.core.agent.Turn (instead of
omitting it) so writeLlmRequestsBySource(...) sees the correct per-source
counts.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: bee45d1c-d951-432a-a995-d611ac511891
📒 Files selected for processing (5)
aceclaw-cli/src/main/java/dev/aceclaw/cli/ContextMonitor.javaaceclaw-cli/src/main/java/dev/aceclaw/cli/TerminalRepl.javaaceclaw-cli/src/test/java/dev/aceclaw/cli/ContextMonitorTest.javaaceclaw-cli/src/test/java/dev/aceclaw/cli/TerminalReplTest.javaaceclaw-daemon/src/main/java/dev/aceclaw/daemon/StreamingAgentHandler.java
Addresses CodeRabbit's three findings on #421: 1. MAJOR (outside-diff): runTurnWithAdaptiveContinuation accumulated the scalar llmRequestCount across segments but not the per-source RequestAttribution. When the merged Turn was built, its requestAttribution defaulted to empty — so for exactly the turns that ran long enough to split into segments, /status lost the breakdown and showed only the scalar total. Fixed by carrying a RequestAttribution.Builder through the segment loop, merging each turn.requestAttribution() as we go, and passing the build() result to the merged Turn's canonical constructor. 2. INLINE: ContextMonitor.recordLlmRequestsBySource now normalizes incoming keys via trim() + toLowerCase(Locale.ROOT) before merging into the running total. Defense-in-depth against a future daemon shipping mixed-case keys — the current daemon always sends lowercase so the normalization is a no-op today. 3. INLINE: formatLlmRequestsBreakdown now routes labels through sanitizeTerminalText, stripping ANSI / control chars from unknown source keys that pass through from the wire. Known labels are already clean but go through the same path for consistency. Empty sanitized labels fall back to "unknown" instead of "=N". Tests: - ContextMonitorTest.recordLlmRequestsBySourceNormalizesKeyCase — "MAIN_TURN" + "main_turn" + "Main_Turn" collapse to one bucket. - TerminalReplTest.formatLlmRequestsBreakdown_sanitizesControlCharsInUnknownKeys — ANSI escapes are stripped; all-escape key collapses to "unknown". No new daemon-side test for the adaptive-continuation attribution preservation: exercising that path requires provoking a multi-segment turn which needs the segmented-loop test infra that doesn't currently exist. The fix is a three-line mechanical change (declare builder, merge per iteration, pass to canonical ctor) verifiable by inspection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
PR B of #419 (epic #418). Takes the attribution data PR A (merged in #420) plumbed through the core agent types and makes it user-visible:
Before / After
Before (PR A only):
```
LLM requests: 8
```
After:
```
LLM requests: 8 (main=5 planner=2 cont=1)
```
When talking to an older daemon (or on a fresh session with no requests yet), the parenthetical is omitted — just the scalar.
Backward compatibility
Protocol change is strictly additive:
Invariant locked in tests: `sum(llmRequestsBySource.values()) == llmRequests` on every payload.
Behavioral fix caught along the way
PR A's review surfaced that the compaction summary LLM call was uncounted in `llmRequestCount`. Its sibling — the `LLMTaskPlanner` request on planned flows — had the same bug: the scalar reported only stepResults' counts, not the planner's one-shot. This PR threads `plannerAttribution` through the planner call site and merges it into the plan's serialized usage, so both the scalar and the per-source map match real API cost.
Resumed plans (`plan.resume` flow) don't re-run the planner — their plan was already generated and checkpointed — so that path reports `PlanExecutionResult.requestAttribution()` directly.
Wire format
```json
{
"usage": {
"inputTokens": 1234,
"outputTokens": 567,
"totalTokens": 1801,
"llmRequests": 8,
"llmRequestsBySource": {
"main_turn": 5,
"planner": 2,
"continuation": 1
}
}
}
```
Keys match the lowercase enum names. Short display names (`main`, `cont`, `compact`) live only in the CLI's display map; the wire keeps the full name so external telemetry can correlate exactly with the daemon's `RequestSource` enum.
Scope boundaries
Test plan
Refs #418, #419
🤖 Generated with Claude Code
Summary by CodeRabbit