
ContextMonitor: single source of truth for context usage tracking & display #249

Description

@xinhuagu

Enhancement

Introduce a ContextMonitor component as the single source of truth for context window usage tracking.

Problems (merged from #246, #247)

Bug 1: Context bar disappears after turn completes (was #246)

ForegroundOutputSink.onTurnComplete() calls bottomBar.hide(), removing the bar entirely. Users expect it to persist.
File: ForegroundOutputSink.java:195

Bug 2: Usage percentage jumps erratically — e.g. 10% → 1% (was #247)

During a single turn, the context bar shows sudden drops (e.g. 10% → 1%).

Root cause investigation (2026-03-14):

Previous hypothesis: sub-agent context leaks (WRONG)

Sub-agent hypothesis was incorrect — the user confirmed no sub-agents were running when the jumps occurred.

TRUE Root Cause: message_start usage dropped, message_delta usage has no input_tokens

Anthropic streaming API sends usage in TWO different SSE events:

| SSE event | Contains | What it means |
|---|---|---|
| message_start | usage.input_tokens = full, accurate input token count | The REAL context occupation |
| message_delta | usage.output_tokens = final output count; input_tokens absent or 0 | Only output usage |
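
To make the difference between the two events concrete, here is a minimal sketch that probes two abbreviated payloads for input_tokens. The payload strings are illustrative samples shaped after the description above, not captured API output, and SseUsageProbe with its regex is hypothetical. Real code would use the JSON parser already behind message.path(...).

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SseUsageProbe {
    // Hypothetical helper: finds the first "input_tokens": <n> pair in a payload.
    private static final Pattern INPUT_TOKENS =
            Pattern.compile("\"input_tokens\"\\s*:\\s*(\\d+)");

    // Illustrative message_start payload: usage sits under "message".
    static final String MESSAGE_START = """
            {"type":"message_start","message":{"id":"msg_1","model":"example-model",
             "usage":{"input_tokens":20480,"output_tokens":1}}}
            """;

    // Illustrative message_delta payload: usage carries output_tokens only.
    static final String MESSAGE_DELTA = """
            {"type":"message_delta","delta":{"stop_reason":"end_turn"},
             "usage":{"output_tokens":512}}
            """;

    /** Returns input_tokens if the payload contains it, otherwise empty. */
    static OptionalInt inputTokens(String json) {
        Matcher m = INPUT_TOKENS.matcher(json);
        return m.find()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }
}
```

With these samples, only MESSAGE_START yields a value. A consumer that reads usage solely from message_delta has no input count to show.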

The bug (code path):

  1. AnthropicStreamSession.handleMessageStart() (lines 167-173) parses message_start but DROPS the usage field:

    handler.onMessageStart(new StreamEvent.MessageStart(id, model));
    // message.path("usage").path("input_tokens") is COMPLETELY IGNORED!
  2. StreamAccumulator.onMessageDelta() (lines 971-973) later receives message_delta and saves its usage:

    this.usage = event.usage();  // This has output_tokens but input_tokens=0!
  3. StreamingAgentLoop (line 297) — reads from accumulator:

    lastInputTokens = accumulator.usage.inputTokens();  // ALWAYS 0 or garbage!
    eventHandler.onUsageUpdate(lastInputTokens, totalInputTokens, totalOutputTokens);
  4. CLI bar receives a near-zero inputTokens and displays 0-1%
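
The code path above can be reproduced in isolation. The following is a simplified sketch, not the project's actual classes: DroppedUsageDemo, BuggyAccumulator, the two-field Usage record, and the 200,000-token context window are all assumptions made for illustration.

```java
final class DroppedUsageDemo {
    // Assumed minimal shape of the usage data; the real Usage type may differ.
    record Usage(int inputTokens, int outputTokens) {}

    /** Mirrors the buggy flow: message_start carries no usage, message_delta overwrites it. */
    static final class BuggyAccumulator {
        Usage usage = new Usage(0, 0);

        void onMessageStart(String id, String model) {
            // Usage from message_start never arrives here, so there is nothing to store.
        }

        void onMessageDelta(Usage deltaUsage) {
            this.usage = deltaUsage; // output_tokens only; input_tokens is 0
        }
    }

    /** Context occupation as a whole-number percentage of the window. */
    static int percent(int inputTokens, int contextWindow) {
        return (int) Math.round(100.0 * inputTokens / contextWindow);
    }

    /** Runs one simulated API call and returns what the bar would display. */
    static int barPercentAfterOneCall() {
        BuggyAccumulator acc = new BuggyAccumulator();
        acc.onMessageStart("msg_1", "example-model");
        acc.onMessageDelta(new Usage(0, 512));
        return percent(acc.usage.inputTokens(), 200_000); // assumed window size
    }
}
```

Under these assumptions the bar reads 0% after the call, while a real input count of 20,480 tokens would read 10%.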

Why it sometimes shows 10%: totalInputTokens accumulates across iterations, and in some code paths a non-zero value leaks through. The erratic jumps come from a race over which of these values reaches the bar display.

In summary: We are showing message_delta.usage.input_tokens (which is 0) instead of message_start.usage.input_tokens (which is the real value). The real input_tokens is dropped at parse time and never reaches the CLI.

Fix required

  1. StreamEvent.MessageStart — add Usage usage field
  2. AnthropicStreamSession.handleMessageStart() — parse and pass usage from message_start
  3. StreamAccumulator — capture input_tokens from onMessageStart, merge with output_tokens from onMessageDelta
  4. Bar will then show the accurate per-call context occupation

Real UX problems to fix

  1. Bar disappears after turn: hideBottomBar() is called, user loses context visibility
  2. input_tokens dropped: message_start.usage.input_tokens is parsed but discarded, so bar shows near-zero
  3. No compaction indicator: user sees sudden drops with no explanation

Current Architecture (problematic)

Context usage data is scattered across:

  • StreamingAgentLoop tracks lastInputTokens for compaction (but the value is wrong — see above)
  • BottomContextBar gets live updates via stream.usage during streaming only
  • TerminalRepl calculates from cumulative turn totals (fallback path)
  • No component knows real context usage between turns

Proposed Fix

Step 1: Fix the data pipeline (critical)

// StreamEvent.MessageStart — add usage
record MessageStart(String id, String model, Usage usage) implements StreamEvent {}

// AnthropicStreamSession.handleMessageStart — parse usage
Usage usage = mapper.parseUsage(message.path("usage"));
handler.onMessageStart(new StreamEvent.MessageStart(id, model, usage));

// StreamAccumulator — capture input_tokens from message_start
@Override
public void onMessageStart(StreamEvent.MessageStart event) {
    if (event.usage() != null) {
        this.startUsage = event.usage();  // has input_tokens
    }
    delegate.onMessageStart(event);
}

@Override
public void onMessageDelta(StreamEvent.MessageDelta event) {
    // Merge: input_tokens from message_start + output_tokens from message_delta
    this.usage = mergeUsage(this.startUsage, event.usage());
    delegate.onMessageDelta(event);
}
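
The snippet above calls mergeUsage without defining it. One possible shape is sketched below, assuming Usage is a record with just inputTokens and outputTokens (the real type may carry more fields). UsageMergeSketch is a hypothetical holder class used only to keep the example self-contained.

```java
final class UsageMergeSketch {
    // Assumed minimal shape of the usage data.
    record Usage(int inputTokens, int outputTokens) {}

    /**
     * Combines input_tokens from message_start with output_tokens from message_delta.
     * If one side is missing, the other is returned unchanged; if both are
     * missing, the result is null.
     */
    static Usage mergeUsage(Usage start, Usage delta) {
        if (start == null) {
            return delta;
        }
        if (delta == null) {
            return start;
        }
        return new Usage(start.inputTokens(), delta.outputTokens());
    }
}
```

For example, merging Usage(20480, 1) from message_start with Usage(0, 512) from message_delta gives Usage(20480, 512), so the accumulator keeps the real input count.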

Step 2: Keep bar visible after turn

Do not call bottomBar.hide() on turn complete. Let it persist showing last known context usage.
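
A rough sketch of that behavior, with stand-in types: the BottomBar and Sink classes below are hypothetical simplifications, not the real BottomContextBar or ForegroundOutputSink.

```java
final class PersistentBarSketch {
    /** Stand-in for the bottom bar; only tracks visibility and text. */
    static final class BottomBar {
        boolean visible;
        String text = "";

        void show(String newText) {
            visible = true;
            text = newText;
        }

        void hide() {
            visible = false;
        }
    }

    /** Stand-in for the output sink that owns the bar. */
    static final class Sink {
        final BottomBar bar = new BottomBar();
        private int lastPercent;

        void onUsageUpdate(int percent) {
            lastPercent = percent;
            bar.show("context " + percent + "%");
        }

        void onTurnComplete() {
            // Re-render the last known value instead of calling bar.hide().
            bar.show("context " + lastPercent + "%");
        }
    }
}
```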

Step 3: Sub-agent display (future, separate issue)

If sub-agents have context bars, show them as a tree structure under the main agent bar. Not in scope for this fix since the primary bug is the dropped input_tokens.

Acceptance Criteria

  • StreamEvent.MessageStart includes Usage from message_start SSE event
  • StreamAccumulator merges input_tokens (from message_start) with output_tokens (from message_delta)
  • Context bar shows accurate per-call context occupation percentage
  • Context bar persists after turn completes (not hidden)
  • ContextMonitor tracks real inputTokens from every API call
  • Warning logs at 70%, 85%, 95% thresholds
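
The last two criteria could be met by something like the sketch below. It is an assumption-laden illustration, not the proposed ContextMonitor API: the class name, recordCall, usagePercent, and the warn-once-per-crossing policy are all hypothetical, and the context window size is supplied by the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

final class ContextMonitorSketch {
    private static final Logger LOG =
            Logger.getLogger(ContextMonitorSketch.class.getName());
    private static final int[] THRESHOLDS = {70, 85, 95};

    private final int contextWindow;
    private int lastInputTokens;
    private int highestWarned; // highest threshold currently exceeded; 0 if none
    private final List<Integer> warnings = new ArrayList<>();

    ContextMonitorSketch(int contextWindow) {
        if (contextWindow <= 0) {
            throw new IllegalArgumentException("contextWindow must be positive");
        }
        this.contextWindow = contextWindow;
    }

    /** Records the real input_tokens of one API call and warns on a new threshold. */
    void recordCall(int inputTokens) {
        lastInputTokens = inputTokens;
        int pct = usagePercent();
        int crossed = 0;
        for (int t : THRESHOLDS) {
            if (pct >= t) {
                crossed = t;
            }
        }
        if (crossed > highestWarned) {
            LOG.warning("Context usage at " + pct + "% (threshold " + crossed + "%)");
            warnings.add(crossed);
        }
        // Falls back when usage drops (for example after compaction), so a
        // later climb past the same threshold warns again.
        highestWarned = crossed;
    }

    /** Whole-number percentage of the window used by the last call, rounded down. */
    int usagePercent() {
        return (int) Math.floor(100.0 * lastInputTokens / contextWindow);
    }

    /** Thresholds that triggered a warning so far, in order. */
    List<Integer> warnings() {
        return List.copyOf(warnings);
    }
}
```

With a 200,000-token window, calls of 20,480, 150,000, 152,000, and 192,000 input tokens would log warnings for the 70% and 95% thresholds only. A jump that skips past 85% reports just the highest threshold reached.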

Parent Epic

Part of #248

Subsumes

Closes #246, Closes #247

Metadata

Labels: bug, enhancement, p1, question
