
fix(core): never let shell exit results hang on the output drain (#25166) #27842

Open
MartinCajiao wants to merge 4 commits into google-gemini:main from MartinCajiao:fix/shell-exit-stuck-awaiting-input

Conversation

@MartinCajiao MartinCajiao commented Jun 11, 2026


TLDR

Shell commands could complete while the CLI stayed stuck showing the shell as awaiting input (#25166). The exit result of a PTY execution is gated on the output-processing chain, and that gate had no error handling and no bound: a single chunk that threw anywhere in the rendering pipeline — or whose xterm write callback was never invoked — left the execution unresolved forever. The tool call then never left executing, so activeBackgroundExecutionId stayed set and the UI kept reporting an active shell after the process had already exited.

Failure chain

  1. useShellInactivityStatus shows the awaiting/focus state while activePtyId is set.
  2. For model-initiated commands, activePtyId derives from the executing tool call (useGeminiStream.ts): it clears only when the tool's result promise settles.
  3. That promise settles only in finalize() inside ptyProcess.onExit (shellExecutionService.ts), which ran exclusively through:
    Promise.race([processingChain.then(() => 'processed'), abortFired]).then(() => {
      finalize();
    });
  4. Three structural holes:
    • a rejected processingChain rejects the race, and with no rejection handler finalize() never runs (the CLI's global unhandledRejection handler logs and continues, so this manifests as a silent hang, not a crash);
    • a chunk whose headlessTerminal.write callback never fires (xterm swallows callbacks on disposed/paused terminals; Windows ConPTY keeps flushing data after exit while the PTY is destroyed immediately on exit) leaves the chain pending forever, and no timeout existed;
    • finalize() itself could throw (render(true), final serialization), skipping completeWithResult.
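
The first hole can be reproduced in isolation. The sketch below is not the service's code: `settleDrain` and `DrainOutcome` are hypothetical names, and the `.catch` on the unguarded race exists only so the demo can record that the callback was skipped (the original had no handler at all). It shows a rejected chain skipping the `.then` callback, and a guarded variant that maps rejection to a value so the caller can always finalize.

```typescript
// Hypothetical stand-ins; only the Promise.race shape comes from the PR text.
type DrainOutcome = 'processed' | 'aborted' | 'chain-failed';

function settleDrain(
  processingChain: Promise<void>,
  abortFired: Promise<void>,
): Promise<DrainOutcome> {
  // A rejected chain is mapped to a value, so the race can no longer reject.
  const processed = processingChain.then(
    (): DrainOutcome => 'processed',
    (): DrainOutcome => 'chain-failed',
  );
  const aborted = abortFired.then((): DrainOutcome => 'aborted');
  return Promise.race([processed, aborted]);
}

async function demo(): Promise<string[]> {
  const log: string[] = [];
  const never = new Promise<void>(() => {}); // abort never fires
  const poisoned = Promise.reject(new Error('render failed'));

  // Unguarded shape: the rejection propagates past the .then callback.
  await Promise.race([poisoned.then(() => 'processed'), never])
    .then(() => log.push('unguarded: finalize'))
    .catch(() => log.push('unguarded: finalize skipped'));

  // Guarded shape: an outcome is always produced, so finalize can always run.
  const outcome = await settleDrain(poisoned, never);
  log.push(`guarded: finalize after ${outcome}`);
  return log;
}
```

Mapping the rejection inside the race input, rather than adding a `.catch` after the race, keeps a single code path that calls `finalize()`.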

What changed

Commit 1 — pure correctness, no behavior change on the happy path:

  • every output-chunk link settles even if it throws (try/catch around the chunk executor, try/finally around the write callback), with a debug log instead of a poisoned chain;
  • the drain race treats a rejected chain as drained and calls finalize() on both race outcomes;
  • finalize() is idempotent and throw-proof end to end: a failure while rendering or serializing the final buffer degrades the captured output instead of hanging the execution;
  • the deferred (debounced) render is guarded: it runs in a 68ms timer outside any caller's try/catch, so a throw there became an uncaught exception that killed the whole CLI. This is a sibling failure mode of the same unguarded rendering pipeline, surfaced by the regression tests for this change.
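
As an illustration of the "every link settles" rule, here is a minimal sketch under stated assumptions: `OutputChain`, `WriteFn` and `afterWrite` are hypothetical names, and the `write` function only imitates a terminal write that takes a completion callback. It is not the code in shellExecutionService.ts.

```typescript
// Hypothetical model of a terminal write: calls `done` once the data is parsed.
type WriteFn = (data: string, done: () => void) => void;

class OutputChain {
  private chain: Promise<void> = Promise.resolve();
  settledCount = 0;
  readonly errors: unknown[] = [];

  constructor(
    private readonly write: WriteFn,
    // Hypothetical per-chunk work after the write (e.g. rendering); may throw.
    private readonly afterWrite: (data: string) => void = () => {},
  ) {}

  enqueue(chunk: string): void {
    this.chain = this.chain.then(
      () =>
        new Promise<void>((resolve) => {
          let settled = false;
          // Resolve exactly once; record the error instead of rejecting.
          const settle = (err?: unknown): void => {
            if (settled) return;
            settled = true;
            if (err !== undefined) this.errors.push(err);
            this.settledCount++;
            resolve();
          };
          try {
            this.write(chunk, () => {
              try {
                this.afterWrite(chunk);
              } catch (err) {
                settle(err);
              } finally {
                settle(); // no-op if the catch branch already settled
              }
            });
          } catch (err) {
            settle(err); // the chunk executor itself threw
          }
        }),
    );
  }

  drained(): Promise<void> {
    return this.chain;
  }
}
```

Because a link never rejects, a later chunk still runs after an earlier one fails. A callback that is never invoked still leaves its link pending, which is the case the bounded drain in Commit 2 exists for.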

Commit 2 — bounded drain (idle watchdog):

  • after exit, if no chunk settles for a full DRAIN_STALL_TIMEOUT_MS window (2s, polled at 250ms, unref'd and cleared on finalize), the execution finalizes with the output buffered so far and logs a warning. The watchdog is idle-based — every settled chunk resets the window — so a slow but advancing drain (large final bursts against a 300k-line scrollback) is never cut short; only a genuinely stuck chain trips it.
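
The idle-based behaviour can be sketched as follows. Only `DRAIN_STALL_TIMEOUT_MS` and the 2s / 250ms values come from the description above; `DrainWatchdog`, `DRAIN_POLL_INTERVAL_MS`, `noteProgress` and the injectable `now` parameter are assumptions made for illustration and testability.

```typescript
import { performance } from 'node:perf_hooks';
import { setInterval, clearInterval } from 'node:timers';

const DRAIN_STALL_TIMEOUT_MS = 2000; // value from the PR description
const DRAIN_POLL_INTERVAL_MS = 250; // hypothetical name; value from the PR description

class DrainWatchdog {
  private lastProgress: number;
  private timer: ReturnType<typeof setInterval> | undefined;

  constructor(
    private readonly onStall: () => void,
    private readonly now: () => number = () => performance.now(),
  ) {
    this.lastProgress = this.now();
  }

  // Call whenever a chunk settles: this is what makes the watchdog idle-based.
  noteProgress(): void {
    this.lastProgress = this.now();
  }

  // One poll step; returns true if the stall handler fired.
  poll(): boolean {
    if (this.now() - this.lastProgress < DRAIN_STALL_TIMEOUT_MS) return false;
    this.stop();
    this.onStall();
    return true;
  }

  start(): void {
    if (this.timer) return;
    this.lastProgress = this.now();
    this.timer = setInterval(() => this.poll(), DRAIN_POLL_INTERVAL_MS);
    this.timer.unref(); // never keep the process alive just for the watchdog
  }

  // Intended to be called from finalize() as well.
  stop(): void {
    if (this.timer) {
      clearInterval(this.timer);
      this.timer = undefined;
    }
  }
}
```

A drain that takes longer than 2s in total but keeps calling `noteProgress()` never trips `poll()`; only a full window without progress does.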

Commit 3 — monotonic clock (review feedback):

  • the watchdog measures elapsed time with performance.now() so wall-clock adjustments (NTP, VM migration) can neither fire it prematurely nor delay it; Date.now() remains only where genuine wall-clock timestamps are wanted.
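
To make the clock argument concrete, the sketch below measures elapsed time through an injectable clock. `createStopwatch` is a hypothetical helper, not something from the PR; the fake wall clock only simulates a forward step such as an NTP correction.

```typescript
import { performance } from 'node:perf_hooks';

// Elapsed-time helper; defaults to the monotonic clock.
function createStopwatch(now: () => number = () => performance.now()) {
  let mark = now();
  return {
    reset(): void {
      mark = now();
    },
    elapsedMs(): number {
      return now() - mark;
    },
  };
}

// Simulated wall clock that is stepped forward by one hour.
let fakeWallClock = 1_000_000;
const wallBased = createStopwatch(() => fakeWallClock);
fakeWallClock += 3_600_000;
// A 2s stall check against this reading would fire immediately,
// even though no real time has passed.
const wallElapsed = wallBased.elapsedMs();

// The monotonic reading is unaffected by wall-clock steps and never negative.
const monotonic = createStopwatch();
const monoElapsed = monotonic.elapsedMs();
```

A backward step has the opposite effect on a wall-clock stopwatch: the elapsed value goes negative and the timeout is delayed by the size of the step.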

Commit 4 — same guarantee for the child_process fallback:

  • the fallback finalized only on close, which waits for the stdio pipes to end; a grandchild that inherits the pipes and outlives the shell (common on Windows) keeps close from ever firing even though exit was emitted — the same stuck-result family through the non-PTY path. From exit, the wait for close is now bounded the same way: stream activity resets the idle window (capturing the trailing flush), and a hard cap (POST_EXIT_DRAIN_CAP_MS, 10s) covers grandchildren that keep writing indefinitely. A prompt close still wins with no behavior change.
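
A rough sketch of the bounded wait, with several assumptions: `postExitDecision`, `boundCloseWait` and `FinalizeReason` are hypothetical names, the idle window is assumed to reuse the 2s drain constant, and the 250ms poll is borrowed from the PTY watchdog. Only `POST_EXIT_DRAIN_CAP_MS` (10s) and the exit/close semantics come from the description.

```typescript
import { EventEmitter } from 'node:events';
import { performance } from 'node:perf_hooks';
import { setInterval, clearInterval } from 'node:timers';

const IDLE_WINDOW_MS = 2000; // assumption: same window as the PTY drain
const POST_EXIT_DRAIN_CAP_MS = 10_000; // value from the PR description

type DrainDecision = 'wait' | 'idle-timeout' | 'hard-cap';
type FinalizeReason = 'close' | 'idle-timeout' | 'hard-cap';

// Pure decision step: the hard cap wins, then the idle window, else keep waiting.
function postExitDecision(
  now: number,
  exitAt: number,
  lastActivityAt: number,
): DrainDecision {
  if (now - exitAt >= POST_EXIT_DRAIN_CAP_MS) return 'hard-cap';
  if (now - Math.max(exitAt, lastActivityAt) >= IDLE_WINDOW_MS) return 'idle-timeout';
  return 'wait';
}

function boundCloseWait(
  child: EventEmitter,
  streams: EventEmitter[],
  finalize: (reason: FinalizeReason) => void,
): void {
  let finished = false;
  let timer: ReturnType<typeof setInterval> | undefined;
  let lastActivityAt = performance.now();

  // Idempotent: whichever of close / idle / cap comes first wins.
  const finish = (reason: FinalizeReason): void => {
    if (finished) return;
    finished = true;
    if (timer) clearInterval(timer);
    finalize(reason);
  };

  for (const stream of streams) {
    stream.on('data', () => {
      lastActivityAt = performance.now(); // trailing flush resets the idle window
    });
  }
  child.once('close', () => finish('close'));
  child.once('exit', () => {
    if (finished) return;
    const exitAt = performance.now();
    timer = setInterval(() => {
      const decision = postExitDecision(performance.now(), exitAt, lastActivityAt);
      if (decision !== 'wait') finish(decision);
    }, 250);
    timer.unref();
  });
}
```

In real use `child` would be the `ChildProcess` and `streams` its stdout and stderr; plain `EventEmitter` types keep the sketch self-contained.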

The exit result now always reaches the scheduler on both execution paths; in the worst pathological case the trailing output is degraded, never the exit code, and the stall is logged for diagnosis.
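
The "degrade the output, never the exit code" guarantee might look roughly like this. `createFinalizer`, `ExecutionResult`, `renderFinal` and `bufferedSoFar` are hypothetical names for illustration; only `completeWithResult` and the idempotent, throw-proof behaviour are taken from the description.

```typescript
interface ExecutionResult {
  exitCode: number | null;
  output: string;
  degraded: boolean; // hypothetical flag marking fallback output
}

function createFinalizer(
  exitCode: number | null,
  renderFinal: () => string, // final render and serialization; may throw
  bufferedSoFar: () => string, // cheaper fallback; may also throw
  completeWithResult: (result: ExecutionResult) => void,
): () => void {
  let finalized = false;
  return () => {
    if (finalized) return; // idempotent: race outcome and watchdog may both call
    finalized = true;
    let output = '';
    let degraded = false;
    try {
      output = renderFinal();
    } catch {
      degraded = true;
      try {
        output = bufferedSoFar();
      } catch {
        output = ''; // worst case: no output, but the exit code still arrives
      }
    }
    completeWithResult({ exitCode, output, degraded });
  };
}
```

Setting the `finalized` flag before any work runs means a second caller cannot re-enter while the first is still rendering.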

Tests

  • shellExecutionService.test.ts (existing harness, real headless terminal):
    • rendering throws while processing output → result still resolves with the exit code and the buffer-extracted output;
    • a chunk throws before reaching the terminal → result still resolves, warning logged.
  • shellExecutionService.drain.test.ts (new, controllable terminal mock):
    • a write callback that is never invoked → watchdog finalizes after the stall window, warning logged;
    • a slow drain that keeps making progress past the stall window in total time → never cut short, no warning;
    • nothing left to drain → resolves immediately, no watchdog side effects.
  • child_process fallback (existing harness):
    • exit fires but close never does (pipes held by a grandchild) → finalizes after the idle window with the output received, warning logged;
    • a grandchild keeps writing through the pipes → the hard cap ends the wait;
    • a prompt close → unchanged behavior, no warning.
  • Red/green: every hang-scenario test above times out against the unfixed code and passes with this change.

Known boundary (intentionally out of scope)

  • If node-pty never emits onExit, nothing in this file can recover — the watchdog lives inside the exit handler. Recovering from that would require process-liveness polling, and the codebase itself documents process.kill(pid, 0) returning false negatives for ConPTY-managed wrappers on Windows — a liveness watchdog built on that signal could finalize healthy long-running sessions. That variant, if it exists in the wild, needs separate evidence first.
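
For reference, the liveness signal in question is the conventional signal-0 probe. The helper below is a generic sketch (`isProcessAlive` is a hypothetical name, not code from this repository): in Node.js, `process.kill(pid, 0)` sends no signal, returning `true` when the process can be signalled and throwing otherwise.

```typescript
// Generic signal-0 probe; says nothing about ConPTY-managed wrappers.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0); // signal 0: existence check only, nothing is delivered
    return true;
  } catch (err) {
    // EPERM means the process exists but we may not signal it.
    // Anything else (typically ESRCH) is treated as "not running".
    return (err as NodeJS.ErrnoException).code === 'EPERM';
  }
}
```

A watchdog that finalized whenever this returned `false` would inherit every false negative of the probe, which is the risk described above.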

Fixes #25166

The exit result of a PTY execution is gated on the output processing
chain: finalize() - the only path that resolves the result - ran
exclusively through Promise.race(processingChain, abortFired) with no
rejection handling. A chunk that threw anywhere in the rendering
pipeline poisoned the chain, the race rejected, and finalize() never
ran. The tool call then stayed `executing` forever and the UI kept
reporting the shell as awaiting input after the process had already
exited (google-gemini#25166); the global unhandledRejection handler logs and
continues, so this manifested as a silent hang rather than a crash.

- Settle every output chunk even when it throws (try/catch around the
  chunk executor, try/finally around the terminal write callback),
  logging instead of poisoning the chain.
- Treat a rejected chain as drained and run finalize() on both race
  outcomes.
- Make finalize() idempotent and throw-proof: failures while rendering
  or serializing the final buffer degrade the captured output instead
  of skipping completeWithResult().
- Guard the deferred (debounced) render: it runs in a timer outside
  any caller try/catch, where a throw becomes an uncaught exception
  that kills the CLI.

Even with every chunk settling, the exit result still waits on the
output chain draining through the headless terminal, and a write
callback that is never invoked (xterm swallows callbacks on disposed
or paused terminals; Windows ConPTY keeps flushing data after exit
while the PTY is destroyed immediately) used to leave the execution
unresolved forever - the visible symptom of google-gemini#25166.

After exit, an idle watchdog now polls drain progress: every settled
chunk resets the window, so a slow but advancing drain (large final
bursts against a 300k-line scrollback) is never cut short, and only a
genuinely stuck chain trips it. When it fires, the execution finalizes
with the output buffered so far and logs a warning so field reports
can confirm which vector was hit. The interval is unref'd and cleared
by finalize().

@MartinCajiao MartinCajiao requested a review from a team as a code owner June 11, 2026 01:50
@gemini-code-assist

Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical issue where shell executions could hang indefinitely in an 'awaiting input' state after the process had actually exited. The root cause was an unguarded output-processing chain that would block the finalization of the execution if any chunk failed or if the terminal write callback was never invoked. The changes introduce a robust, fault-tolerant processing pipeline and an idle-based watchdog that guarantees the execution lifecycle completes, ensuring the UI accurately reflects the process status.

Highlights

  • Resilience in Output Processing: Added comprehensive error handling to the shell output processing chain, ensuring that exceptions during rendering or serialization do not leave the execution in a permanently hung state.
  • Exit Drain Watchdog: Implemented an idle-based watchdog timer that monitors the output drain process after a shell exits, forcing finalization if the drain stalls for more than 2 seconds.
  • Idempotent Finalization: Refactored the finalization logic to be idempotent and robust, ensuring the CLI always completes the execution lifecycle even if final rendering or serialization fails.
  • Regression Testing: Added new unit tests in shellExecutionService.drain.test.ts and regression cases in shellExecutionService.test.ts to verify behavior under stalled drain conditions and rendering failures.

@github-actions github-actions Bot added the size/l A large sized PR label Jun 11, 2026
github-actions Bot commented Jun 11, 2026


📊 PR Size: size/L

  • Lines changed: 683
  • Additions: +624
  • Deletions: -59
  • Files changed: 3

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces a post-exit output drain watchdog to resolve issues where shell executions could hang indefinitely due to stalled output processing or swallowed terminal write callbacks. It adds robust error handling around the rendering pipeline and chunk processing to ensure that failures do not block exit finalization. Additionally, new unit tests are added to verify exit finalization resilience and the watchdog behavior. The reviewer feedback suggests replacing Date.now() with performance.now() to provide a monotonic clock source, ensuring that system clock adjustments do not cause premature or delayed timeouts.

Comment thread packages/core/src/services/shellExecutionService.ts Outdated
Comment thread packages/core/src/services/shellExecutionService.ts Outdated
Addresses review feedback: Date.now() is wall-clock time, so an NTP
adjustment or VM migration could fire the stall watchdog prematurely
(clock jumps forward) or delay it (clock jumps backward).
performance.now() is monotonic and immune to clock adjustments. The
wall-clock Date.now() uses for history timestamps are untouched.
@MartinCajiao

Author

Both suggestions addressed in 5a0083b: the drain watchdog now uses the monotonic clock (performance.now()) for both the activity marker and the stall check, so NTP/wall-clock adjustments can neither fire it prematurely nor delay it. The Date.now() uses for history timestamps are intentionally unchanged (those are genuine wall-clock values). Full suite still green: 71/71, typecheck clean.

@gemini-cli gemini-cli Bot added priority/p1 Important and should be addressed in the near term. area/core Issues related to User Interface, OS Support, Core Functionality 🔒 maintainer only ⛔ Do not contribute. Internal roadmap item. labels Jun 11, 2026
The fallback path finalized only on close, which waits for the stdio
pipes to end. A grandchild that inherits the pipes and outlives the
shell (common on Windows) keeps close from ever firing even though
exit was emitted - the same stuck-result family as google-gemini#25166, through the
non-PTY path.

From exit, the wait for close is now bounded the same way the PTY
drain is: stream activity resets an idle window so the trailing flush
is captured, and a hard cap (POST_EXIT_DRAIN_CAP_MS) covers
grandchildren that keep writing indefinitely so they cannot hold the
exit result hostage. handleExit is idempotent and clears the watchdog;
a prompt close still wins with no behavior change on the happy path.
@MartinCajiao

Author

Scope update: pushed 4677496 closing one of the two boundaries originally declared out of scope.

The child_process fallback finalized only on close, which waits for the stdio pipes to end — a grandchild that inherits the pipes and outlives the shell (common on Windows) keeps close from ever firing even though exit was emitted: the same stuck-result family as #25166 through the non-PTY path. Since it shares the watchdog constants and test patterns with the PTY fix, it belongs in this PR rather than a separate one; happy to split it out if reviewers prefer.

Mechanism mirrors the PTY drain: from exit, stream activity resets an idle window (capturing the trailing flush) and a hard cap (POST_EXIT_DRAIN_CAP_MS, 10s) covers grandchildren that keep writing indefinitely. A prompt close still wins, with no behavior change on the happy path. Three new tests cover all three scenarios; the two hang scenarios time out against the unfixed code and pass with the change. Suite now 74/74, typecheck clean.

The remaining declared boundary (node-pty never emitting onExit) stays intentionally out of scope: a liveness watchdog would have to trust process.kill(pid, 0), which this codebase itself documents as returning false negatives for ConPTY-managed wrappers on Windows — a cure that could finalize healthy long-running sessions is worse than the disease without field evidence first.


Development

Successfully merging this pull request may close these issues.

Shell command execution gets stuck with "Waiting input" after command completes
