feat: auto-notify pending skill drafts for human review #196

Description

@xinhuagu

Roadmap Position: Step 1 of 6 — current driver
Depends on: existing draft pipeline (SkillDraftGenerator -> ValidationGateEngine -> AutoReleaseController)
Enables: #128, #134, #131

Problem

The learning pipeline can already generate skill drafts automatically, but the user-facing loop is still weak.

Today the daemon can:

  • generate drafts from promoted candidates
  • validate drafts
  • evaluate drafts for auto release

But the user still has poor visibility into what happened.

Drafts are written under .aceclaw/skills-drafts/ and often remain there with disable-model-invocation: true. Validation and release decisions are also mostly hidden in audit/state files. This means the system has internal automation, but the human review loop is still thin.

So the main gap is not "there is no validation path". The main gap is:

  • users are not clearly notified when drafts are created
  • users cannot easily inspect validation/release state from the CLI
  • users cannot easily review, approve, reject, or revisit pending drafts

Why The Original Plan Needs Adjustment

The original direction was right, but some implementation details do not fit the current architecture well.

1. Notification should be event-driven first, not heartbeat-driven first

HeartbeatRunner is currently a scheduler companion that syncs HEARTBEAT.md into cron jobs. It is not the natural first trigger point for skill draft state changes.

Draft creation, validation verdict changes, and release stage changes already happen in the daemon's draft pipeline. Those are the right places to emit user-facing notifications.

Heartbeat can still be useful later as a low-frequency reminder like:

  • "you still have 3 unreviewed drafts"

But it should not be the main source of truth.

2. We should not create a second validation pipeline

The daemon already has:

  • draft generation
  • validation gate evaluation
  • auto release evaluation
  • release lifecycle state (SHADOW -> CANARY -> ACTIVE)

So Phase 2 should not introduce a separate background validation session flow as the new main path. That would risk duplicating logic and splitting the source of truth.

If we later add replay-based validation, it should become another validation gate input inside the existing pipeline, not a parallel pipeline.

3. Skill demotion should extend release control, not bypass it

The current auto-release controller already handles rollback behavior when guardrails fail. Any future demotion logic should extend that same release lifecycle instead of creating a separate skill health subsystem.

Goal

Close the human review loop for generated skill drafts without duplicating the existing draft/validation/release pipeline.

Updated Plan

Phase 1: User-visible notifications and CLI review surface

Add first-class user notifications for draft lifecycle changes.

Emit notifications when:

  • a new skill draft is created
  • a draft validation verdict changes (PASS, HOLD, BLOCK)
  • a skill release stage changes (SHADOW, CANARY, ACTIVE)

Implementation direction:

  • emit daemon-side events directly from the existing draft pipeline
  • surface them to the CLI through the existing notification path
  • keep heartbeat scanning only as a reminder fallback, not as the primary trigger
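
The event-driven direction above could be sketched roughly as follows. This is a minimal illustration, not the daemon's actual notification path: `DraftEventBus`, `DraftEvent`, `EventKind`, and the method names are all hypothetical, and the sketch assumes the pipeline stages call into the bus each time they evaluate a draft. The bus only emits when a verdict or stage actually differs from the last value it saw.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Optional, Tuple


class EventKind(Enum):
    DRAFT_CREATED = "draft_created"
    VERDICT_CHANGED = "verdict_changed"
    STAGE_CHANGED = "stage_changed"


@dataclass(frozen=True)
class DraftEvent:
    kind: EventKind
    skill_name: str
    old_value: Optional[str] = None
    new_value: Optional[str] = None


class DraftEventBus:
    """Hypothetical in-process bus; the real CLI notification path may differ."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[DraftEvent], None]] = []
        # Last value seen per (event kind, skill name), used to detect changes.
        self._last: Dict[Tuple[EventKind, str], str] = {}

    def subscribe(self, handler: Callable[[DraftEvent], None]) -> None:
        self._handlers.append(handler)

    def _emit(self, event: DraftEvent) -> None:
        for handler in self._handlers:
            handler(event)

    def _emit_if_changed(self, kind: EventKind, skill_name: str, value: str) -> None:
        key = (kind, skill_name)
        old = self._last.get(key)
        if old == value:
            return  # unchanged: stay silent
        self._last[key] = value
        self._emit(DraftEvent(kind, skill_name, old, value))

    # Assumed call sites: draft generation, validation, and release evaluation.
    def draft_created(self, skill_name: str) -> None:
        self._emit(DraftEvent(EventKind.DRAFT_CREATED, skill_name))

    def verdict_evaluated(self, skill_name: str, verdict: str) -> None:
        self._emit_if_changed(EventKind.VERDICT_CHANGED, skill_name, verdict)

    def stage_evaluated(self, skill_name: str, stage: str) -> None:
        self._emit_if_changed(EventKind.STAGE_CHANGED, skill_name, stage)
```

A subscriber such as `bus.subscribe(events.append)` would then receive one event per real state change, so re-evaluating a draft that stays at `HOLD` produces no repeat notification.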

Add CLI support for draft review, for example:

  • aceclaw skills drafts
  • aceclaw skills drafts --pending
  • aceclaw skills approve <name>
  • aceclaw skills reject <name>
  • aceclaw skills inspect <name>

Minimum CLI output should show:

  • skill name
  • source candidate id
  • current validation verdict
  • current release stage
  • whether manual review is still required
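
As a rough illustration of that minimum output, the sketch below renders the five fields as an aligned text table. `DraftSummary` and `format_drafts` are hypothetical names, the column headers are placeholders, and using `-` for a draft with no release stage yet is an assumption, not something the current CLI defines.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class DraftSummary:
    name: str
    candidate_id: str
    verdict: str        # PASS, HOLD, or BLOCK
    stage: str          # SHADOW, CANARY, ACTIVE, or "-" (assumed placeholder)
    needs_review: bool


def format_drafts(rows: Iterable[DraftSummary], pending_only: bool = False) -> str:
    """Render drafts as a plain-text table; pending_only mimics a --pending filter."""
    headers = ("NAME", "CANDIDATE", "VERDICT", "STAGE", "REVIEW")
    selected = [r for r in rows if r.needs_review or not pending_only]
    table = [headers] + [
        (r.name, r.candidate_id, r.verdict, r.stage,
         "required" if r.needs_review else "done")
        for r in selected
    ]
    # Pad every column to the width of its longest cell.
    widths = [max(len(row[i]) for row in table) for i in range(len(headers))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in table
    )
```

Calling `format_drafts(rows, pending_only=True)` would back a listing like `aceclaw skills drafts --pending`, while the unfiltered call would back `aceclaw skills drafts`.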

Phase 2: Reuse the existing validation and release pipeline

Do not build a second validation workflow.

Instead:

  • keep SkillDraftGenerator as the draft creation entry point
  • keep ValidationGateEngine as the validation source of truth
  • keep AutoReleaseController as the release lifecycle source of truth
  • add better inspection and control around those existing results

If replay-based validation is added later, model it as an additional validation gate pack or metric input to the current validation pipeline.
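
One way to picture "another gate input" is shown below. It is a sketch under stated assumptions, not a description of how ValidationGateEngine works today: the gate functions, the metric names (`lint_errors`, `replay_success_rate`), the 0.9 threshold, and the "most severe verdict wins" rule are all invented for illustration. The point is that a replay gate joins the same list of gates and feeds the same single verdict.

```python
from enum import IntEnum
from typing import Callable, Dict, Iterable


class Verdict(IntEnum):
    # Ordered by severity so that max() picks the strictest verdict.
    PASS = 0
    HOLD = 1
    BLOCK = 2


Metrics = Dict[str, float]
Gate = Callable[[Metrics], Verdict]


def static_checks_gate(metrics: Metrics) -> Verdict:
    """Hypothetical existing gate: block on any lint error."""
    return Verdict.PASS if metrics.get("lint_errors", 0) == 0 else Verdict.BLOCK


def replay_gate(metrics: Metrics) -> Verdict:
    """Hypothetical replay gate: hold until replay data exists."""
    rate = metrics.get("replay_success_rate")
    if rate is None:
        return Verdict.HOLD
    return Verdict.PASS if rate >= 0.9 else Verdict.BLOCK


def evaluate(gates: Iterable[Gate], metrics: Metrics) -> Verdict:
    """Combine all gates into one verdict; with no gates, hold (assumed default)."""
    return max((gate(metrics) for gate in gates), default=Verdict.HOLD)
```

Adding replay validation then means appending `replay_gate` to the gate list passed to `evaluate`, rather than running a second pipeline that produces its own verdict.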

Phase 3: Smarter reminders and re-review flows

Once the basic notification path exists, add reminder behavior such as:

  • remind on startup if unreviewed drafts exist
  • remind periodically only if drafts remain pending
  • avoid duplicate notifications for unchanged drafts

This is the right place to use heartbeat-style scanning if needed.
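
The reminder rules above can be sketched as a small tracker that a startup check or heartbeat-style scan would call. `ReminderTracker` and its `due` method are hypothetical, and the sketch assumes each pending draft can be summarized by a (verdict, stage) pair. A draft is reported when it is new, when its state changed, or when the reminder interval has elapsed since it was last reported.

```python
from typing import Dict, List, Tuple

Fingerprint = Tuple[str, str]  # (verdict, stage), an assumed summary of draft state


class ReminderTracker:
    def __init__(self, interval_seconds: float) -> None:
        self._interval = interval_seconds
        # Per draft: the state last reported and when it was reported.
        self._last: Dict[str, Tuple[Fingerprint, float]] = {}

    def due(self, pending: Dict[str, Fingerprint], now: float) -> List[str]:
        """Return names of pending drafts the user should be reminded about."""
        result: List[str] = []
        for name, fingerprint in sorted(pending.items()):
            previous = self._last.get(name)
            changed = previous is None or previous[0] != fingerprint
            stale = previous is not None and now - previous[1] >= self._interval
            if changed or stale:
                self._last[name] = (fingerprint, now)
                result.append(name)
        # Forget drafts that are no longer pending so they cannot trigger reminders.
        for name in list(self._last):
            if name not in pending:
                del self._last[name]
        return result
```

Passing the clock in as `now` keeps the sketch easy to test; a scan that runs more often than the interval returns an empty list for unchanged drafts, which is the no-spam behavior the acceptance criteria ask for.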

Phase 4: Extend release lifecycle with stronger demotion rules

Future work can extend the existing release controller with stronger demotion logic based on rolling health signals.

But this should stay inside the current release lifecycle and audit trail rather than become a separate subsystem.
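
A demotion rule of this kind might look like the sketch below. Everything here is an assumption for illustration: `RollingHealth`, `next_stage`, the fixed-size window, the 0.2 failure threshold, and the choice to step back exactly one stage are not taken from AutoReleaseController. The sketch only shows demotion expressed as a transition within the same SHADOW -> CANARY -> ACTIVE ordering.

```python
from collections import deque
from typing import Deque, Optional

STAGES = ["SHADOW", "CANARY", "ACTIVE"]  # lifecycle order from the existing pipeline


class RollingHealth:
    """Tracks the most recent outcomes of a skill in a fixed-size window."""

    def __init__(self, window: int) -> None:
        self._window = window
        self._outcomes: Deque[bool] = deque(maxlen=window)

    def record(self, success: bool) -> None:
        self._outcomes.append(success)

    def failure_rate(self) -> Optional[float]:
        # Report nothing until the window is full, to avoid reacting to tiny samples.
        if len(self._outcomes) < self._window:
            return None
        return self._outcomes.count(False) / len(self._outcomes)


def next_stage(current: str, failure_rate: Optional[float],
               max_failure_rate: float = 0.2) -> str:
    """Step back one stage when the rolling failure rate exceeds the threshold."""
    if failure_rate is None or failure_rate <= max_failure_rate:
        return current
    index = STAGES.index(current)
    return STAGES[max(index - 1, 0)]
```

Because the result is just another stage in the same list, a demotion would flow through the same stage-change notifications and audit trail as a promotion.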

Acceptance Criteria

  1. When a new draft is created, the user sees a clear notification in the CLI.
  2. When a validation verdict changes, the user can see that change without reading audit files by hand.
  3. The CLI can list pending drafts and show which ones still require manual review.
  4. Draft review actions do not bypass the existing validation and release pipeline.
  5. Reminder behavior does not spam the user with duplicate notifications for unchanged drafts.
  6. Future replay-based validation, if added, plugs into the existing validation path instead of creating a second source of truth.

Product-Grade Closed Loop

The longer-term goal is not just "drafts exist" or even "drafts get validated". The real product goal is a full closed loop:

run -> learn -> propose -> validate -> release -> observe -> rollback/demote -> learn again

AceClaw already has parts of the middle of this loop. What is missing is the user-facing control loop and the post-release operating loop.

A product-grade version of this feature should provide:

  1. A single skill lifecycle source of truth.
    Draft, validation, release, rollback, and rejection states should live in one consistent lifecycle model.

  2. User-visible state changes.
    Users should not need to read audit files by hand to discover new drafts, blocked drafts, or stage changes.

  3. Human review and control.
    Users should be able to inspect, approve, reject, pause, and roll back from the CLI without bypassing the existing pipeline.

  4. Strong validation semantics.
    The system should make it clear why a draft passed, was held, or was blocked (PASS, HOLD, BLOCK). Future replay-based validation should plug into the existing validation path.

  5. Post-release observation.
    Active skills should be monitored through clear health signals such as success rate, failure rate, timeout rate, and permission-block rate.

  6. Automatic recovery.
    If a released skill regresses, the system should be able to roll it back or demote it and feed that signal back into learning.

This issue focuses on the first step toward that full loop: making draft and release state visible and reviewable without creating a second validation pipeline.

Metadata

Labels: enhancement (New feature or request)
