Monitor MCP capability inference false positives (1 week post-merge of #495) #498

Description

@xinhuagu

Background

PR #495's MCP capability inference (McpCapabilityInference) is pattern-based — verb regexes against method names, field-name lookups for path-typed args. Conservative by design: if patterns match → emit FileWrite/FileDelete/FileMove so the structural-denial layer fires; if not → fall back to McpInvoke and the standard prompt.

The known limitation: a method named write_log(path=...) would be classified as FileWrite even if it's a logging API (false positive — user sees a "write to" prompt when they expected the MCP prompt).
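
To make the failure mode concrete, here is a minimal Python sketch of pattern-based inference. It is not the actual McpCapabilityInference code: the verb regexes, the `PATH_FIELDS` set, and the `infer_capability` helper are all hypothetical, and it assumes a file capability is emitted only when both a verb pattern and a path-like field match.

```python
import re

# Hypothetical verb patterns, checked in order. The real rules in
# McpCapabilityInference may differ.
VERB_PATTERNS = [
    ("FileDelete", re.compile(r"^(delete|remove|unlink)", re.IGNORECASE)),
    ("FileMove", re.compile(r"^(move|rename)", re.IGNORECASE)),
    ("FileWrite", re.compile(r"^(write|create|save|append)", re.IGNORECASE)),
]

# Hypothetical set of argument names treated as path-typed.
PATH_FIELDS = {"path", "file_path", "filepath", "filename"}


def infer_capability(method: str, args: dict) -> str:
    """Return a capability name for an MCP method call.

    Assumption: a file capability needs both a verb match on the method
    name and at least one path-like argument; otherwise fall back to
    McpInvoke and the standard prompt.
    """
    has_path_arg = any(name in PATH_FIELDS for name in args)
    if has_path_arg:
        for capability, pattern in VERB_PATTERNS:
            if pattern.search(method):
                return capability
    return "McpInvoke"
```

Under these assumed rules, `infer_capability("write_log", {"path": "/var/log/app.log"})` returns `"FileWrite"`, because the method name and the argument name both match even though the method is a logging API.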

In practice this should be rare — most MCP filesystem-style servers follow the snake/camel conventions the inference is tuned for, and non-filesystem methods rarely use the path field name for non-filesystem identifiers. But we don't know how rare until we ship and watch.

Plan

1 week post-merge of #495:

  • Grep production audit logs (~/.aceclaw/audit/*.jsonl) for @type=FileWrite/FileDelete/FileMove entries where toolName starts with mcp__.
  • Cross-reference with the actual MCP method's purpose (from each server's tool listing).
  • Tally false positives: classifications that prompted/denied something the user knows is NOT a filesystem op.
  • If the false-positive rate is meaningful (>5% of MCP-classified calls), raise the priority of #496 (schema-aware MCP capability inference, intended to reduce regex false positives).
  • If a specific MCP server is the source of repeated false positives, document a workaround (rename the conflicting arg field, or add the method name to a denylist) until #496 lands.
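
The grep-and-tally steps above could be scripted roughly as follows. This is a sketch under assumptions, not a confirmed description of the audit log format: it assumes each JSONL line is an object with top-level `@type` and `toolName` keys, and it takes the 5% threshold's denominator to be MCP calls classified as file operations. The helper names (`read_audit_lines`, `tally_mcp_file_ops`, `false_positive_rate`) are hypothetical.

```python
import glob
import json
import os
from collections import Counter

FILE_CAPS = {"FileWrite", "FileDelete", "FileMove"}


def read_audit_lines(pattern="~/.aceclaw/audit/*.jsonl"):
    """Yield raw lines from every audit log file matching the pattern."""
    for path in sorted(glob.glob(os.path.expanduser(pattern))):
        with open(path, encoding="utf-8") as fh:
            yield from fh


def tally_mcp_file_ops(lines):
    """Count (toolName, @type) pairs for MCP calls classified as file ops.

    Assumption: '@type' and 'toolName' are top-level keys of each entry.
    Blank or unparseable lines are skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict):
            continue
        tool = entry.get("toolName", "")
        if (
            entry.get("@type") in FILE_CAPS
            and isinstance(tool, str)
            and tool.startswith("mcp__")
        ):
            counts[(tool, entry["@type"])] += 1
    return counts


def false_positive_rate(counts, false_positive_tools):
    """Fraction of tallied calls whose tool was manually judged a false positive."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    flagged = sum(n for (tool, _), n in counts.items() if tool in false_positive_tools)
    return flagged / total
```

A typical run would call `tally_mcp_file_ops(read_audit_lines())`, review the resulting tool names against each server's tool listing by hand, then pass the set of tools judged non-filesystem to `false_positive_rate` and compare the result with 0.05. The cross-referencing step stays manual because the logs alone cannot say what a method is for.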

Acceptance

Context

Follow-up to PR #495 — "Known limitations" section flagged the heuristic as needing real-world validation.
