| name | financial-model-review |
|---|---|
| description | Reviews startup financial models for investor readiness — validates unit economics, stress-tests runway scenarios, and benchmarks metrics against stage-appropriate targets. Accepts Excel, CSV, or text. |
| when_to_use | Use ONLY when the user has provided a financial model file (Excel/CSV) or a structured numerical model in pasted form, AND has asked for review, validation, runway analysis, or unit-economics scoring. Do not auto-invoke on general questions about financial models or fundraising metrics. |
| user-invocable | true |
Help startup founders understand how investors will evaluate their financial model — validating structure, unit economics, runway, and metrics against stage-appropriate standards. Produce a thorough review with actionable improvements. The tone is founder-first: a rigorous but supportive coaching session.
- Author: lool-ventures
- Version: managed in founder-skills/.claude-plugin/plugin.json
- Compatibility: Python 3.10+ and uv for script execution. openpyxl required for Excel parsing.
- Imports (optional):
  - market-sizing:sizing.json, to validate revenue-to-SOM consistency
  - deck-review:checklist.json, to cross-check model-to-deck number alignment
- Exports:
  - report.json → ic-sim, fundraise-readiness, dd-readiness
  - unit_economics.json → metrics-benchmarker, ic-sim
  - runway.json → fundraise-readiness
This skill runs inline in the main thread (not as a sub-agent). The main thread has full tool access including Bash, and is responsible for orchestrating the full pipeline: running producer scripts, persisting artifacts, and dispatching the financial-model-review sub-agent at specific moments.
Two dispatch contexts for the sub-agent:
- Context A, per-step analytical dispatch (Mitigation 1): The INPUTS_REVIEW and CHECKLIST steps dispatch the financial-model-review agent via the Task tool. The agent does deep analysis and returns structured JSON. The main thread captures the JSON and pipes it through the producer script. The sub-agent does NOT write artifacts directly. (Unit economics and runway are NOT dispatched: those producers consume inputs.json verbatim, so the main thread pipes the file directly.)
- Context B, post-compose coaching dispatch: The final step dispatches the sub-agent after compose_report.py writes report.md. The sub-agent consumes the coaching_payload inlined in its dispatch prompt (it does NOT read the full report.md), inserts ## Coaching Commentary via the uuid insertion marker, Grep-verifies all canonical artifacts on disk, and returns a structured success payload.
Why this model: In Cowork, sub-agents have a restricted tool allowlist (no Bash). By keeping orchestration in the main thread and dispatching sub-agents only for analytical or post-compose tasks that use only Read/Edit/Glob/Grep, the pipeline works correctly in both Claude Code (CLI) and Cowork.
Tolerant JSON extraction protocol (Context A): After dispatching the sub-agent, capture its final assistant message. The sub-agent should return raw JSON, but may wrap it in ```json ... ``` fences or add a prose preamble. Extract JSON tolerantly:
- If the message is wrapped in a ```json ... ``` (or plain ``` ... ```) fence, strip the fence first.
- Try to parse the stripped text directly as JSON.
- If that fails, walk through the text looking for the first { character and try json.JSONDecoder().raw_decode(text[i:]). This is brace-aware and handles nested objects correctly (unlike regex, which truncates on the first }).
- If extraction fails entirely, re-prompt the sub-agent with: "Your previous reply could not be parsed as JSON. Return ONLY the JSON object: no markdown fences, no prose preamble."
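The protocol above can be sketched in Python as follows. This is a minimal illustration, not the pipeline's own code: the function name `extract_json` and the fence regex are assumptions, and the sketch tries every `{` in turn rather than only the first one, which is a slightly more forgiving variant.

```python
import json
import re

# Matches a reply that is entirely one ```json ... ``` or plain ``` ... ``` fence.
_FENCE = re.compile(r"^\s*```(?:json)?\s*\n(.*?)\n?\s*```\s*$", re.DOTALL)


def extract_json(message: str):
    """Return the first JSON value found in a sub-agent reply, or None.

    Hypothetical helper: None means the caller should re-prompt the sub-agent.
    """
    text = message.strip()
    fenced = _FENCE.match(text)
    if fenced:
        text = fenced.group(1).strip()  # strip the fence first

    # Try the stripped text directly.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # Fall back to a brace-aware scan: raw_decode parses one complete
    # JSON value starting at the given offset and ignores trailing prose.
    decoder = json.JSONDecoder()
    for i, ch in enumerate(text):
        if ch != "{":
            continue
        try:
            obj, _end = decoder.raw_decode(text[i:])
            return obj
        except json.JSONDecodeError:
            continue
    return None
```

A reply such as `Here you go: {"a": {"b": 1}} Hope that helps.` falls through to the scan, which returns the nested object intact.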
Context-pressure note: This skill has the highest context budget of the 5 skills. The win from Mitigation 1 is keeping sub-agent reasoning and the 40-60 KB raw extract_model.py output out of the main thread: the raw output flows through the INPUTS_REVIEW dispatch, where the sub-agent reads it in its own context window and returns only the corrected inputs.json. The artifacts themselves still accumulate in the main thread (~80-130K total), but that is manageable.
See founder-skills/references/skill-execution-model.md for the full inline-skill execution model (3 dispatch contexts, Mitigation 1+2, producer contract, Cowork quirks, per-symptom triage).
Accept any format: Excel (.xlsx), CSV, Google Sheets exports, financial documents, or conversational input. For Excel files, use extract_model.py to parse. For other formats, extract data manually into the inputs.json schema. If multiple copies of the same file exist (e.g., Financials.xlsx and Financials (1).xlsx), use the most recently modified version and note the duplication to the founder. If timestamps are identical, ask the founder which file to use. If the founder cannot be queried, prefer the file without parenthetical suffixes (e.g., (1), (2)) — these typically indicate browser re-download duplicates.
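The duplicate-file rule above can be sketched as a small helper. The function name `pick_model_file` and the `can_ask_founder` flag are hypothetical; returning `None` stands for "ask the founder which file to use".

```python
import re
from pathlib import Path
from typing import Optional

# Browser re-download suffix such as "Financials (1).xlsx".
_DUP_SUFFIX = re.compile(r"\s\(\d+\)$")


def pick_model_file(candidates: list[Path], can_ask_founder: bool) -> Optional[Path]:
    """Choose among copies of the same model file (illustrative only)."""
    newest_mtime = max(p.stat().st_mtime for p in candidates)
    newest = [p for p in candidates if p.stat().st_mtime == newest_mtime]
    if len(newest) == 1:
        return newest[0]  # most recently modified copy wins
    if can_ask_founder:
        return None  # identical timestamps: the founder decides
    # Founder unavailable: prefer the name without a "(1)"-style suffix.
    plain = [p for p in newest if not _DUP_SUFFIX.search(p.stem)]
    return plain[0] if plain else newest[0]
```

Whichever copy is chosen, the duplication should still be mentioned to the founder.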
All scripts are at ${CLAUDE_PLUGIN_ROOT}/skills/financial-model-review/scripts/:
- extract_model.py: Extracts structured data from Excel (.xlsx) and CSV files
- validate_extraction.py: Anti-hallucination gate. Cross-references model_data.json against inputs.json to catch mismatches (company name, salary, revenue, cash traceability); run after extraction, before review
- validate_inputs.py: Four-layer validation of inputs.json (structural, consistency, sanity, completeness); supports --fix to auto-correct sign errors
- checklist.py: Scores 46 criteria across 7 categories with profile-based auto-gating
- unit_economics.py: Computes and benchmarks 11 unit economics metrics
- runway.py: Multi-scenario runway stress-test with decision points
- compose_report.py: Assembles report with cross-artifact validation; --strict exits 1 on high-severity warnings (corrupt/missing artifacts)
- apply_corrections.py: Processes the founder's downloaded corrections file (coerces types, normalizes ILS→USD, merges overrides, writes corrected_inputs.json and extraction_corrections.json)
- verify_review.py: Review completeness gate. Checks artifact existence, content quality, and cross-artifact consistency; --gate 1 for after-compose, --gate 2 (default) for final; exit 0 = publishable, exit 1 = gaps remain
- visualize.py: Generates self-contained HTML with SVG charts (not JSON)
- explore.py: Generates self-contained interactive HTML explorer from review artifacts; outputs HTML (not JSON)
- review_inputs.py: Dual-mode review viewer. HTTP server with live validation (Claude Code) or self-contained static HTML with JS sanity metrics (Cowork); outputs HTML
Also available from ${CLAUDE_PLUGIN_ROOT}/scripts/ (shared):
- find_artifact.py: Resolves artifact paths by skill name and filename (used for cross-skill lookups)
Run with: python3 ${CLAUDE_PLUGIN_ROOT}/skills/financial-model-review/scripts/<script>.py --pretty [args]
Read as needed from ${CLAUDE_PLUGIN_ROOT}/skills/financial-model-review/references/:
- checklist-criteria.md: All 46 checklist criteria with gate definitions
- schema-inputs.md: JSON schema for inputs.json (the artifact the agent writes)
- artifact-schemas.md: JSON schemas for script-produced output artifacts
- data-sufficiency.md: Data sufficiency gate and qualitative path
- extraction-pitfalls.md: 8 common extraction errors (scale denomination, payroll aggregation, collections vs revenue, etc.)
From ${CLAUDE_PLUGIN_ROOT}/references/ (shared): stage-expectations.md, benchmarks.md, israel-guidance.md, revenue-model-types.md, common-mistakes.md
Every review deposits structured JSON artifacts into a working directory. The final step assembles all artifacts into a report and validates consistency. This is not optional.
| Step | Artifact | Producer |
|---|---|---|
| 1 | founder context | founder_context.py read/init |
| 2 | model_data.json | extract_model.py (Excel/CSV in main thread) |
| 3 | inputs.json | Context A dispatch: INPUTS_REVIEW → apply_corrections.py |
| 3.5 | corrected_inputs.json | apply_corrections.py (from INPUTS_REVIEW dispatch) |
| 3.6 | extraction_validation.json | validate_extraction.py (when model_data.json exists) |
| 4 | checklist.json | Context A dispatch: CHECKLIST → checklist.py |
| 5 | unit_economics.json | direct pipe: inputs.json → unit_economics.py |
| 6 | runway.json | direct pipe: inputs.json → runway.py |
| 7 | Report | compose_report.py (writes both report.json and report.md) |
| 7.5 | commentary.json | agent-authored (main thread heredoc); required by Gate 2 for quantitative reviews |
| 8a | HTML report | visualize.py |
| 8b | Explorer | explore.py |
| 8c | Coaching | Context B dispatch: POST_COMPOSE_COACHING |
Rules:
- Deposit each artifact before proceeding to the next step
- For agent-written artifacts (inputs.json), consult references/schema-inputs.md for the JSON schema
- If a step is not applicable, deposit a stub: {"skipped": true, "reason": "..."}
- Do NOT use isolation: "worktree" for sub-agents; files written in a worktree won't appear in the main $REVIEW_DIR
Keep the founder informed with brief, plain-language updates at each step. Never mention file names, scripts, or JSON. After each analytical step (3–6), share a one-sentence finding before moving on.
Every Bash tool call runs in a fresh shell — variables do not persist. Prefix every Bash call that uses these paths with the variable block below, or substitute absolute paths directly:
SCRIPTS="${CLAUDE_PLUGIN_ROOT}/skills/financial-model-review/scripts"
REFS="${CLAUDE_PLUGIN_ROOT}/skills/financial-model-review/references"
SHARED_SCRIPTS="${CLAUDE_PLUGIN_ROOT}/scripts"
SHARED_REFS="${CLAUDE_PLUGIN_ROOT}/references"
if ls "$(pwd)"/mnt/*/ >/dev/null 2>&1; then
ARTIFACTS_ROOT="$(ls -d "$(pwd)"/mnt/*/ | head -1)artifacts"
elif ls "$(pwd)"/sessions/*/mnt/*/ >/dev/null 2>&1; then
ARTIFACTS_ROOT="$(ls -d "$(pwd)"/sessions/*/mnt/*/ | head -1)artifacts"
else
ARTIFACTS_ROOT="./artifacts"
fi

If CLAUDE_PLUGIN_ROOT is empty OR the path it resolves to does not exist in your environment (in Claude Cowork it substitutes to a host-side path that is not present inside the session VM; test with ls), fall back: run Glob with pattern **/skills/financial-model-review/scripts/checklist.py, strip the filename to get SCRIPTS, then derive REFS and SHARED_SCRIPTS. In Claude Cowork this is always the case: don't retry the substituted path; go straight to the Glob fallback. If Glob returns multiple matches, prefer the one under a plugin mount (.remote-plugins/ or the plugins cache) over any workspace copy. If Glob returns nothing, locate it with Bash: find / -path '*/skills/financial-model-review/scripts/checklist.py' 2>/dev/null | head -5.
If ARTIFACTS_ROOT resolves to ./artifacts but no artifacts/ directory exists at $(pwd): The workspace may not be mounted yet. Use Glob with pattern **/artifacts/founder_context.json to locate existing artifacts, and derive ARTIFACTS_ROOT from the result. If nothing is found, mkdir -p ./artifacts and proceed.
After Step 1 (when the slug is known):
REVIEW_DIR="$ARTIFACTS_ROOT/financial-model-review-${SLUG}"
mkdir -p "$REVIEW_DIR"
mkdir -p "$REVIEW_DIR/.staging" # for ad-hoc sub-agent JSON staging
RUN_ID="$(date -u +%Y%m%dT%H%M%SZ)"

Pass RUN_ID to all sub-agents. The four producer artifacts (inputs.json, checklist.json, unit_economics.json, runway.json) must carry "metadata": {"run_id": "$RUN_ID"} at the top level. The producers propagate it from their stdin payloads; never hand-edit script outputs to add it. (model_data.json and extraction_validation.json have no run_id by design.) compose_report.py checks that all four run IDs match; a mismatch triggers a STALE_ARTIFACT high-severity warning, which blocks under --strict.
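As a rough picture of what the run-ID parity rule means, the sketch below compares the top-level `metadata.run_id` of the four producer artifacts against the current run. It is not how compose_report.py is implemented; the helper name `check_run_ids` is hypothetical, and the treatment of skipped stubs is left out.

```python
import json
from pathlib import Path

# The four producer artifacts that must carry metadata.run_id.
PRODUCER_ARTIFACTS = ("inputs.json", "checklist.json", "unit_economics.json", "runway.json")


def check_run_ids(review_dir: str, expected_run_id: str) -> list[str]:
    """Return a list of problems; an empty list means all four run IDs match."""
    problems = []
    for name in PRODUCER_ARTIFACTS:
        path = Path(review_dir) / name
        if not path.exists():
            problems.append(f"{name}: missing")
            continue
        data = json.loads(path.read_text())
        run_id = (data.get("metadata") or {}).get("run_id")
        if run_id != expected_run_id:
            # A leftover artifact from an earlier run would surface here.
            problems.append(f"{name}: run_id {run_id!r} != {expected_run_id!r}")
    return problems
```

A non-empty result corresponds to the situation the text describes as a stale artifact: re-run the affected producer rather than editing its output by hand.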
If REVIEW_DIR already contains artifacts from a previous run, remove them before starting:
rm -f "$REVIEW_DIR"/{inputs,checklist,unit_economics,runway,report,model_data,extraction_validation,corrected_inputs,extraction_corrections,corrections_from_agent,commentary}.json \
"$REVIEW_DIR/report.html" "$REVIEW_DIR/explore.html" "$REVIEW_DIR/review.html" "$REVIEW_DIR/report.md"
In Cowork, file deletion may require explicit permission. If cleanup fails with "Operation not permitted", request delete permission and retry before proceeding.
python3 "$SHARED_SCRIPTS/founder_context.py" read --artifacts-root "$ARTIFACTS_ROOT" --pretty

Three cases based on exit code:
Exit 0 (found, single context): Use the company slug and pre-filled fields. Before proceeding to extraction, use AskUserQuestion to ask the founder for current cash balance and date if not already stated in the conversation — this is the #1 cause of incomplete runway analysis. If files are attached, also ask about monthly burn rate unless the conversation already contains it. Batch all questions into a single AskUserQuestion call.
Exit 1 (not found): Use AskUserQuestion (NOT plain chat) to ask the founder for company details AND key financial context. You MUST use the AskUserQuestion tool — do not just list questions in the chat. Gather everything in a single call (one interaction = one chance for the UI to render correctly):
- Company name, stage, sector, geography (required for context creation)
- Current cash balance and date (critical for runway — the #1 cause of incomplete reports)
- Monthly burn rate if not obvious from the provided files
IMPORTANT: Always use the AskUserQuestion tool for founder questions — never ask as plain chat text. The tool provides a structured UI that renders correctly in Cowork. Always provide at least 2 options (the tool requires a minimum of 2). Valid --stage values: pre-seed, seed, series-a, series-b, later (hyphenated, not underscored).
Why everything upfront: Extraction sub-agents run in parallel and cannot pause to ask questions. Asking early prevents pipeline stalls.
If the founder provides files (Excel/CSV), still ask about cash balance — extraction may miss or misinterpret values, and having the founder's stated number lets the agent cross-check later.
Then create:
python3 "$SHARED_SCRIPTS/founder_context.py" init \
--company-name "Acme Corp" --stage seed --sector "B2B SaaS" \
  --geography "US" --artifacts-root "$ARTIFACTS_ROOT"

If the script prints a sector_type warning but exits 0, that's non-fatal; proceed without retrying. However, a null sector_type may suppress sector-specific checklist gating downstream. If you know the correct type, re-run with --sector-type (valid values: saas, ai-native, marketplace, hardware, hardware-subscription, consumer-subscription, usage-based, transactional-fintech).
Exit 2 (multiple context files): Present the list to the founder, ask which company, then re-read with --slug.
When Excel (.xlsx) or CSV files are provided, run extract_model.py directly in the main thread:
python3 "$SCRIPTS/extract_model.py" --file <path> --pretty -o "$REVIEW_DIR/model_data.json"

Check the periodicity_summary and per-sheet periodicity fields. If periodicity is quarterly or annual, all flow metrics (burn, revenue, expenses; anything measured per period) must be divided by 3 or 12 respectively in the next step. Do NOT convert stock metrics (cash balance, headcount, customer count, ARR; these are point-in-time snapshots). If periodicity is unknown, flag it.
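The flow-versus-stock rule can be illustrated with a short sketch. The field names in `FLOW_FIELDS` are placeholders chosen for the example and are not taken from schema-inputs.md; the function name `to_monthly` is likewise hypothetical.

```python
# Divisors for converting per-period flows to monthly values.
PERIOD_DIVISOR = {"monthly": 1, "quarterly": 3, "annual": 12}

# Hypothetical flow-metric keys (measured per period). Everything else,
# e.g. cash balance, headcount, customer count, ARR, is treated as a
# point-in-time stock and passed through unchanged.
FLOW_FIELDS = {"burn", "revenue", "expenses"}


def to_monthly(values: dict, periodicity: str) -> dict:
    """Normalize flow metrics to monthly; leave stock metrics untouched."""
    if periodicity not in PERIOD_DIVISOR:
        # Unknown periodicity is flagged, never guessed.
        raise ValueError(f"unknown periodicity {periodicity!r}: flag it")
    divisor = PERIOD_DIVISOR[periodicity]
    return {
        key: (value / divisor if key in FLOW_FIELDS else value)
        for key, value in values.items()
    }
```

For example, a quarterly burn of 300,000 becomes 100,000 per month, while a cash balance of 1,500,000 in the same sheet stays as it is.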
When documents (PDFs, data room dumps, Google Sheets exports) are provided: Extract what you can directly from the documents, consulting $REFS/schema-inputs.md for the schema and $REFS/data-sufficiency.md for sufficiency assessment. Write a provisional inputs.json.
When conversational input is provided (no files): Gather all needed fields within Step 1 through normal conversation. Consult references/schema-inputs.md for the full schema.
When a sub-agent returns JSON too large for a bash heredoc, write it to $REVIEW_DIR/.staging/<step>_input.json first, then pipe via:
cat "$REVIEW_DIR/.staging/<step>_input.json" | python3 "$SCRIPTS/<producer>.py" ...

The .staging/ directory is created at setup and removed at cleanup. This avoids Operation not permitted errors that occur when writing to the session outputs mount (Cowork marks it read-only post-write).
Dispatch the financial-model-review sub-agent in Context A (INPUTS_REVIEW). Dispatch via the Task tool. This is the highest context-pressure dispatch — the sub-agent reads the full model_data.json (40-60 KB) inside its own context window and returns only the corrected inputs.json. This is the primary Mitigation 1 win: the raw extraction output never accumulates in the main thread context.
Dispatch prompt template:
CONTEXT: INPUTS_REVIEW
REVIEW_DIR: <absolute path to REVIEW_DIR>
RUN_ID: <RUN_ID>
You are the financial-model-review agent dispatched in Context A (INPUTS_REVIEW).
Read model_data.json at <REVIEW_DIR>/model_data.json (the full extraction output).
Also read:
- ${CLAUDE_PLUGIN_ROOT}/skills/financial-model-review/references/schema-inputs.md
- ${CLAUDE_PLUGIN_ROOT}/skills/financial-model-review/references/extraction-pitfalls.md
- ${CLAUDE_PLUGIN_ROOT}/skills/financial-model-review/references/data-sufficiency.md
Construct a complete, valid inputs.json from the extracted data. Apply all
extraction pitfall checks (scale denomination, ARPU sanity, periodicity
conversion, company name sourcing, payroll aggregation, collections vs revenue).
ARPU sanity check: if drivers.arpu_monthly or unit_economics.ltv.inputs.arpu_monthly
exceeds total MRR, it is probably aggregate revenue, not per-customer ARPU —
divide by customer count.
Return JSON only. Shape (do NOT include a "changes" or "base_hash" key — those
belong to the founder browser round-trip, not this dispatch):
{
"corrected": {<full validated inputs.json contents per schema-inputs.md,
including "metadata": {"run_id": "<RUN_ID>"}>},
"corrections": [
{"path": "cash.current_balance", "old": null, "new": 1500000,
"reason": "<where the value came from / what was fixed>"}
]
}
The "corrections" array is the audit trail written to extraction_corrections.json.
After the sub-agent returns: apply the tolerant JSON extraction protocol (see "Skill Execution Model" preamble) to obtain the structured JSON.
INPUTS_REVIEW special handling — file-args-based script: Unlike other dispatch points, apply_corrections.py takes file arguments, not stdin. The main thread must:
1. Write the sub-agent's returned JSON to a temp file:
   cat <<'CORRECTIONS_EOF' > "$REVIEW_DIR/corrections_from_agent.json"
   <JSON extracted from sub-agent reply>
   CORRECTIONS_EOF
2. If inputs.json does not yet exist, write an empty inputs stub first:
   echo '{}' > "$REVIEW_DIR/inputs.json"
3. Run apply_corrections.py with file arguments:
   python3 "$SCRIPTS/apply_corrections.py" "$REVIEW_DIR/corrections_from_agent.json" \
     --original "$REVIEW_DIR/inputs.json" \
     --output-dir "$REVIEW_DIR"
4. apply_corrections.py prints a Warning: legacy payload format line to stderr for corrected-shaped payloads; that is expected, not an error. Read the stdout JSON:
   - If status == "completed": promote corrected_inputs.json to inputs.json:
     mv "$REVIEW_DIR/corrected_inputs.json" "$REVIEW_DIR/inputs.json"
   - If status == "error" (coercion or time-series validation failed): read the errors array, fix the offending fields in the sub-agent's corrected object yourself, re-write corrections_from_agent.json, and re-run step 3. Only as a last resort write inputs.json directly from corrected; Step 3.5's validate_inputs gate must then catch what coercion would have.
Run the validation script:
cat "$REVIEW_DIR/inputs.json" | python3 "$SCRIPTS/validate_inputs.py" --pretty

If valid == false (errors present), run with --fix to auto-correct fixable issues:
python3 "$SCRIPTS/validate_inputs.py" --fix < "$REVIEW_DIR/inputs.json" > "$REVIEW_DIR/inputs_fixed.json" && mv "$REVIEW_DIR/inputs_fixed.json" "$REVIEW_DIR/inputs.json"

Then re-validate. If errors persist after --fix, correct inputs.json manually.
Also run the extraction validation script to cross-reference model_data.json against inputs.json (if model_data.json exists):
python3 "$SCRIPTS/validate_extraction.py" --inputs "$REVIEW_DIR/inputs.json" --model-data "$REVIEW_DIR/model_data.json" --fix --pretty -o "$REVIEW_DIR/extraction_validation.json"

Do NOT proceed to Step 4 until valid == true and has_critical_warnings == false.
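The gate condition can be written as a small predicate. This sketch assumes that `valid` comes from the validate_inputs.py output and `has_critical_warnings` from extraction_validation.json; that split, and the function name `may_proceed_to_step_4`, are assumptions rather than documented behavior.

```python
from typing import Optional


def may_proceed_to_step_4(validation: dict, extraction_validation: Optional[dict]) -> bool:
    """True only when inputs are valid and extraction raised no critical warnings."""
    if validation.get("valid") is not True:
        return False
    # No model_data.json (or a skipped stub) means no cross-check was run.
    if extraction_validation is None or extraction_validation.get("skipped"):
        return True
    # Fail closed: a missing flag is treated the same as a critical warning.
    return extraction_validation.get("has_critical_warnings") is False
```

Failing closed on a missing flag is a deliberate choice in this sketch, so that a malformed validation file cannot silently pass the gate.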
Path A — File extraction (model_format is spreadsheet or partial):
Generate the HTML review page for the founder to inspect extracted values. In Cowork (VM, no display), use static mode:
python3 "$SCRIPTS/review_inputs.py" "$REVIEW_DIR/inputs.json" --static "$REVIEW_DIR/review.html" --extraction-warnings "$REVIEW_DIR/extraction_validation.json"

Tell the founder to open the review page, edit anything wrong, click Submit to download a corrections file, then upload it back. When they upload corrections.json:
python3 "$SCRIPTS/apply_corrections.py" <uploaded-file> --original "$REVIEW_DIR/inputs.json" --output-dir "$REVIEW_DIR"

In Claude Code (local terminal), use server mode:
python3 "$SCRIPTS/review_inputs.py" "$REVIEW_DIR/inputs.json" --workspace "$REVIEW_DIR" --extraction-warnings "$REVIEW_DIR/extraction_validation.json" &

Wait for the founder to say done, then kill the server and apply corrections.
Path B — Conversational (model_format is conversational or deck): present a confirmation table (stage, MRR, growth rate, burn, cash, customers, CAC, target raise) and use AskUserQuestion to confirm.
Dispatch prompt template:
CONTEXT: CHECKLIST
REVIEW_DIR: <absolute path to REVIEW_DIR>
RUN_ID: <RUN_ID>
You are the financial-model-review agent dispatched in Context A (CHECKLIST).
Read inputs.json at <REVIEW_DIR>/inputs.json.
Also read ${CLAUDE_PLUGIN_ROOT}/skills/financial-model-review/references/checklist-criteria.md.
Assess all 46 checklist items (STRUCT_01..09, UNIT_10..19, CASH_20..32,
METRIC_33..35, BRIDGE_36..38, SECTOR_39..44, OVERALL_45..46).
Profile-based auto-gating applies by stage/geography/sector/model_format.
Evidence is MANDATORY for every item: every fail and warn MUST have a non-empty
evidence string citing specific values from the model. Every pass MUST have
evidence noting what was checked.
Return JSON only — company + metadata + items (producer script computes summary):
{
"company": {<the company object copied verbatim from inputs.json — enables profile auto-gating>},
"metadata": {"run_id": "<RUN_ID>"},
"items": [{"id": "STRUCT_01", "status": "pass", "evidence": "...", "notes": null}, ...all 46 items...]
}
After the sub-agent returns: apply the tolerant JSON extraction protocol. Pipe through the producer script:
cat <<'CHECKLIST_EOF' | python3 "$SCRIPTS/checklist.py" --pretty -o "$REVIEW_DIR/checklist.json"
<JSON extracted from sub-agent reply>
CHECKLIST_EOF

The unit_economics.py and runway.py producers consume inputs.json verbatim. Run them directly from the on-disk file; do NOT round-trip the JSON through a sub-agent (an LLM re-typing multi-KB financial JSON risks silently corrupting numbers, and it saves no context since the JSON would land in the main thread anyway):
SCRIPTS="${CLAUDE_PLUGIN_ROOT}/skills/financial-model-review/scripts"
cat "$REVIEW_DIR/inputs.json" | python3 "$SCRIPTS/unit_economics.py" --pretty -o "$REVIEW_DIR/unit_economics.json"
cat "$REVIEW_DIR/inputs.json" | python3 "$SCRIPTS/runway.py" --pretty -o "$REVIEW_DIR/runway.json"

Both scripts propagate metadata.run_id from inputs.json into their outputs (required by the Context B run_id-parity check). All metric fields are optional: missing data yields not_rated or a partial-analysis stub, never a crash.
python3 "$SCRIPTS/compose_report.py" --dir "$REVIEW_DIR" --pretty \
-o "$REVIEW_DIR/report.json" \
  --write-md "$REVIEW_DIR/report.md"

compose_report.py writes both report.json and report.md deterministically. Do NOT read report_markdown out of report.json and re-write it via heredoc.
Check validation.warnings: fix high-severity warnings (corrupt/missing artifacts), then re-deposit the affected artifact and re-compose; present medium-severity warnings (checklist failures, runway inconsistencies, metrics gaps) in the report; note low/info. --strict blocks only on high-severity warnings.
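A possible way to sort the warnings into the three actions is sketched below. The shape of each warning (a dict with `severity` and `code` keys) is an assumption for illustration; artifact-schemas.md is the authority on the real structure, and `triage_warnings` is a hypothetical name.

```python
from collections import defaultdict

# Action per severity, following the rule stated in the text.
ACTIONS = {"high": "fix", "medium": "present", "low": "note", "info": "note"}


def triage_warnings(report: dict) -> dict:
    """Group validation warnings by the action they call for."""
    grouped = defaultdict(list)
    for warning in report.get("validation", {}).get("warnings", []):
        severity = str(warning.get("severity", "")).lower()
        # Unknown severities are treated as blocking so nothing slips through.
        grouped[ACTIONS.get(severity, "fix")].append(warning)
    return dict(grouped)
```

Anything that lands in the `fix` group should be resolved and the report re-composed before moving on.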
Post-write verification: compose_report.py exits non-zero (code 2) if the declared output files don't exist or are empty after writing. If compose exits non-zero, stop and report the exact stderr — do not proceed.
python3 "$SCRIPTS/verify_review.py" --dir "$REVIEW_DIR" --gate 1 --pretty

If the exit code is non-zero: read summary.errors. Fix the issue by re-running the failing step, then re-run verify_review.py --gate 1. Do not proceed until it exits 0.
verify_review.py --gate 2 requires commentary.json whenever unit_economics.json and runway.json are real (non-stub), and explore.py embeds it into the interactive explorer. Author it now, in the main thread, from the artifacts you have already seen (checklist summary, unit-economics ratings, runway scenarios).
Schema: $REFS/artifact-schemas.md § commentary.json. headline is required; include only the lens keys whose artifacts exist (valid lens keys: runway, unit_economics, stress_test, raise_planner).
cat > "$REVIEW_DIR/commentary.json" <<'COMMENTARY_EOF'
{
"headline": "<one-sentence financial health summary>",
"investor_talking_points": [
"<sentence the founder can say out loud during a fundraise conversation>"
],
"lenses": {
"runway": {"callout": "<key insight>", "highlight": "<secondary observation>", "watch_out": "<risk>"},
"unit_economics": {"callout": "<key insight>", "watch_out": "<risk>"}
}
}
COMMENTARY_EOF

Ground every sentence in artifact values; never invent numbers. If both unit_economics.json and runway.json are skipped stubs (qualitative path), skip this step; Gate 2 will not require the file.
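Before writing commentary.json, the two structural rules stated above (a required headline and a closed set of lens keys) can be checked with a few lines of Python. This is only a pre-flight sketch with a hypothetical function name; verify_review.py and artifact-schemas.md remain the authority on the full schema.

```python
# Lens keys named as valid in the text above.
VALID_LENSES = {"runway", "unit_economics", "stress_test", "raise_planner"}


def commentary_problems(commentary: dict) -> list[str]:
    """Return structural problems; an empty list means the basics are in place."""
    problems = []
    headline = commentary.get("headline")
    if not isinstance(headline, str) or not headline.strip():
        problems.append("headline is required")
    for key in commentary.get("lenses", {}):
        if key not in VALID_LENSES:
            problems.append(f"unknown lens key: {key}")
    return problems
```

The check says nothing about whether the sentences are grounded in artifact values; that part still needs a careful read.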
python3 "$SCRIPTS/visualize.py" --dir "$REVIEW_DIR" -o "$REVIEW_DIR/report.html"
python3 "$SCRIPTS/explore.py" --dir "$REVIEW_DIR" -o "$REVIEW_DIR/explore.html"

Generate files silently; present paths after Gate 2 passes.
Dispatch the financial-model-review sub-agent in Context B. Dispatch via the Task tool after compose_report.py has successfully written both report.json and report.md.
Mitigation 2 protocol: the main thread reads the structured coaching_payload from report.json and inlines it into the dispatch prompt. The sub-agent does NOT Read full report.md — it consumes coaching_payload directly, performs Grep idempotency, Edits via the per-run uuid insertion_marker, and Grep-verifies all artifacts. See the financial-model-review agent body's "Context B — Post-compose coaching dispatch (POST_COMPOSE_COACHING)" section for the full procedure.
python3 -c '
import json, sys
data = json.load(open(sys.argv[1]))
print(json.dumps(data["coaching_payload"], indent=2))
' "$REVIEW_DIR/report.json"

The payload prints to stdout; copy it from the tool result into the dispatch prompt below. (Never capture it into a shell variable: each Bash call runs in a fresh shell, so the variable would be gone by the time you need it.)
Dispatch prompt template:
CONTEXT: POST_COMPOSE_COACHING
You are dispatched to add coaching commentary to a financial model review.
The compose_report.py script has finished. The structured `coaching_payload`
from report.json is:
<paste the coaching_payload JSON printed by the previous Bash command here verbatim>
Follow your agent body's Context B procedure
(POST_COMPOSE_COACHING):
1. grep_idempotency_check — Grep "## Coaching Commentary" (output_mode:count)
and Grep the EXACT coaching_payload.insertion_marker (output_mode:count)
on coaching_payload.report_path. Apply the 6-state decision matrix.
2. Compose commentary from the inlined coaching_payload (failed_items,
warned_items, summary, high_severity_warnings, company_name).
If truncated:true, acknowledge that not all failures are shown.
Do NOT Read the full report.md.
3. edit_via_marker — single Edit on coaching_payload.report_path:
old_string = coaching_payload.insertion_marker (EXACT uuid string)
new_string = "## Coaching Commentary\n\n<your commentary>"
4. self_verify_artifacts_via_grep_run_id — Grep run_id from each producer
artifact (inputs.json, checklist.json, unit_economics.json, runway.json),
confirm all 4 match; bounded Read (limit:1) on report.json and report.md;
re-Grep the marker (must be 0) and the "## Coaching Commentary" header
(must be 1).
5. Return the success payload:
{"status": "complete", "review_dir": "<path>", "report_path": "<path>",
"runway_months": <number or null>, "overall_status": "<status>",
"red_flags": [<list>], "high_severity_warnings": [<list>]}
OR if verification fails:
{"status": "blocked", "reason": "<specific gap>"}
Stop after returning JSON. Do not narrate.
After the sub-agent returns: apply the tolerant JSON extraction protocol to obtain the success/blocked payload. If status == "blocked", stop and report the reason to the founder. If status == "complete", proceed to Verification Gate 2.
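The marker-based edit and its re-check (steps 1, 3 and 4 of the dispatch prompt) can be pictured with plain string operations. The sub-agent itself works with Grep and Edit rather than Python, and the real 6-state decision matrix lives in the agent body; this sketch covers only the three most obvious states, and the function name is hypothetical.

```python
HEADER = "## Coaching Commentary"


def insert_commentary(report_md: str, marker: str, commentary: str) -> str:
    """Replace the one-time insertion marker with the commentary section."""
    markers = report_md.count(marker)
    headers = report_md.count(HEADER)
    if markers == 0 and headers == 1:
        return report_md  # already applied on an earlier attempt
    if markers != 1 or headers != 0:
        raise ValueError(f"unexpected state: {markers} marker(s), {headers} header(s)")
    updated = report_md.replace(marker, f"{HEADER}\n\n{commentary}", 1)
    # Mirror the re-check: marker gone, header present exactly once.
    if updated.count(marker) != 0 or updated.count(HEADER) != 1:
        raise ValueError("post-edit verification failed")
    return updated
```

Because the marker is consumed by the edit, running the function a second time on its own output is a no-op, which is the property the idempotency check is after.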
rm -rf "$REVIEW_DIR/.staging" 2>/dev/null || true
python3 "$SCRIPTS/verify_review.py" --dir "$REVIEW_DIR" --pretty

This is the final quality gate. If it exits non-zero, fix the issues before presenting anything to the founder. Once it passes, present everything to the founder:
- Present $REVIEW_DIR/report.md, the primary deliverable (do NOT inline the markdown in the assistant message; present the file path)
- Present the report.html file path
- Present the explore.html file path
Do NOT inline report_markdown in the assistant message. The founder reads the file via the path. (Closing the ~80-130K context accumulation issue.)
This skill runs inline in the main thread (not as a sub-agent). The final outcome the main thread delivers to the founder is:
- The path to $REVIEW_DIR/report.md, the primary deliverable.
- The structured success payload from the Context B sub-agent: {status, review_dir, report_path, runway_months, overall_status, red_flags, high_severity_warnings}.
- Optionally: the HTML report and explorer paths.
- Each of 46 items: pass / fail / warn / not_applicable
- score_pct = (pass + 0.5 * warn) / (total - not_applicable) * 100
- Overall: "strong" (>=85%), "solid" (>=70%), "needs_work" (>=50%), "major_revision" (<50%)
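The scoring rule can be worked through in a few lines. This is a sketch of the formula as stated, not the code in checklist.py; the function name is hypothetical, and raising on an all-not_applicable checklist is an assumption, since the text does not say what happens in that case.

```python
from collections import Counter


def score_checklist(statuses: list[str]) -> tuple[float, str]:
    """Apply the score_pct formula and map the result to an overall label."""
    counts = Counter(statuses)
    applicable = len(statuses) - counts["not_applicable"]
    if applicable == 0:
        # Assumption: the source does not define this case.
        raise ValueError("no applicable checklist items")
    score_pct = 100 * (counts["pass"] + 0.5 * counts["warn"]) / applicable
    if score_pct >= 85:
        label = "strong"
    elif score_pct >= 70:
        label = "solid"
    elif score_pct >= 50:
        label = "needs_work"
    else:
        label = "major_revision"
    return score_pct, label
```

With 46 items split as 30 pass, 6 warn, 4 fail and 6 not_applicable, the denominator is 40 and the score is (30 + 3) / 40 * 100 = 82.5, which lands in "solid".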