
Commit 855e459

terp committed
feat: add Auto Router for per-task model selection
Adds an optional "Auto (smart routing)" picker entry (slug codex-auto) that routes each Codex task to the cheapest configured model that can handle it. A cheap classifier model scores every candidate 0.0-1.0 from a capability card; the shim picks the cheapest candidate whose score clears a threshold (default 0.7), caches the decision per task, and falls back safely on any error so a request never breaks. The classifier never sees price, so it can't be biased toward expensive models.

- codex_shim/router.py: config loading, task-signal extraction, classifier prompt, score parsing, cheapest-among-viable selection, per-task cache.
- server.py: applies the router on /v1/responses, /v1/responses/compact, and /v1/chat/completions; runs the classifier over the configured backend; gates the virtual model in /v1/models, /api/models, /health, and the picker.
- catalog.py + cli.py: catalog entry and `codex-shim list`/`model use` support.
- Configured via an optional `router` block in ~/.codex-shim/models.json. Env knobs: CODEX_SHIM_DISABLE_ROUTER, CODEX_SHIM_ROUTER_TIMEOUT, CODEX_SHIM_ROUTER_MAX_TOKENS, CODEX_SHIM_ROUTER_LOG.
- docs/AUTO_ROUTER.md, README section, and a runnable offline proof at examples/auto_router_demo.py.

Verification:
- python3 -m pytest tests/ -q (113 passed)
- python3 -m compileall codex_shim/ examples/ -q
- python3 examples/auto_router_demo.py (RESULT: PASS)

Generated with [Devin](https://cli.devin.ai/docs)
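For readers skimming the commit message, the "cheapest-among-viable" rule can be illustrated with a small standalone sketch. This is not the code in `codex_shim/router.py`: the `Candidate` class and the `pick_model` function are hypothetical names, treating "clears the threshold" as `>=` is an assumption, and returning the configured `default` slug when no candidate is viable is also an assumption about the fallback.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for one entry of the router's `candidates` list."""
    slug: str
    cost: float


def pick_model(candidates, scores, threshold=0.7, default=None):
    """Return the slug of the cheapest candidate whose score clears the threshold.

    `scores` maps slug -> classifier score in [0.0, 1.0]. Price is only used
    here, after scoring, so it cannot influence the classifier.
    Assumption: a missing score counts as 0.0, and when nothing is viable the
    configured `default` slug is returned.
    """
    viable = [c for c in candidates if scores.get(c.slug, 0.0) >= threshold]
    if not viable:
        return default
    return min(viable, key=lambda c: c.cost).slug


candidates = [Candidate("minimax-m3", 0.3), Candidate("opus", 5.0)]

# Both models clear 0.7, so the cheaper one wins.
easy = pick_model(candidates, {"minimax-m3": 0.9, "opus": 0.95}, default="minimax-m3")

# Only the expensive model clears 0.7, so the task escalates.
hard = pick_model(candidates, {"minimax-m3": 0.4, "opus": 0.9}, default="minimax-m3")

# Nothing clears 0.7, so the sketch falls back to the default slug.
none_viable = pick_model(candidates, {"minimax-m3": 0.1, "opus": 0.2}, default="minimax-m3")
```

Keeping cost out of the scoring step and applying it only in the final `min` is what lets the selection stay cheap-first without the classifier ever seeing prices.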
1 parent 14a05ba commit 855e459

10 files changed

Lines changed: 1572 additions & 8 deletions


CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -9,6 +9,18 @@ and this project does not yet follow semantic versioning (pre-1.0).
 
 ### Added
 
+- Auto Router (`codex_shim/router.py`): an optional `Auto (smart routing)` picker
+  entry (slug `codex-auto`) that routes each task to the cheapest configured
+  model that can handle it. A cheap classifier model scores every candidate
+  `0.0–1.0` from a capability card, the shim picks the cheapest candidate whose
+  score clears `threshold` (default `0.7`), caches the decision per task, and
+  falls back safely on any error. Configured via an optional `router` block in
+  `~/.codex-shim/models.json`; gated in `/health`, `/v1/models`, `/api/models`,
+  the generated catalog, and `codex-shim list`. Env knobs:
+  `CODEX_SHIM_DISABLE_ROUTER`, `CODEX_SHIM_ROUTER_TIMEOUT`,
+  `CODEX_SHIM_ROUTER_MAX_TOKENS`, `CODEX_SHIM_ROUTER_LOG`. Documented in
+  `docs/AUTO_ROUTER.md` with a runnable offline proof at
+  `examples/auto_router_demo.py`.
 - Cursor/Composer subscription passthrough for slug `composer-2-5`. When
   `cursor-agent login` is active, the shim spawns `cursor-agent --print` with
   CLI OAuth (no Dashboard API key). The slug is auth-gated in `/health`,

README.md

Lines changed: 51 additions & 0 deletions
@@ -39,6 +39,10 @@ local:
   exposes `composer-2-5` and routes through your Cursor subscription — no
   Dashboard API key (`crsr_…`) required. See
   [`docs/subscription-integration.md`](docs/subscription-integration.md).
+- **Auto Router (optional).** Add an `Auto (smart routing)` picker entry that
+  uses a cheap classifier model to route each task to the cheapest configured
+  model that can handle it — trivial turns stay cheap, hard turns escalate. See
+  [`docs/AUTO_ROUTER.md`](docs/AUTO_ROUTER.md).
 - **Prompt-catching/proxy-friendly architecture.** Put a local proxy in front
   of the shim to dedupe boilerplate, inject stable instructions, repair
   pseudo-tool text, or route prompts by policy before they hit an upstream.
@@ -564,6 +568,53 @@ GLM, etc.) round-trip through `reasoning.encrypted_content` items.
 
 ---
 
+## Auto Router (smart routing)
+
+Optionally add one extra picker entry — **`Auto (smart routing)`** (slug
+`codex-auto`) — that chooses the right model *per task*: trivial turns go to a
+cheap model, hard turns escalate to your strongest one. It runs entirely on the
+models you already configure.
+
+On each new task the shim asks a cheap **classifier** model you nominate to score
+every candidate `0.0–1.0` (how likely it nails the task first try), reading a
+short **capability card** per candidate. It then routes to the **cheapest
+candidate whose score clears `threshold`** (default `0.7`), caches that decision
+for the task's tool-call round-trips, and falls back safely on any error. The
+classifier never sees price, so it can't be biased toward expensive models.
+
+Turn it on by adding a `router` block to `~/.codex-shim/models.json`:
+
+```jsonc
+"router": {
+  "enabled": true,
+  "slug": "codex-auto",
+  "classifier": "minimax-m3", // slug of a cheap configured model
+  "threshold": 0.7,
+  "default": "minimax-m3",
+  "cache": true,
+  "candidates": [
+    { "slug": "minimax-m3", "cost": 0.3, "supports_images": false,
+      "card": "Cheap, fast. Single-file edits, codegen, simple refactors." },
+    { "slug": "opus", "cost": 5.0, "supports_images": true,
+      "card": "Frontier. Big multi-file refactors, hard debugging, images." }
+  ]
+}
+```
+
+Prove it end to end with no keys and no network:
+
+```bash
+python3 examples/auto_router_demo.py
+```
+
+It spins up a mock multi-backend server, starts the **real** shim with the router
+on, and shows trivial→cheap, medium→mid, hard→strong, image→image-capable, and a
+repeat served from cache. Full configuration, env knobs (`CODEX_SHIM_ROUTER_LOG`,
+`CODEX_SHIM_DISABLE_ROUTER`, …), and failure behavior are in
+[`docs/AUTO_ROUTER.md`](docs/AUTO_ROUTER.md).
+
+---
+
 ## Tool calls and agent loops
 
 Codex expects Responses-API output items. Most BYOK upstreams speak either
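
Reviewer note on the README hunk above: the new section says the router "falls back safely on any error". One way that can work at the score-parsing step is sketched below. This is a guess at the shape, not the code in `codex_shim/router.py`: `parse_scores` is a hypothetical name, and it assumes the classifier replies with a JSON object mapping candidate slug to score, which the diff shown here does not confirm.

```python
import json


def parse_scores(raw, known_slugs):
    """Parse classifier output into {slug: score}, or None when unusable.

    Assumption: the classifier replies with a JSON object such as
    {"minimax-m3": 0.9, "opus": 0.95}. Returning None signals the caller
    to skip routing and use its fallback model instead of raising.
    """
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        # Not a string, or not valid JSON (JSONDecodeError is a ValueError).
        return None
    if not isinstance(data, dict):
        return None

    scores = {}
    for slug in known_slugs:
        value = data.get(slug)
        # Reject booleans explicitly, since bool is a subclass of int.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        # Clamp into the documented 0.0-1.0 range.
        scores[slug] = min(1.0, max(0.0, float(value)))
    return scores or None


slugs = ["minimax-m3", "opus"]
good = parse_scores('{"minimax-m3": 0.9, "opus": 1.4}', slugs)
garbage = parse_scores("I think opus is best", slugs)
unknown_only = parse_scores('{"other-model": 0.5}', slugs)
```

Returning `None` rather than raising keeps every failure mode (timeout text, prose instead of JSON, unknown slugs) on one code path, which is the property the README promises.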

codex_shim/catalog.py

Lines changed: 5 additions & 1 deletion
@@ -3,10 +3,12 @@
 import json
 from pathlib import Path
 
+from . import router as router_module
 from .settings import (
     CHATGPT_MODEL_SLUG,
     PROVIDER_NAME,
     ShimModel,
+    available_model_slugs,
     chatgpt_passthrough_available,
     default_model_slug,
     load_chatgpt_passthrough_catalog_models,
@@ -94,9 +96,11 @@ def chatgpt_passthrough_entry() -> dict:
     return chatgpt_passthrough_entries()[0]
 
 
-def write_catalog(models: list[ShimModel], path: Path) -> Path:
+def write_catalog(models: list[ShimModel], path: Path, router_config=None) -> Path:
     path.parent.mkdir(parents=True, exist_ok=True)
     entries: list[dict] = []
+    if router_config is not None and router_module.router_is_active(router_config, available_model_slugs(models)):
+        entries.append(router_module.router_catalog_entry(router_config))
     if chatgpt_passthrough_available():
         entries.extend(chatgpt_passthrough_entries())
     if cursor_passthrough_available():

codex_shim/cli.py

Lines changed: 30 additions & 7 deletions
@@ -14,6 +14,7 @@
 import struct
 from urllib.request import urlopen
 
+from . import router as router_module
 from .catalog import _toml_escape, codex_config_overrides, write_catalog, write_config
 from .cursor_passthrough import (
     cursor_passthrough_available,
@@ -27,6 +28,7 @@
     DEFAULT_PORT,
     PROVIDER_NAME,
     ModelSettings,
+    available_model_slugs,
     chatgpt_passthrough_available,
     chatgpt_passthrough_display_names,
     chatgpt_passthrough_slugs,
@@ -159,23 +161,35 @@ def _load_models(settings_path: Path):
         raise SystemExit(f"Settings file is not valid JSON: {expanded}: {exc}") from exc
 
 
+def _active_router(models, settings_path: Path):
+    """RouterConfig when the Auto Router is enabled and has a usable candidate."""
+    config = router_module.load_router_config(Path(settings_path).expanduser())
+    if config and router_module.router_is_active(config, available_model_slugs(models)):
+        return config
+    return None
+
+
 def generate(settings_path: Path, port: int) -> None:
     models = _load_models(settings_path)
     try:
         default_model_slug(models)
     except ValueError as exc:
         raise SystemExit(str(exc)) from exc
-    write_catalog(models, CATALOG_PATH)
+    router_config = router_module.load_router_config(Path(settings_path).expanduser())
+    write_catalog(models, CATALOG_PATH, router_config=router_config)
     write_config(models, CONFIG_PATH, CATALOG_PATH, port)
     print(f"Generated {len(models)} model entries:")
+    if _active_router(models, settings_path) is not None:
+        print(f" auto router: {router_config.slug} ({router_config.display_name})")
     print(f" catalog: {CATALOG_PATH}")
     print(f" config: {CONFIG_PATH}")
     print("No files under ~/.codex were modified.")
 
 
 def install_codex_config(settings_path: Path, port: int, model_slug: str | None = None) -> None:
     models = _load_models(settings_path)
-    default_slug = _resolve_model_slug(models, model_slug)
+    router_config = _active_router(models, settings_path)
+    default_slug = _resolve_model_slug(models, model_slug, router_config)
     CODEX_CONFIG_PATH.parent.mkdir(parents=True, exist_ok=True)
     RUNTIME_DIR.mkdir(parents=True, exist_ok=True)
     original = CODEX_CONFIG_PATH.read_text() if CODEX_CONFIG_PATH.exists() else ""
@@ -189,7 +203,7 @@ def install_codex_config(settings_path: Path, port: int, model_slug: str | None
     previous_top_level = _extract_top_level_key_lines(CODEX_CONFIG_BACKUP_PATH.read_text(), MANAGED_TOP_LEVEL_KEYS)
     cleaned = _remove_top_level_keys(cleaned, MANAGED_TOP_LEVEL_KEYS)
     cleaned = _remove_section(cleaned, f"model_providers.{PROVIDER_NAME}")
-    provider_name = _provider_display_name(models, default_slug)
+    provider_name = _provider_display_name(models, default_slug, router_config)
     top_block, provider_block = _managed_config_blocks(
         default_slug, port, previous_top_level, provider_name=provider_name
     )
@@ -200,6 +214,9 @@ def install_codex_config(settings_path: Path, port: int, model_slug: str | None
 def list_models(settings_path: Path) -> int:
     models = _load_models(settings_path)
     rows: list[tuple[str, str, str, str]] = []
+    router_config = _active_router(models, settings_path)
+    if router_config is not None:
+        rows.append((router_config.slug, router_config.display_name, "per-task pick", "auto"))
     if chatgpt_passthrough_available():
         for slug, display_name in chatgpt_passthrough_display_names().items():
             rows.append((slug, display_name, slug, "chatgpt"))
@@ -611,7 +628,9 @@ def _foreground_codex_app() -> None:
         pass
 
 
-def _provider_display_name(models, slug: str) -> str:
+def _provider_display_name(models, slug: str, router_config=None) -> str:
+    if router_config is not None and slug == router_config.slug:
+        return router_config.display_name
     if chatgpt_passthrough_available():
         display_name = chatgpt_passthrough_display_names().get(slug)
         if display_name:
@@ -783,15 +802,17 @@ def _override_args(settings_path: Path, port: int) -> list[str]:
     return args
 
 
-def _resolve_model_slug(models, requested: str | None) -> str:
+def _resolve_model_slug(models, requested: str | None, router_config=None) -> str:
     if requested is None:
         current = _current_managed_model()
-        if current in _valid_model_slugs(models):
+        if current in _valid_model_slugs(models, router_config):
             return current
         try:
             return default_model_slug(models)
         except ValueError as exc:
             raise SystemExit(str(exc)) from exc
+    if router_config is not None and requested == router_config.slug:
+        return requested
     if is_chatgpt_passthrough_slug(requested):
         if not chatgpt_passthrough_available():
             raise SystemExit(
@@ -853,8 +874,10 @@ def _current_managed_model() -> str | None:
     return None
 
 
-def _valid_model_slugs(models) -> set[str]:
+def _valid_model_slugs(models, router_config=None) -> set[str]:
     slugs = {model.slug for model in usable_byok_models(models)}
+    if router_config is not None:
+        slugs.add(router_config.slug)
     if chatgpt_passthrough_available():
         slugs.update(chatgpt_passthrough_slugs())
     if cursor_passthrough_available():
