feat(platform): Add LLM registry core - DB layer + in-memory cache #12359

Draft
Bentlybro wants to merge 10 commits into feat/llm-registry-schema from feat/llm-registry-core

Conversation

@Bentlybro Bentlybro commented Mar 10, 2026

Member

Summary

Add LLM registry core implementation - Part 2 of 3 in incremental rollout.

Builds on PR #12357 (schema foundation) to provide a database access layer and in-memory caching for dynamic LLM model management.

Changes

Registry Core (`backend/data/llm_registry/`)

  • DB access layer - Prisma queries with full relation loading (Provider, Costs, Creator)
  • In-memory cache - Asyncio-locked singleton with atomic refresh
  • Public API - `get_model()`, `get_all_models()`, `get_enabled_models()`, `get_schema_options()`, `get_default_model_slug()`
  • Dataclasses - `RegistryModel`, `RegistryModelCost`, `RegistryModelCreator` (immutable, frozen)
  • ModelMetadata - Re-exports from `blocks.llm` to avoid type collision
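
A minimal sketch of the lock-guarded refresh with an atomic swap described in the list above, assuming a module-level dict as the cache. `fetch_rows()` is a hypothetical stand-in for the Prisma query, and the `RegistryModel` fields are a reduced, illustrative subset; the PR's actual code may differ.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryModel:
    # Illustrative subset of fields, not the PR's full dataclass.
    slug: str
    display_name: str
    is_enabled: bool = True


# Module-level cache; readers only ever see a fully built dict.
_dynamic_models: dict[str, RegistryModel] = {}
_lock = asyncio.Lock()


async def fetch_rows() -> list[dict]:
    """Hypothetical stand-in for the Prisma query."""
    return [
        {"slug": "model-a", "display_name": "Model A", "is_enabled": True},
        {"slug": "model-b", "display_name": "Model B", "is_enabled": False},
    ]


async def refresh_llm_registry() -> None:
    """Rebuild the cache off to the side, then swap it in with one assignment."""
    global _dynamic_models
    async with _lock:  # serializes concurrent refreshes within this process only
        rows = await fetch_rows()
        new_models = {row["slug"]: RegistryModel(**row) for row in rows}
        _dynamic_models = new_models  # the swap: a single name rebinding


def get_model(slug: str):
    """Return the cached model for a slug, or None if it is unknown."""
    return _dynamic_models.get(slug)


def get_enabled_models() -> list[RegistryModel]:
    """Return only the models flagged as enabled."""
    return [m for m in _dynamic_models.values() if m.is_enabled]
```

Because the new dict is built completely before the single assignment, a reader calling `get_model()` during a refresh sees either the old snapshot or the new one, never a partially filled cache.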

Startup Integration

  • Registry refresh before block initialization
  • Graceful fallback if refresh fails (no blocks consume registry yet - comes in future PR)
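
The startup ordering and fallback above could look roughly like the following sketch. Both `refresh_llm_registry()` and `initialize_blocks()` are hypothetical stand-ins here (the refresh is made to fail on purpose), and `startup()` is an assumed name rather than the hook used in rest_api.py.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def refresh_llm_registry() -> None:
    """Hypothetical stand-in that simulates a failing database query."""
    raise ConnectionError("database unavailable")


async def initialize_blocks() -> str:
    """Hypothetical stand-in for block initialization."""
    return "blocks ready"


async def startup() -> str:
    # Refresh the registry first so blocks could read it once they depend on it.
    try:
        await refresh_llm_registry()
        logger.info("LLM registry refreshed")
    except Exception as exc:
        # Graceful fallback: log a warning and continue with an empty registry.
        logger.warning("LLM registry refresh failed: %s", exc)
    return await initialize_blocks()
```

This shows only the continue-on-failure behavior the PR describes; a review comment later in this thread argues for aborting startup instead once blocks consume the registry.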

Review Feedback Addressed

  • ✅ Fixed ModelMetadata duplicate type collision (now imports from blocks.llm)
  • ✅ Removed `_json_to_dict()` helper - use `dict(value or {})` inline
  • ✅ Added warning when Provider relation is missing (data corruption indicator)
  • ✅ Optimized `get_default_model_slug()` - single sort pass with `next()`
  • ✅ Optimized `_build_schema_options()` - list comprehension instead of loop
  • ✅ Moved `llm_registry` import to top-level in rest_api.py
  • ✅ Explicit `max_output_tokens` fallback to `context_window` when null
  • ✅ Rebased onto updated schema (PR #12357) with FK constraints
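
As an illustration of the "single sort pass with `next()`" item, here is one possible shape for the default-slug lookup. It is a sketch under assumptions: the function takes the model dict as a parameter for testability, and the sort key (recommended first, then case-insensitive display name) is one reasonable choice, not necessarily the one in the PR.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryModel:
    # Illustrative subset of fields.
    slug: str
    display_name: str
    is_enabled: bool
    is_recommended: bool


def get_default_model_slug(models: dict[str, RegistryModel]) -> str | None:
    """Pick the first enabled model, preferring recommended ones."""
    ordered = sorted(
        (m for m in models.values() if m.is_enabled),
        # False sorts before True, so recommended models come first;
        # ties are broken by case-insensitive display name.
        key=lambda m: (not m.is_recommended, m.display_name.lower()),
    )
    return next((m.slug for m in ordered), None)
```

Sorting on a composite key makes the result independent of dict insertion order, which also addresses the case where more than one row is marked recommended.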

Known Limitations (Deferred)

  • Multi-worker coordination - asyncio.Lock only works per-process (Redis cache in future PR)
  • Mutable cache exposure - Internal dicts accessible (acceptable for now, Redis will fix)

Testing

  • Prisma client generates successfully
  • All imports resolve correctly
  • Registry refresh loads models from DB
  • Cache atomic swap works correctly
  • Backend starts successfully with registry

Stacked PRs

@Bentlybro Bentlybro requested a review from a team as a code owner March 10, 2026 18:14
@Bentlybro Bentlybro requested review from 0ubbe and majdyz and removed request for a team March 10, 2026 18:14
@github-project-automation github-project-automation Bot moved this to 🆕 Needs initial review in AutoGPT development kanban Mar 10, 2026
@github-actions github-actions Bot added platform/backend AutoGPT Platform - Back end size/xl labels Mar 10, 2026

coderabbitai Bot commented Mar 10, 2026

Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 55f2e023-e31b-4cce-b8da-33f8559b83d6

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds a new LLM registry: database schema (Prisma + SQL migration), Python model and registry implementation with in-memory caching and async refresh, package exports, and a startup-time registry refresh call before block initialization.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Database schema & migration: `autogpt_platform/backend/schema.prisma`, `autogpt_platform/backend/migrations/20260310_add_llm_registry_schema/migration.sql` | Adds LLM registry schema: LlmCostUnit enum and models LlmProvider, LlmModel, LlmModelCost, LlmModelCreator, LlmModelMigration with indexes, constraints, and foreign keys. |
| Registry implementation: `autogpt_platform/backend/backend/data/llm_registry/registry.py` | New in-memory registry with data classes (RegistryModel, RegistryModelCost, RegistryModelCreator), async refresh_llm_registry() that queries DB, builds objects, atomically swaps cache, and exposes query helpers (get_model, get_all_models, get_enabled_models, get_schema_options, get_default_model_slug, get_all_model_slugs_for_validation). |
| Model types: `autogpt_platform/backend/backend/data/llm_registry/model.py` | Adds ModelMetadata NamedTuple describing model metadata fields (provider, context_window, max_output_tokens, display_name, provider_name, creator_name, price_tier). |
| Package exports: `autogpt_platform/backend/backend/data/llm_registry/__init__.py` | New package initializer re-exporting model types, registry classes, functions, and refresh_llm_registry in `__all__`. |
| Startup integration: `autogpt_platform/backend/backend/api/rest_api.py` | Calls refresh_llm_registry() during application startup before initializing blocks; logs success or warns on failure and continues startup. |

Sequence Diagram

sequenceDiagram
    participant App as Application Startup
    participant Registry as LLM Registry
    participant DB as Database
    participant Blocks as Block Initialization

    App->>Registry: refresh_llm_registry()
    activate Registry
    Registry->>DB: Query LlmModel with relations (Provider, Creator, Costs)
    DB-->>Registry: Model records
    Registry->>Registry: Build RegistryModel objects and costs/creator metadata
    Registry->>Registry: Atomically swap _dynamic_models and _schema_options
    Registry-->>App: Refresh complete (or error logged)
    deactivate Registry

    App->>Blocks: initialize_blocks()
    activate Blocks
    Blocks->>Registry: get_enabled_models()/get_schema_options()
    Registry-->>Blocks: Cached registry data
    Blocks-->>App: Initialization complete
    deactivate Blocks

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

Review effort 4/5

Suggested reviewers

  • 0ubbe
  • majdyz

Poem

🐰 A registry hops in with models to share,
Costs and creators all handled with care,
Prisma sows rows, async refresh wakes,
Cache swaps at dawn, while startup bakes,
Blocks fetch the bloom — a tidy affair.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The pull request title accurately summarizes the main change: adding the LLM registry core with database layer and in-memory cache functionality. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.00%, which is sufficient. The required threshold is 80.00%. |
| Description check | ✅ Passed | The PR description is directly related to the changeset, providing clear context about the LLM registry implementation with specific details on DB layer, in-memory cache, public API, data models, and integration points. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

github-actions Bot commented Mar 10, 2026

Contributor

🔍 PR Overlap Detection

This check compares your PR against all other open PRs targeting the same branch to detect potential merge conflicts early.

🔴 Merge Conflicts Detected

The following PRs have been tested and will have merge conflicts if merged after this PR. Consider coordinating with the authors.

🟡 Medium Risk — Some Line Overlap

These PRs have some overlapping changes:

  • feat(platform): Add LLM registry database schema and seed data #12357 (Bentlybro · updated 3m ago)

    • autogpt_platform/backend/migrations/20260310_seed_llm_registry/migration.sql: L1-260
    • autogpt_platform/backend/schema.prisma: L1301-1464
    • autogpt_platform/backend/migrations/20260310_add_llm_registry_schema/migration.sql: L1-148
  • feat(frontend): Add LLM registry admin UI #12468 (Bentlybro · updated 2h ago)

    • autogpt_platform/backend/backend/data/llm_registry/registry.py: L1-240
    • autogpt_platform/backend/backend/data/llm_registry/model.py: L1-9
    • autogpt_platform/backend/schema.prisma: L1301-1464
    • autogpt_platform/backend/backend/data/llm_registry/__init__.py: L1-31
    • autogpt_platform/backend/backend/api/rest_api.py: L37-43, L117-147
    • autogpt_platform/backend/migrations/20260310_add_llm_registry_schema/migration.sql: L1-148
  • feat(platform): Add LLM registry admin write API #12467 (Bentlybro · updated 1d ago)

    • autogpt_platform/backend/backend/data/llm_registry/registry.py: L1-240
    • autogpt_platform/backend/backend/data/llm_registry/model.py: L1-9
    • autogpt_platform/backend/schema.prisma: L1301-1464
    • autogpt_platform/backend/backend/data/llm_registry/__init__.py: L1-31
    • autogpt_platform/backend/backend/api/rest_api.py: L37-43, L117-147
    • autogpt_platform/backend/migrations/20260310_add_llm_registry_schema/migration.sql: L1-148

🟢 Low Risk — File Overlap Only

These PRs touch the same files but different sections.

Summary: 3 conflict(s), 3 medium risk, 4 low risk (out of 10 PRs with file overlap)


Auto-generated on push. Ignores: openapi.json, lock files.

Comment thread autogpt_platform/backend/backend/data/llm_registry/registry.py Outdated

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 4

🧹 Nitpick comments (1)
autogpt_platform/backend/backend/data/llm_registry/registry.py (1)

207-217: Make recommended-model selection deterministic.

This branch walks _dynamic_models in database insertion order. Because find_many() has no explicit ordering and the schema allows more than one recommended row, the default slug can flap between refreshes. Sort the recommended subset the same way as the fallback, or enforce uniqueness in the schema.

One simple way to stabilize the result
 def get_default_model_slug() -> str | None:
     """Get the default model slug (first recommended, or first enabled)."""
     # Prefer recommended models
-    for model in _dynamic_models.values():
+    for model in sorted(
+        _dynamic_models.values(), key=lambda m: m.display_name.lower()
+    ):
         if model.is_recommended and model.is_enabled:
             return model.slug
 
     # Fallback to first enabled model
-    for model in sorted(_dynamic_models.values(), key=lambda m: m.display_name):
+    for model in sorted(
+        _dynamic_models.values(), key=lambda m: m.display_name.lower()
+    ):
         if model.is_enabled:
             return model.slug
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@autogpt_platform/backend/backend/data/llm_registry/registry.py` around lines
207 - 217, The get_default_model_slug function picks a recommended model by
iterating _dynamic_models in insertion order which can yield nondeterministic
results; change the first loop to iterate over sorted(_dynamic_models.values(),
key=lambda m: m.display_name) and check both model.is_recommended and
model.is_enabled (same as the fallback ordering) so recommended-model selection
is deterministic even if multiple rows are marked recommended.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@autogpt_platform/backend/backend/api/rest_api.py`:
- Around line 120-127: If refresh_llm_registry() fails, do not proceed to
initialize_blocks(); instead surface the failure by either re-raising the
exception or returning/aborting startup and mark health/readiness as degraded so
callers know initialization is incomplete. Update the try/except around
refresh_llm_registry() to stop further execution on error (referencing
refresh_llm_registry and initialize_blocks) and ensure the health/readiness
state is set to a failure/degraded status so block initialization that depends
on registry data is skipped.

In `@autogpt_platform/backend/backend/data/llm_registry/registry.py`:
- Around line 27-36: RegistryModelCost currently drops the pricing unit, making
RUN vs TOKENS indistinguishable; update the RegistryModelCost dataclass to
include a unit field (matching LlmModelCost's unit enum/string) and modify any
conversion/mapping logic that constructs RegistryModelCost from rows to populate
this unit instead of discarding it; also make the same change for the other
cached cost dataclass referenced around the duplicate block (the second cost
dataclass at lines ~97-104) so both in-memory representations preserve the
original unit and downstream billing/selection retains correct semantics.
- Around line 133-138: The ModelMetadata constructor call is using the wrong
keyword list: remove supports_vision and supply the required fields
display_name, provider_name, creator_name, and price_tier when creating
ModelMetadata in registry.py; keep the existing provider=provider_name and
context_window=record.contextWindow, set max_output_tokens to
record.maxOutputTokens or record.contextWindow, set display_name from
record.displayName (or record.name) as appropriate, set provider_name to
provider_name, set creator_name to record.creatorId (or map to null/None when
creatorId is nullable), and set price_tier from record.priceTier (handling
nullable values); ensure nullable DB fields (creatorId, maxOutputTokens,
priceTier) are handled safely (use None/defaults) so ModelMetadata
initialization matches its declared fields.

In `@autogpt_platform/backend/schema.prisma`:
- Around line 1424-1427: The two free-form fields sourceModelSlug and
targetModelSlug should be turned into foreign-key-backed relations to
LlmModel.slug to prevent migrations pointing at non-existent models: add
corresponding relation fields (e.g., sourceModel and targetModel) that reference
LlmModel with fields [sourceModelSlug] and [targetModelSlug] and references
[slug], and set onDelete to Restrict (or another explicit behavior) so deletions
or typos surface as schema errors; update any model creation/seeding code to
populate the slug fields and relation fields consistently.

---

Nitpick comments:
In `@autogpt_platform/backend/backend/data/llm_registry/registry.py`:
- Around line 207-217: The get_default_model_slug function picks a recommended
model by iterating _dynamic_models in insertion order which can yield
nondeterministic results; change the first loop to iterate over
sorted(_dynamic_models.values(), key=lambda m: m.display_name) and check both
model.is_recommended and model.is_enabled (same as the fallback ordering) so
recommended-model selection is deterministic even if multiple rows are marked
recommended.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4188793c-4a18-482b-be85-2dfc2e49f97c

📥 Commits

Reviewing files that changed from the base of the PR and between 5641cdd and 58714c8.

📒 Files selected for processing (6)
  • autogpt_platform/backend/backend/api/rest_api.py
  • autogpt_platform/backend/backend/data/llm_registry/__init__.py
  • autogpt_platform/backend/backend/data/llm_registry/model.py
  • autogpt_platform/backend/backend/data/llm_registry/registry.py
  • autogpt_platform/backend/migrations/20260310_add_llm_registry_schema/migration.sql
  • autogpt_platform/backend/schema.prisma
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Seer Code Review
  • GitHub Check: types
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.13)
  • GitHub Check: Analyze (python)
  • GitHub Check: Check PR Status
🧰 Additional context used
📓 Path-based instructions (8)
autogpt_platform/backend/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development

Files:

  • autogpt_platform/backend/backend/data/llm_registry/model.py
  • autogpt_platform/backend/backend/api/rest_api.py
  • autogpt_platform/backend/backend/data/llm_registry/__init__.py
  • autogpt_platform/backend/backend/data/llm_registry/registry.py
autogpt_platform/backend/backend/data/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

All data access in backend requires user ID checks; verify this for any 'data/*.py' changes

Files:

  • autogpt_platform/backend/backend/data/llm_registry/model.py
  • autogpt_platform/backend/backend/data/llm_registry/__init__.py
  • autogpt_platform/backend/backend/data/llm_registry/registry.py
autogpt_platform/backend/**/*.{py,txt}

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

Use poetry run prefix for all Python commands, including testing, linting, formatting, and migrations

Files:

  • autogpt_platform/backend/backend/data/llm_registry/model.py
  • autogpt_platform/backend/backend/api/rest_api.py
  • autogpt_platform/backend/backend/data/llm_registry/__init__.py
  • autogpt_platform/backend/backend/data/llm_registry/registry.py
autogpt_platform/backend/backend/**/*.py

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

Use Prisma ORM for database operations in PostgreSQL with pgvector for embeddings

Files:

  • autogpt_platform/backend/backend/data/llm_registry/model.py
  • autogpt_platform/backend/backend/api/rest_api.py
  • autogpt_platform/backend/backend/data/llm_registry/__init__.py
  • autogpt_platform/backend/backend/data/llm_registry/registry.py
autogpt_platform/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Format Python code with poetry run format

Files:

  • autogpt_platform/backend/backend/data/llm_registry/model.py
  • autogpt_platform/backend/backend/api/rest_api.py
  • autogpt_platform/backend/backend/data/llm_registry/__init__.py
  • autogpt_platform/backend/backend/data/llm_registry/registry.py
autogpt_platform/backend/backend/api/**/*.py

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

autogpt_platform/backend/backend/api/**/*.py: Use FastAPI for building REST and WebSocket endpoints
Use JWT-based authentication with Supabase integration

Files:

  • autogpt_platform/backend/backend/api/rest_api.py
autogpt_platform/backend/schema.prisma

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Run database migrations with 'poetry run prisma migrate dev' and 'poetry run prisma generate' after schema changes in backend

Files:

  • autogpt_platform/backend/schema.prisma
autogpt_platform/backend/**/schema.prisma

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

Define key database models (User, AgentGraph, AgentGraphExecution, AgentNode, StoreListing) in schema.prisma

Files:

  • autogpt_platform/backend/schema.prisma
🧠 Learnings (6)
📓 Common learnings
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.
📚 Learning: 2026-02-26T17:02:22.448Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.

Applied to files:

  • autogpt_platform/backend/backend/data/llm_registry/model.py
  • autogpt_platform/backend/backend/api/rest_api.py
  • autogpt_platform/backend/migrations/20260310_add_llm_registry_schema/migration.sql
  • autogpt_platform/backend/backend/data/llm_registry/__init__.py
  • autogpt_platform/backend/schema.prisma
  • autogpt_platform/backend/backend/data/llm_registry/registry.py
📚 Learning: 2026-03-05T15:42:08.207Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.

Applied to files:

  • autogpt_platform/backend/backend/data/llm_registry/model.py
  • autogpt_platform/backend/backend/api/rest_api.py
  • autogpt_platform/backend/backend/data/llm_registry/__init__.py
  • autogpt_platform/backend/backend/data/llm_registry/registry.py
📚 Learning: 2026-02-04T16:50:20.508Z
Learnt from: CR
Repo: Significant-Gravitas/AutoGPT PR: 0
File: autogpt_platform/backend/CLAUDE.md:0-0
Timestamp: 2026-02-04T16:50:20.508Z
Learning: Applies to autogpt_platform/backend/**/schema.prisma : Define key database models (User, AgentGraph, AgentGraphExecution, AgentNode, StoreListing) in `schema.prisma`

Applied to files:

  • autogpt_platform/backend/migrations/20260310_add_llm_registry_schema/migration.sql
  • autogpt_platform/backend/schema.prisma
📚 Learning: 2026-03-04T23:58:18.476Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.

Applied to files:

  • autogpt_platform/backend/schema.prisma
📚 Learning: 2026-03-05T00:13:52.412Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/backend/schema.prisma:284-284
Timestamp: 2026-03-05T00:13:52.412Z
Learning: In `autogpt_platform/backend/schema.prisma`, the `AgentGraph` ↔ `StoreListing` relation uses the pattern: `AgentGraph` declares `StoreListing? relation(fields: [id], references: [agentGraphId], onDelete: NoAction)` and `StoreListing` declares `AgentGraph AgentGraph[]` with `agentGraphId String unique`. This is intentional and valid because `AgentGraph` has a composite PK `@id([id, version])` (multiple rows per graph id, one per version), while `StoreListing.agentGraphId` is `unique` (one listing per graph id). The `fields: [id], references: [agentGraphId]` on the `AgentGraph` side joins `AgentGraph.id` against `StoreListing.agentGraphId`. Do NOT flag this as a cardinality mismatch or malformed relation — `prisma validate` passes cleanly.

Applied to files:

  • autogpt_platform/backend/schema.prisma

Comment thread autogpt_platform/backend/backend/api/rest_api.py Outdated
Comment thread autogpt_platform/backend/backend/data/llm_registry/registry.py Outdated
Comment on lines +133 to +138
metadata = ModelMetadata(
    provider=provider_name,
    context_window=record.contextWindow,
    max_output_tokens=record.maxOutputTokens or record.contextWindow,
    supports_vision=capabilities.get("supportsVision", False),
)

Contributor

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

python - <<'PY'
import ast
from pathlib import Path

def namedtuple_fields(path: str, class_name: str) -> list[str]:
    tree = ast.parse(Path(path).read_text())
    for node in tree.body:
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return [
                stmt.target.id
                for stmt in node.body
                if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name)
            ]
    raise SystemExit(f"{class_name} not found in {path}")

def call_keywords(path: str, func_name: str) -> list[list[str]]:
    tree = ast.parse(Path(path).read_text())
    matches: list[list[str]] = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == func_name
        ):
            matches.append([kw.arg for kw in node.keywords if kw.arg is not None])
    return matches

fields = namedtuple_fields(
    "autogpt_platform/backend/backend/data/llm_registry/model.py",
    "ModelMetadata",
)
calls = call_keywords(
    "autogpt_platform/backend/backend/data/llm_registry/registry.py",
    "ModelMetadata",
)

if not calls:
    raise SystemExit("No ModelMetadata() call found in registry.py")

print("Declared fields:", fields)
for index, keywords in enumerate(calls, start=1):
    extra = sorted(set(keywords) - set(fields))
    missing = sorted(set(fields) - set(keywords))
    print(f"Call {index} keywords:", keywords)
    print("  extra:", extra)
    print("  missing:", missing)
    if extra or missing:
        raise SystemExit(1)
PY

Repository: Significant-Gravitas/AutoGPT

Length of output: 395


Fix the ModelMetadata constructor call to match the declared fields.

The call on lines 133-138 uses ['provider', 'context_window', 'max_output_tokens', 'supports_vision'] but the declared fields are ['provider', 'context_window', 'max_output_tokens', 'display_name', 'provider_name', 'creator_name', 'price_tier']. This causes a TypeError on the first record. Add the missing required fields (display_name, provider_name, creator_name, price_tier) and remove the unsupported supports_vision keyword. Handle nullable fields from the database schema appropriately (e.g., creatorId and maxOutputTokens may be null).
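
The failure mode can be reproduced in isolation with a small sketch. The field names come from the declared list quoted above; the field types, the sample values, and the `try_build()` helper are assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class ModelMetadata(NamedTuple):
    # Field names follow the declared list above; the types are assumed.
    provider: str
    context_window: int
    max_output_tokens: int
    display_name: str
    provider_name: str
    creator_name: Optional[str]
    price_tier: Optional[int]


def try_build(**kwargs) -> str:
    """Hypothetical helper: report whether the constructor accepts kwargs."""
    try:
        ModelMetadata(**kwargs)
    except TypeError as exc:
        return f"TypeError: {exc}"
    return "ok"


# Keyword set used by the flagged call (sample values are made up).
old_call = dict(
    provider="example-provider",
    context_window=128_000,
    max_output_tokens=4_096,
    supports_vision=False,
)

# Keyword set matching the declared fields.
new_call = dict(
    provider="example-provider",
    context_window=128_000,
    max_output_tokens=4_096,
    display_name="Example Model",
    provider_name="Example Provider",
    creator_name=None,
    price_tier=None,
)
```

Calling `try_build(**old_call)` returns a string starting with `TypeError`, while `try_build(**new_call)` returns `"ok"`. A NamedTuple without defaults requires every declared field, so nullable database columns still have to be passed explicitly, for example as `None`.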

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@autogpt_platform/backend/backend/data/llm_registry/registry.py` around lines
133 - 138, The ModelMetadata constructor call is using the wrong keyword list:
remove supports_vision and supply the required fields display_name,
provider_name, creator_name, and price_tier when creating ModelMetadata in
registry.py; keep the existing provider=provider_name and
context_window=record.contextWindow, set max_output_tokens to
record.maxOutputTokens or record.contextWindow, set display_name from
record.displayName (or record.name) as appropriate, set provider_name to
provider_name, set creator_name to record.creatorId (or map to null/None when
creatorId is nullable), and set price_tier from record.priceTier (handling
nullable values); ensure nullable DB fields (creatorId, maxOutputTokens,
priceTier) are handled safely (use None/defaults) so ModelMetadata
initialization matches its declared fields.

Comment on lines +1424 to +1427
sourceModelSlug String // The original model that was disabled
targetModelSlug String // The model workflows were migrated to
reason String? // Why the migration happened (e.g., "Provider outage")

Contributor

⚠️ Potential issue | 🟠 Major

Add foreign-key protection for model migrations.

sourceModelSlug and targetModelSlug are the identifiers the migration system will act on, but they're just free-form strings here. A typo in seed/admin data can create an active migration that points at a non-existent model and the failure won't surface until runtime. These should be modeled as relations, or at least foreign keys, to LlmModel.slug.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@autogpt_platform/backend/schema.prisma` around lines 1424 - 1427, The two
free-form fields sourceModelSlug and targetModelSlug should be turned into
foreign-key-backed relations to LlmModel.slug to prevent migrations pointing at
non-existent models: add corresponding relation fields (e.g., sourceModel and
targetModel) that reference LlmModel with fields [sourceModelSlug] and
[targetModelSlug] and references [slug], and set onDelete to Restrict (or
another explicit behavior) so deletions or typos surface as schema errors;
update any model creation/seeding code to populate the slug fields and relation
fields consistently.

@coderabbitai coderabbitai Bot left a comment

Contributor

♻️ Duplicate comments (1)
autogpt_platform/backend/backend/data/llm_registry/registry.py (1)

143-151: ⚠️ Potential issue | 🟠 Major

Fallback max_output_tokens when the DB value is null.

Line 146 currently passes the nullable DB value straight through. For providers/models that legitimately persist maxOutputTokens = NULL, this leaves ModelMetadata.max_output_tokens set to None instead of using the effective limit the rest of the codebase expects.

🔧 Proposed fix
                 metadata = ModelMetadata(
                     provider=provider_name,
                     context_window=record.contextWindow,
-                    max_output_tokens=record.maxOutputTokens,
+                    max_output_tokens=(
+                        record.maxOutputTokens
+                        if record.maxOutputTokens is not None
+                        else record.contextWindow
+                    ),
                     display_name=record.displayName,
                     provider_name=provider_display,
                     creator_name=creator_name,
                     price_tier=price_tier,
                 )

Based on learnings: For xAI Grok models accessed via OpenRouter, the API returns null for max_completion_tokens. The convention in this codebase is to use the model's context window size as the max_output_tokens value in ModelMetadata.
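
The fallback convention can be isolated in a small helper, sketched below. The name `effective_max_output_tokens` is hypothetical and does not appear in the PR; it only illustrates the explicit `is not None` check from the proposed fix.

```python
from typing import Optional


def effective_max_output_tokens(
    max_output_tokens: Optional[int], context_window: int
) -> int:
    """Fall back to the context window only when the DB value is NULL."""
    if max_output_tokens is not None:
        return max_output_tokens
    return context_window
```

The explicit `is not None` check differs from `max_output_tokens or context_window` in one case: a stored value of `0` is kept as-is rather than being treated as missing.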

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@autogpt_platform/backend/backend/data/llm_registry/registry.py` around lines
143 - 151, When constructing ModelMetadata in registry.py, ensure
max_output_tokens falls back to the model context window when the DB value is
null: if record.maxOutputTokens is None use record.contextWindow as the value
for ModelMetadata.max_output_tokens (instead of passing record.maxOutputTokens
directly). Update the ModelMetadata(...) call in the block that builds metadata
(the code that references provider_name, record.contextWindow,
record.maxOutputTokens, displayName) so max_output_tokens =
record.maxOutputTokens if not None else record.contextWindow.
🧹 Nitpick comments (1)
autogpt_platform/backend/backend/data/llm_registry/registry.py (1)

52-66: Don't expose the live cache objects from the public getters.

Lines 60-61 and 37 are still mutable dicts inside a frozen wrapper, and Lines 215-217 hand out the shared schema-option list/dicts directly. A caller-side update here can silently corrupt the registry for the whole worker until the next refresh.

At minimum, return copies from get_schema_options(). Ideally, also freeze capabilities, extra_metadata, and RegistryModelCost.metadata when the cache is built.

Also applies to: 200-217

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@autogpt_platform/backend/backend/data/llm_registry/registry.py` around lines
52 - 66, The Registry is exposing live mutable structures; fix by freezing
mutable dicts when building the cache and returning copies from public getters:
when constructing RegistryModel instances in the registry build process, convert
capabilities and extra_metadata into immutable structures (e.g., mapping proxies
or other frozen representations) and freeze RegistryModelCost.metadata as well
so the cached objects aren’t mutable; update get_schema_options() (and any
public getters around lines ~200-217) to return a shallow/deep copy of the
shared schema/options list and dicts instead of returning the registry’s
internal objects directly to prevent caller-side mutation from corrupting the
shared cache.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 271dafbb-92aa-4b35-b73a-d1c751921029

📥 Commits

Reviewing files that changed from the base of the PR and between 58714c8 and 591e78b.

📒 Files selected for processing (2)
  • autogpt_platform/backend/backend/api/rest_api.py
  • autogpt_platform/backend/backend/data/llm_registry/registry.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • autogpt_platform/backend/backend/api/rest_api.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: types
  • GitHub Check: Seer Code Review
  • GitHub Check: test (3.13)
  • GitHub Check: test (3.11)
  • GitHub Check: test (3.12)
  • GitHub Check: Check PR Status
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (5)
autogpt_platform/backend/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

autogpt_platform/backend/**/*.py: Use Python 3.11 (required; managed by Poetry via pyproject.toml) for backend development
Always run 'poetry run format' (Black + isort) before linting in backend development
Always run 'poetry run lint' (ruff) after formatting in backend development

Files:

  • autogpt_platform/backend/backend/data/llm_registry/registry.py
autogpt_platform/backend/backend/data/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

All data access in backend requires user ID checks; verify this for any 'data/*.py' changes

Files:

  • autogpt_platform/backend/backend/data/llm_registry/registry.py
autogpt_platform/backend/**/*.{py,txt}

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

Use poetry run prefix for all Python commands, including testing, linting, formatting, and migrations

Files:

  • autogpt_platform/backend/backend/data/llm_registry/registry.py
autogpt_platform/backend/backend/**/*.py

📄 CodeRabbit inference engine (autogpt_platform/backend/CLAUDE.md)

Use Prisma ORM for database operations in PostgreSQL with pgvector for embeddings

Files:

  • autogpt_platform/backend/backend/data/llm_registry/registry.py
autogpt_platform/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Format Python code with poetry run format

Files:

  • autogpt_platform/backend/backend/data/llm_registry/registry.py
🧠 Learnings (7)
📓 Common learnings
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12284
File: autogpt_platform/frontend/src/app/api/openapi.json:11897-11900
Timestamp: 2026-03-04T23:58:18.476Z
Learning: Repo: Significant-Gravitas/AutoGPT — PR `#12284`
Backend/frontend OpenAPI codegen convention: In backend/api/features/store/model.py, the StoreSubmission and StoreSubmissionAdminView models define submitted_at: datetime | None, changes_summary: str | None, and instructions: str | None with no default. This is intentional to produce “required but nullable” fields in OpenAPI (properties appear in required[] and use anyOf [type, null]). This matches Prisma’s submittedAt DateTime? and changesSummary String?. Do not flag this as a required/nullable mismatch.
📚 Learning: 2026-03-09T10:50:43.907Z
Learnt from: Bentlybro
Repo: Significant-Gravitas/AutoGPT PR: 0
File: :0-0
Timestamp: 2026-03-09T10:50:43.907Z
Learning: Repo: Significant-Gravitas/AutoGPT — File: autogpt_platform/backend/backend/blocks/llm.py
For xAI Grok models accessed via OpenRouter, the API returns `null` for `max_completion_tokens`. The convention in this codebase is to use the model's context window size as the `max_output_tokens` value in ModelMetadata. For example, Grok 3 uses 131072 (128k) and Grok 4 uses 262144 (256k). Do not flag these as incorrect max output token values.

Applied to files:

  • autogpt_platform/backend/backend/data/llm_registry/registry.py
📚 Learning: 2026-02-27T15:59:00.370Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — OpenAPI/codegen
Learning: Ensuring a field is required in generated TS types needs two sides: (1) no default value on the Pydantic field, and (2) the OpenAPI model's "required" array must list it. For MCPToolInfo, making input_schema required in OpenAPI and removing Field(default_factory=dict) in the backend prevents optional typing drift.

Applied to files:

  • autogpt_platform/backend/backend/data/llm_registry/registry.py
📚 Learning: 2026-02-27T15:59:00.370Z
Learnt from: majdyz
Repo: Significant-Gravitas/AutoGPT PR: 12213
File: autogpt_platform/frontend/src/app/api/openapi.json:9983-9995
Timestamp: 2026-02-27T15:59:00.370Z
Learning: Repo: Significant-Gravitas/AutoGPT PR: 12213 — Backend/frontend OpenAPI codegen
Learning: For MCP schema models, required OpenAPI fields must have no defaults in Pydantic. Specifically, MCPToolInfo.input_schema must be required (no Field(default_factory=dict)) so openapi.json emits it in "required", ensuring generated TS types treat input_schema as non-optional.

Applied to files:

  • autogpt_platform/backend/backend/data/llm_registry/registry.py
📚 Learning: 2026-02-26T17:02:22.448Z
Learnt from: Pwuts
Repo: Significant-Gravitas/AutoGPT PR: 12211
File: .pre-commit-config.yaml:160-179
Timestamp: 2026-02-26T17:02:22.448Z
Learning: Keep the pre-commit hook pattern broad for autogpt_platform/backend to ensure OpenAPI schema changes are captured. Do not narrow to backend/api/ alone, since the generated schema depends on Pydantic models across multiple directories (backend/data/, backend/blocks/, backend/copilot/, backend/integrations/, backend/util/). Narrowing could miss schema changes and cause frontend type desynchronization.

Applied to files:

  • autogpt_platform/backend/backend/data/llm_registry/registry.py
📚 Learning: 2026-03-05T15:42:08.207Z
Learnt from: ntindle
Repo: Significant-Gravitas/AutoGPT PR: 12297
File: .claude/skills/backend-check/SKILL.md:14-16
Timestamp: 2026-03-05T15:42:08.207Z
Learning: In Python files under autogpt_platform/backend (recursively), rely on poetry run format to perform formatting (Black + isort) and linting (ruff). Do not run poetry run lint as a separate step after poetry run format, since format already includes linting checks.

Applied to files:

  • autogpt_platform/backend/backend/data/llm_registry/registry.py
🔇 Additional comments (2)
autogpt_platform/backend/backend/data/llm_registry/registry.py (2)

220-230: Deterministic default selection looks good.

Sorting before both the recommended and fallback paths removes the last dict-order dependency here.
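
For illustration, a deterministic selection of this kind might look like the sketch below. The `Model` class and its `is_enabled` / `is_recommended` field names are assumptions made for the example; only the idea (sort once by display name, then pick with `next()`) comes from the PR description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:  # hypothetical, reduced to the fields the selection needs
    slug: str
    display_name: str
    is_enabled: bool = True
    is_recommended: bool = False


def default_model_slug(models: dict[str, Model]) -> Optional[str]:
    # Sort once so neither path depends on dict insertion order.
    enabled = sorted(
        (m for m in models.values() if m.is_enabled),
        key=lambda m: m.display_name,
    )
    # Prefer the first recommended model, else the first enabled one.
    chosen = next((m for m in enabled if m.is_recommended), None)
    if chosen is None:
        chosen = next(iter(enabled), None)
    return chosen.slug if chosen else None


a = Model("b-model", "Beta", is_recommended=True)
b = Model("a-model", "Alpha", is_recommended=True)
# Same models, opposite insertion order, same answer.
print(default_model_slug({a.slug: a, b.slug: b}))
print(default_model_slug({b.slug: b, a.slug: a}))
```

Ties on `display_name` would still fall back to insertion order under Python's stable sort; adding the slug as a secondary key would remove that remaining dependency.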


235-240: Remove unused function or update misleading docstring.

get_all_model_slugs_for_validation() is never called anywhere in the codebase (export-only dead code). Its docstring claims it "enables migrate_llm_models to work," but migrate_llm_models directly uses the LlmModel enum values instead. Either remove this unused function or update its docstring to clarify its actual purpose.

Comment thread autogpt_platform/backend/backend/api/rest_api.py Outdated
Comment thread autogpt_platform/backend/backend/data/llm_registry/registry.py Outdated

creditCost Int // DB constraint: >= 0

credentialProvider String

What is this? Is it a UUID? Where is it being set?

Comment thread autogpt_platform/backend/schema.prisma Outdated
Comment thread autogpt_platform/backend/schema.prisma Outdated
// UNIQUE (sourceModelSlug) WHERE isReverted = false
@@index([targetModelSlug])
@@index([isReverted])
@@index([sourceModelSlug, isReverted]) // Composite index for active migration queries

What is the query pattern here: search by sourceModelSlug, by targetModelSlug, or both? I don't see it being used either.

@Bentlybro Bentlybro marked this pull request as draft March 11, 2026 11:40

@majdyz majdyz left a comment

🤖 Index coverage note: The query in refresh_llm_registry is a full table scan (find_many with no where filter). This is intentional for a startup cache warm — it loads everything. The composite index (providerId, isEnabled) on LlmModel will not help a full-table read; that index is useful for future filtered queries. No issues with index coverage for the current access patterns.

Comment thread autogpt_platform/backend/backend/api/rest_api.py Outdated
Comment thread autogpt_platform/backend/backend/data/llm_registry/model.py Outdated
Comment thread autogpt_platform/backend/backend/data/llm_registry/registry.py Outdated
Comment thread autogpt_platform/backend/backend/data/llm_registry/registry.py Outdated
Comment thread autogpt_platform/backend/backend/data/llm_registry/registry.py

# Price tier defaults to 1 if not set
price_tier = record.priceTier if record.priceTier in (1, 2, 3) else 1

🤖 [Medium] record.Provider guard can silently mask data integrity failures. providerId is a NOT NULL FK with onDelete: Restrict, and the Prisma query includes Provider in the include. If record.Provider is None here, it means either the FK is dangling (data corruption) or the include failed silently. Falling back to record.providerId hides the problem and produces a model with provider_name = provider_display = '<uuid>' (a raw DB ID shown in the UI). Log a warning at minimum, or raise so the corruption surfaces:

if not record.Provider:
    logger.error("LlmModel %s has no Provider despite NOT NULL FK — skipping", record.slug)
    continue


# In-memory cache (will be replaced with Redis in PR #6)
_dynamic_models: dict[str, RegistryModel] = {}
_schema_options: list[dict[str, str]] = []

🤖 [Medium] Module-level asyncio.Lock() is not safe under multi-worker deployments. Each gunicorn/uvicorn worker process has its own event loop and its own copy of _lock. Concurrent refresh_llm_registry() calls from two workers cannot coordinate — both will run the full DB query simultaneously and perform the atomic swap independently. This is acknowledged as a known limitation ("Redis cache in PR #6"), but worth flagging: until PR #6 lands, a thundering-herd scenario on startup can fire N×workers DB queries.
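
To make the scope of the lock concrete, here is a small single-process sketch. The `Registry` class and its `_fetch_from_db` method are hypothetical stand-ins (the real module uses module-level globals and a Prisma query); the point is only what an `asyncio.Lock` does and does not guarantee.

```python
import asyncio


class Registry:
    """Hypothetical per-process cache guarded by an asyncio.Lock."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self.models: dict[str, str] = {}
        self.db_calls = 0

    async def _fetch_from_db(self) -> dict[str, str]:
        # Stand-in for the real query; counts how often the "DB" is hit.
        self.db_calls += 1
        await asyncio.sleep(0.01)
        return {"example-model": "Example model"}

    async def refresh(self) -> None:
        async with self._lock:  # serializes refreshes inside this process only
            new_models = await self._fetch_from_db()
            # Rebind in one step so readers see the old or the new dict,
            # never a half-built one.
            self.models = new_models


async def main() -> Registry:
    registry = Registry()
    await asyncio.gather(registry.refresh(), registry.refresh(), registry.refresh())
    return registry


registry = asyncio.run(main())
print(registry.db_calls, registry.models)
```

The lock serializes the three refreshes but does not deduplicate them, so each one still queries the stand-in database. Separate worker processes would each hold their own lock and cache, which is why cross-process coordination needs something external such as the Redis cache planned for a later PR.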

@Bentlybro Bentlybro force-pushed the feat/llm-registry-core branch from 32da05a to 4f286f5 Compare March 16, 2026 14:55
Bentlybro added a commit that referenced this pull request Mar 17, 2026
Add admin write API endpoints for LLM registry management:
- POST /api/llm/models - Create model
- PATCH /api/llm/models/{slug} - Update model
- DELETE /api/llm/models/{slug} - Delete model
- POST /api/llm/providers - Create provider
- PATCH /api/llm/providers/{name} - Update provider
- DELETE /api/llm/providers/{name} - Delete provider

All endpoints require admin authentication via requires_admin_user.

Request/response models defined in admin_model.py:
- CreateLlmModelRequest, UpdateLlmModelRequest
- CreateLlmProviderRequest, UpdateLlmProviderRequest

Implementation coming in follow-up commits (currently returns 501 Not Implemented).

This builds on:
- PR #12357: Schema foundation
- PR #12359: Registry core
- PR #12371: Public read API
Bentlybro added a commit that referenced this pull request Mar 17, 2026
Implement full CRUD operations for admin API:

Database layer (db_write.py):
- create_provider, update_provider, delete_provider
- create_model, update_model, delete_model
- refresh_runtime_caches - invalidates in-memory registry after mutations
- Proper validation and error handling

Admin routes (admin_routes.py):
- All endpoints now functional (no more 501)
- Proper error responses (400 for validation, 404 for not found, 500 for server errors)
- Lookup by slug/name before operations
- Cache refresh after all mutations

Features:
- Provider deletion blocked if models exist (FK constraint)
- All mutations refresh registry cache automatically
- Proper logging for audit trail
- Admin auth enforced on all endpoints

Based on original implementation from PR #11699 (upstream-llm branch).

Builds on:
- PR #12357: Schema foundation
- PR #12359: Registry core
- PR #12371: Public read API
@Bentlybro Bentlybro force-pushed the feat/llm-registry-core branch from 9b93a95 to fc95715 Compare March 19, 2026 11:02
Bentlybro added a commit that referenced this pull request Mar 19, 2026
Bentlybro added a commit that referenced this pull request Mar 19, 2026
Bentlybro added a commit that referenced this pull request Apr 3, 2026
Bentlybro added a commit that referenced this pull request Apr 3, 2026
@github-actions github-actions Bot added the size/l label Apr 4, 2026
Bentlybro added a commit that referenced this pull request Apr 5, 2026
… unit tests

- Replace frozen dataclasses with Pydantic BaseModel(frozen=True) for true immutability
- Add typed boolean fields for model capabilities (supports_tools, etc.)
- Add comprehensive unit tests for registry module
- Addresses Majdyz review feedback on PR #12359
Bentlybro added a commit that referenced this pull request Apr 7, 2026
Bentlybro added a commit that referenced this pull request Apr 7, 2026
Bentlybro added a commit that referenced this pull request Apr 8, 2026
Bentlybro added a commit that referenced this pull request Apr 8, 2026
Implement full CRUD operations for admin API:

Database layer (db_write.py):
- create_provider, update_provider, delete_provider
- create_model, update_model, delete_model
- refresh_runtime_caches - invalidates in-memory registry after mutations
- Proper validation and error handling

Admin routes (admin_routes.py):
- All endpoints now functional (no more 501)
- Proper error responses (400 for validation, 404 for not found, 500 for server errors)
- Lookup by slug/name before operations
- Cache refresh after all mutations

Features:
- Provider deletion blocked if models exist (FK constraint)
- All mutations refresh registry cache automatically
- Proper logging for audit trail
- Admin auth enforced on all endpoints

Based on original implementation from PR #11699 (upstream-llm branch).

Builds on:
- PR #12357: Schema foundation
- PR #12359: Registry core
- PR #12371: Public read API
Implements the registry core for dynamic LLM model management:

**DB Layer:**
- Fetch models with provider, costs, and creator relations
- Prisma query with includes for related data
- Convert DB records to typed dataclasses

**In-memory Cache:**
- Global dict for fast model lookups
- Atomic cache refresh with lock protection
- Schema options generation for UI dropdowns

**Public API:**
- get_model(slug) - lookup by slug
- get_all_models() - all models (including disabled)
- get_enabled_models() - enabled models only
- get_schema_options() - UI dropdown data
- get_default_model_slug() - recommended or first enabled
- refresh_llm_registry() - manual refresh trigger

**Integration:**
- Refresh at API startup (before block init)
- Graceful fallback if registry unavailable
- Enables blocks to consume registry data

**Models:**
- RegistryModel - full model with metadata
- RegistryModelCost - pricing configuration
- RegistryModelCreator - model creator info
- ModelMetadata - context window, capabilities

**Next PRs:**
- PR #3: Public read API (GET endpoints)
- PR #4: Admin write API (POST/PATCH/DELETE)
- PR #5: Block integration (update LLM block)
- PR #6: Redis cache (solve thundering herd)

Lines: ~230 (registry.py ~210, __init__.py ~30, model.py from draft)
Files: 4 (3 new, 1 modified)
**CRITICAL FIX - ModelMetadata instantiation:**
- Removed non-existent 'supports_vision' argument
- Added required fields: display_name, provider_name, creator_name, price_tier
- Handle nullable DB fields (Creator, priceTier, maxOutputTokens) safely
- Fallback: creator_name='Unknown' if no Creator, price_tier=1 if invalid

**MAJOR FIX - Preserve pricing unit:**
- Added 'unit' field to RegistryModelCost dataclass
- Prevents RUN vs TOKENS ambiguity in cached costs
- Convert Prisma enum to string when building cost objects

**MAJOR FIX - Deterministic default model:**
- Sort recommended models by display_name before selection
- Prevents non-deterministic results when multiple models are recommended
- Ensures consistent default across refreshes

**STARTUP IMPROVEMENT:**
- Added comment: graceful fallback OK for now (no blocks use registry yet)
- Will be stricter in PR #5 when block integration lands
- Added success log message for registry refresh

Fixes identified by Sentry (critical TypeError) and CodeRabbit review.
- Fix ModelMetadata duplicate type collision by importing from blocks.llm
- Remove _json_to_dict helper, use dict() inline
- Add warning when Provider relation is missing (data corruption indicator)
- Optimize get_default_model_slug with next() (single sort pass)
- Optimize _build_schema_options to use list comprehension
- Move llm_registry import to top-level in rest_api.py
- Ensure max_output_tokens falls back to context_window when null

All critical and quick-win issues addressed.
Tests fail with 'relation "platform.AgentNode" does not exist' because
migrate_llm_models() runs during startup and queries a table that doesn't
exist in fresh test databases.

This is an existing bug in the codebase - the function has no error handling.

Wrap the call in try/except to gracefully handle test environments where
the AgentNode table hasn't been created yet.
… unit tests

- Replace frozen dataclasses with Pydantic BaseModel(frozen=True) for true immutability
- Add typed boolean fields for model capabilities (supports_tools, etc.)
- Add comprehensive unit tests for registry module
- Addresses Majdyz review feedback on PR #12359
…pub/sub sync

- Wrap DB fetch with @cached(shared_cache=True) so results are stored in
  Redis automatically — other workers skip the DB on warm cache
- Add notifications.py with publish/subscribe helpers using llm_registry:refresh
  pub/sub channel for cross-process invalidation
- clear_registry_cache() invalidates the shared Redis entry before a forced
  DB refresh (called by admin mutations)
- rest_api.py: start a background subscription task so every worker reloads
  its in-process cache when another worker refreshes the registry
registry_test.py (+8 tests):
- clear_registry_cache, get_model (found/not found), get_all_models,
  get_enabled_models, get_all_model_slugs_for_validation
- refresh_llm_registry error re-raise

notifications_test.py (new, 9 tests):
- publish: happy path and Redis error swallowed
- subscribe: valid message triggers on_refresh, non-message types ignored,
  wrong channel ignored, None (timeout) handled, multiple messages,
  CancelledError stops loop, connection error triggers reconnect
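
The message filtering that these subscribe tests exercise could be sketched as a pure function, which keeps it testable without a Redis connection. This assumes messages are dicts shaped like those returned by redis-py's `PubSub.get_message` (keys `type`, `channel`, `data`); `handle_message` and the synchronous `on_refresh` callback are hypothetical names for this example, and only the channel name comes from the commit message above.

```python
from typing import Callable, Optional

CHANNEL = "llm_registry:refresh"


def handle_message(message: Optional[dict], on_refresh: Callable[[], None]) -> bool:
    """Invoke `on_refresh` for a real message on our channel; report if it ran."""
    if message is None:  # poll timed out, nothing to do
        return False
    if message.get("type") != "message":  # e.g. subscribe confirmations
        return False
    channel = message.get("channel")
    if isinstance(channel, bytes):  # channel may arrive as bytes or str
        channel = channel.decode()
    if channel != CHANNEL:
        return False
    on_refresh()
    return True


refreshes: list[int] = []
handle_message(None, lambda: refreshes.append(1))
handle_message({"type": "subscribe", "channel": CHANNEL}, lambda: refreshes.append(1))
handle_message({"type": "message", "channel": b"llm_registry:refresh", "data": b"1"},
               lambda: refreshes.append(1))
print(len(refreshes))
```

In a real subscriber this function would sit inside the polling loop, with cancellation and reconnect handling around it as the test names suggest.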
@github-actions github-actions Bot added the conflicts Automatically applied to PRs with merge conflicts label Apr 13, 2026
@github-actions

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

@Bentlybro Bentlybro force-pushed the feat/llm-registry-core branch from 3919423 to ef30c1e Compare April 13, 2026 14:47
@github-actions github-actions Bot removed the conflicts Automatically applied to PRs with merge conflicts label Apr 13, 2026
@github-actions

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

@github-actions github-actions Bot removed the size/l label Apr 13, 2026
Bentlybro added a commit that referenced this pull request Apr 13, 2026
Bentlybro added a commit that referenced this pull request Apr 13, 2026
@github-actions github-actions Bot added the conflicts Automatically applied to PRs with merge conflicts label Apr 13, 2026
@github-actions

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

@CLAassistant

CLAassistant commented May 11, 2026

CLA assistant check
All committers have signed the CLA.

Labels

conflicts Automatically applied to PRs with merge conflicts platform/backend AutoGPT Platform - Back end size/xl

Projects

Status: 🚧 Needs work

4 participants