feat(ai-gemini): Gemini Omni Flash video generation via the Interactions API #886

Draft
tombeckenham wants to merge 4 commits into main from 871-support-gemini-omni-flash-gemini-omni-flash-preview-via-the-interactions-api

@tombeckenham tombeckenham commented Jul 2, 2026

Contributor

Closes #871

Summary

Adds Gemini Omni Flash (gemini-omni-flash-preview) — Google's multimodal video-generation model with conversational editing — to @tanstack/ai-gemini. Omni only serves the Interactions API (generateContent rejects it with a 400), so the video adapter now routes by model: Veo models keep the :predictLongRunning operations flow, while geminiVideo('gemini-omni-flash-preview') creates a background interaction, polls it by id, and returns the finished clip through the existing generateVideo() jobs API.
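
The routing described above can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: `INTERACTIONS_VIDEO_MODELS`, `VideoFlow`, and `pickVideoFlow` are hypothetical names, and only the model id and the two flows come from the description.

```typescript
// Hypothetical sketch: names below are illustrative, not taken from the PR's source.
const INTERACTIONS_VIDEO_MODELS = ['gemini-omni-flash-preview'] as const

type VideoFlow = 'interactions' | 'predictLongRunning'

// Decide which backend flow a video model should use.
// Models listed above go through the Interactions API; everything else
// (for example the Veo family) keeps the long-running operations flow.
function pickVideoFlow(model: string): VideoFlow {
  return (INTERACTIONS_VIDEO_MODELS as readonly string[]).includes(model)
    ? 'interactions'
    : 'predictLongRunning'
}

console.log(pickVideoFlow('gemini-omni-flash-preview'))
```

Keeping the decision in one lookup means a future Interactions-only model would need a single list entry rather than a new branch.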

What changed

  • Adapter (packages/ai-gemini): Interactions-based job path.
      ◦ Request: interactions.create with Step-list input, response_modalities: ['video'], and background: true, followed by interactions.get polling.
      ◦ Output: inline base64 MP4 is surfaced as a data:video/mp4;base64,… URL (Files-API URI delivery passes through). Usage maps from output_tokens_by_modality.
      ◦ Size and duration: size maps onto response_format.aspect_ratio ('16:9' | '9:16') and duration onto response_format.duration. Any value in the 3–10 second range is accepted (fractional seconds included), defaulting to a 10s clip when omitted. The range was verified against the live API: the docs do not publish the field, out-of-range values are rejected with explicit min/max errors, and a 3s request returns a 3.008s MP4 per ffprobe.
      ◦ Inputs: image and video prompt parts are sent as interaction content blocks in order (data sources inline; url sources pass through untouched and are never downloaded).
      ◦ Editing: modelOptions.previous_interaction_id chains conversational video edits.
  • Model meta / options: GEMINI_OMNI_FLASH_PREVIEW ($0.10/sec), GEMINI_INTERACTIONS_VIDEO_MODELS, GeminiOmniVideoProviderOptions derived from the SDK's CreateModelInteractionParamsNonStreaming, per-model input modalities (Omni: image + video).
  • Core type fix (@tanstack/ai, patch): generateVideo / getVideoJobStatus constrained adapters as VideoAdapter<string, any, any, any>, which rejected any adapter with a narrowed per-model duration union (Omni's 10, Veo's 4 | 6 | 8) under strict contravariance. Constraints now span all six generics.
  • Dependency: @google/genai floor ^2.8.0 → ^2.10.0 (Interactions API surface).
  • Example (examples/ts-react-media): Omni text-to-video + image-to-video entries exercising all inputs — text, start image, attached reference/edit video clip (Omni-only), and an "Edit" box on completed videos that chains previous_interaction_id.
  • Docs/skill: docs/media/video-generation.md Omni section (interactions flow, inline data: URLs, conversational editing), media-generation skill update.
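
To make the adapter bullet more concrete, here is a small sketch of the two mappings it describes: building response_format from size and duration, and turning the finished output into a URL. The helper names and the `VideoOutput` shape are assumptions made for illustration; the field names, the `"<seconds>s"` duration string, and the data: URL prefix come from this PR's text.

```typescript
type AspectRatio = '16:9' | '9:16'

interface OmniResponseFormat {
  aspect_ratio?: AspectRatio
  duration?: string
}

// Hypothetical helper: assumes `size` is already expressed as an aspect ratio.
function buildResponseFormat(size?: AspectRatio, durationSeconds?: number): OmniResponseFormat {
  const format: OmniResponseFormat = {}
  if (size !== undefined) format.aspect_ratio = size
  // The API takes a "<seconds>s" string; fractional seconds are allowed.
  if (durationSeconds !== undefined) format.duration = `${durationSeconds}s`
  return format
}

// Hypothetical output shape: either inline base64 bytes or a Files-API URI.
type VideoOutput = { kind: 'inline'; base64: string } | { kind: 'uri'; uri: string }

// Inline data becomes a data: URL; URIs pass through unchanged.
function toVideoUrl(output: VideoOutput): string {
  return output.kind === 'inline' ? `data:video/mp4;base64,${output.base64}` : output.uri
}

console.log(buildResponseFormat('9:16', 4.5))
```

Omitting both arguments yields an empty object, which leaves the API to apply its own defaults (a 10s clip, per the verification notes).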

Testing

  • Live-verified against the real Gemini API: generation (~45s, valid 2.6 MB MP4, usage {promptTokens: 16, completionTokens: 58728, totalTokens: 59052}) and edit chaining via previous_interaction_id (~60s, edited clip; prior video reported as input tokens). The issue's SDK-vs-REST 400 caveat is resolved — the typed interactions.create() works with Step-list input, no raw REST fallback needed.
  • 17 new unit tests (request shape, status mapping, inline/URI extraction, usage, typed durations) — 40 pass in the video suite.
  • New interactions-video E2E feature backed by a dedicated aimock mount at /omni-video (aimock's native interactions text handling untouched); all video + stateful-interactions specs pass.
  • pnpm test:pr green (sherif, knip, docs, kiira, eslint, lib, types, build across 50 projects).

🤖 Generated with Claude Code

tombeckenham and others added 2 commits July 2, 2026 20:35
…nteractions API

Add gemini-omni-flash-preview to the Gemini video adapter. Omni only
serves the Interactions API (generateContent rejects it with 400), so
the adapter now routes by model: Veo models keep the
:predictLongRunning operations flow, while Omni creates a background
interaction with response_modalities: ['video'], polls it by id, and
returns the inline base64 MP4 as a data: URL (Files-API URI delivery
passes through). Usage maps from output_tokens_by_modality, size maps
onto response_format.aspect_ratio, and
modelOptions.previous_interaction_id chains conversational video edits.

- model-meta: GEMINI_OMNI_FLASH_PREVIEW ($0.10/sec video+audio output)
  + GEMINI_INTERACTIONS_VIDEO_MODELS
- provider options: GeminiOmniVideoProviderOptions derived from the
  SDK's CreateModelInteractionParamsNonStreaming; per-model input
  modalities (Omni accepts image+video parts) and fixed 10s duration
- @google/genai floor bumped to ^2.10.0 for the interactions surface
- 17 new unit tests; new interactions-video E2E feature backed by a
  dedicated aimock mount (native interactions text handling untouched)
- docs/media/video-generation.md + media-generation skill updates

Verified live against the Gemini API: background job completed in ~45s
and returned a valid MP4 with video-modality usage; the SDK's typed
interactions.create works with Step-list input, so no raw REST
fallback is needed.

Closes #871

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Add gemini-omni-flash-preview (text-to-video + image-to-video) to the
ts-react-media example, exercising every Omni input: text prompts, a
start image, an attached reference/edit video clip (Omni-only — never
sent to other providers), and conversational editing that chains a new
prompt onto a completed generation via previous_interaction_id.

Also fixes a latent core type bug this surfaced: generateVideo /
getVideoJobStatus constrained adapters as VideoAdapter<string, any,
any, any>, leaving the duration generic at its Record<string, number>
default — any adapter with a narrowed per-model duration union (Omni's
10, Veo's 4|6|8) failed assignability under strict function-type
contravariance. All video-activity constraints now span all six
VideoAdapter generics.
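
The assignability failure can be reproduced with a much smaller type. The sketch below is a simplified stand-in, assuming a two-generic `MiniVideoAdapter` rather than the real six-generic VideoAdapter, with `number` in place of the real duration default; all names in it are hypothetical.

```typescript
// Simplified stand-in for VideoAdapter: two generics instead of six.
interface MiniVideoAdapter<TModel extends string = string, TDuration = number> {
  readonly model: TModel
  // Declared as a property so strictFunctionTypes checks parameters contravariantly.
  readonly createJob: (options: { prompt: string; duration?: TDuration }) => string
}

// Too-narrow constraint: TDuration silently falls back to its default (number).
function startJobBefore<TAdapter extends MiniVideoAdapter<string>>(
  adapter: TAdapter,
  prompt: string,
): string {
  return adapter.createJob({ prompt })
}

// Fixed constraint: every generic is spelled out, so narrowed unions are accepted.
function startJobAfter<TAdapter extends MiniVideoAdapter<string, any>>(
  adapter: TAdapter,
  prompt: string,
): string {
  return adapter.createJob({ prompt })
}

const veoLike: MiniVideoAdapter<'veo-like', 4 | 6 | 8> = {
  model: 'veo-like',
  createJob: ({ prompt, duration = 8 }) => `veo-like:${duration}s:${prompt}`,
}

// startJobBefore(veoLike, 'a sunrise') // rejected under strictFunctionTypes
const jobId = startJobAfter(veoLike, 'a sunrise')
console.log(jobId)
```

The commented-out call fails because a function accepting only `4 | 6 | 8` cannot stand in for one that must accept any `number`; widening the constraint to `any` for that generic removes the mismatch.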

Verified live: Omni edit chaining (previous_interaction_id) against the
real Gemini API returned an edited 10s MP4; example dev server boots
and type-checks.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

coderabbitai Bot commented Jul 2, 2026

Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 359ab1ee-324f-4a8c-98ef-7f63847272ea

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

github-actions Bot commented Jul 2, 2026

Contributor

🚀 Changeset Version Preview

4 package(s) bumped directly, 20 bumped as dependents.

🟨 Minor bumps

| Package | Version | Reason |
| --- | --- | --- |
| @tanstack/ai-anthropic | 0.15.13 → 0.16.0 | Changeset |
| @tanstack/ai-gemini | 0.19.0 → 0.20.0 | Changeset |

🟩 Patch bumps

| Package | Version | Reason |
| --- | --- | --- |
| @tanstack/ai | 0.39.0 → 0.39.1 | Changeset |
| @tanstack/ai-ollama | 0.8.12 → 0.8.13 | Changeset |
| @tanstack/ai-angular | 0.2.1 → 0.2.2 | Dependent |
| @tanstack/ai-bedrock | 0.1.0 → 0.1.1 | Dependent |
| @tanstack/ai-client | 0.19.1 → 0.19.2 | Dependent |
| @tanstack/ai-code-mode | 0.3.4 → 0.3.5 | Dependent |
| @tanstack/ai-code-mode-skills | 0.3.7 → 0.3.8 | Dependent |
| @tanstack/ai-devtools-core | 0.4.20 → 0.4.21 | Dependent |
| @tanstack/ai-fal | 0.9.8 → 0.9.9 | Dependent |
| @tanstack/ai-isolate-cloudflare | 0.2.34 → 0.2.35 | Dependent |
| @tanstack/ai-isolate-node | 0.1.43 → 0.1.44 | Dependent |
| @tanstack/ai-isolate-quickjs | 0.1.43 → 0.1.44 | Dependent |
| @tanstack/ai-mcp | 0.2.1 → 0.2.2 | Dependent |
| @tanstack/ai-preact | 0.10.1 → 0.10.2 | Dependent |
| @tanstack/ai-react | 0.16.2 → 0.16.3 | Dependent |
| @tanstack/ai-solid | 0.14.1 → 0.14.2 | Dependent |
| @tanstack/ai-svelte | 0.14.1 → 0.14.2 | Dependent |
| @tanstack/ai-vue | 0.14.1 → 0.14.2 | Dependent |
| @tanstack/ai-vue-ui | 0.2.29 → 0.2.30 | Dependent |
| @tanstack/preact-ai-devtools | 0.1.63 → 0.1.64 | Dependent |
| @tanstack/react-ai-devtools | 0.2.63 → 0.2.64 | Dependent |
| @tanstack/solid-ai-devtools | 0.2.63 → 0.2.64 | Dependent |

nx-cloud Bot commented Jul 2, 2026

View your CI Pipeline Execution ↗ for commit b427cc4

| Command | Status | Duration | Result |
| --- | --- | --- | --- |
| nx run-many --targets=build --exclude=examples/... | ✅ Succeeded | 10s | View ↗ |

☁️ Nx Cloud last updated this comment at 2026-07-03 02:21:00 UTC

pkg-pr-new Bot commented Jul 2, 2026

@tanstack/ai

npm i https://pkg.pr.new/@tanstack/ai@886

@tanstack/ai-acp

npm i https://pkg.pr.new/@tanstack/ai-acp@886

@tanstack/ai-angular

npm i https://pkg.pr.new/@tanstack/ai-angular@886

@tanstack/ai-anthropic

npm i https://pkg.pr.new/@tanstack/ai-anthropic@886

@tanstack/ai-bedrock

npm i https://pkg.pr.new/@tanstack/ai-bedrock@886

@tanstack/ai-claude-code

npm i https://pkg.pr.new/@tanstack/ai-claude-code@886

@tanstack/ai-client

npm i https://pkg.pr.new/@tanstack/ai-client@886

@tanstack/ai-code-mode

npm i https://pkg.pr.new/@tanstack/ai-code-mode@886

@tanstack/ai-code-mode-skills

npm i https://pkg.pr.new/@tanstack/ai-code-mode-skills@886

@tanstack/ai-codex

npm i https://pkg.pr.new/@tanstack/ai-codex@886

@tanstack/ai-devtools-core

npm i https://pkg.pr.new/@tanstack/ai-devtools-core@886

@tanstack/ai-elevenlabs

npm i https://pkg.pr.new/@tanstack/ai-elevenlabs@886

@tanstack/ai-event-client

npm i https://pkg.pr.new/@tanstack/ai-event-client@886

@tanstack/ai-fal

npm i https://pkg.pr.new/@tanstack/ai-fal@886

@tanstack/ai-gemini

npm i https://pkg.pr.new/@tanstack/ai-gemini@886

@tanstack/ai-grok

npm i https://pkg.pr.new/@tanstack/ai-grok@886

@tanstack/ai-grok-build

npm i https://pkg.pr.new/@tanstack/ai-grok-build@886

@tanstack/ai-groq

npm i https://pkg.pr.new/@tanstack/ai-groq@886

@tanstack/ai-isolate-cloudflare

npm i https://pkg.pr.new/@tanstack/ai-isolate-cloudflare@886

@tanstack/ai-isolate-node

npm i https://pkg.pr.new/@tanstack/ai-isolate-node@886

@tanstack/ai-isolate-quickjs

npm i https://pkg.pr.new/@tanstack/ai-isolate-quickjs@886

@tanstack/ai-mcp

npm i https://pkg.pr.new/@tanstack/ai-mcp@886

@tanstack/ai-mistral

npm i https://pkg.pr.new/@tanstack/ai-mistral@886

@tanstack/ai-ollama

npm i https://pkg.pr.new/@tanstack/ai-ollama@886

@tanstack/ai-openai

npm i https://pkg.pr.new/@tanstack/ai-openai@886

@tanstack/ai-opencode

npm i https://pkg.pr.new/@tanstack/ai-opencode@886

@tanstack/ai-openrouter

npm i https://pkg.pr.new/@tanstack/ai-openrouter@886

@tanstack/ai-preact

npm i https://pkg.pr.new/@tanstack/ai-preact@886

@tanstack/ai-react

npm i https://pkg.pr.new/@tanstack/ai-react@886

@tanstack/ai-react-ui

npm i https://pkg.pr.new/@tanstack/ai-react-ui@886

@tanstack/ai-sandbox

npm i https://pkg.pr.new/@tanstack/ai-sandbox@886

@tanstack/ai-sandbox-cloudflare

npm i https://pkg.pr.new/@tanstack/ai-sandbox-cloudflare@886

@tanstack/ai-sandbox-daytona

npm i https://pkg.pr.new/@tanstack/ai-sandbox-daytona@886

@tanstack/ai-sandbox-docker

npm i https://pkg.pr.new/@tanstack/ai-sandbox-docker@886

@tanstack/ai-sandbox-local-process

npm i https://pkg.pr.new/@tanstack/ai-sandbox-local-process@886

@tanstack/ai-sandbox-sprites

npm i https://pkg.pr.new/@tanstack/ai-sandbox-sprites@886

@tanstack/ai-sandbox-vercel

npm i https://pkg.pr.new/@tanstack/ai-sandbox-vercel@886

@tanstack/ai-solid

npm i https://pkg.pr.new/@tanstack/ai-solid@886

@tanstack/ai-solid-ui

npm i https://pkg.pr.new/@tanstack/ai-solid-ui@886

@tanstack/ai-svelte

npm i https://pkg.pr.new/@tanstack/ai-svelte@886

@tanstack/ai-utils

npm i https://pkg.pr.new/@tanstack/ai-utils@886

@tanstack/ai-vue

npm i https://pkg.pr.new/@tanstack/ai-vue@886

@tanstack/ai-vue-ui

npm i https://pkg.pr.new/@tanstack/ai-vue-ui@886

@tanstack/openai-base

npm i https://pkg.pr.new/@tanstack/openai-base@886

@tanstack/preact-ai-devtools

npm i https://pkg.pr.new/@tanstack/preact-ai-devtools@886

@tanstack/react-ai-devtools

npm i https://pkg.pr.new/@tanstack/react-ai-devtools@886

@tanstack/solid-ai-devtools

npm i https://pkg.pr.new/@tanstack/solid-ai-devtools@886

commit: b427cc4

tombeckenham and others added 2 commits July 2, 2026 21:03
The issue's live verification concluded Omni clips were a fixed 10
seconds, but response_format.duration is a real request field — just
undocumented. Verified against the live API: it takes a "<seconds>s"
string, accepts any value in the 3-10s range including fractional
seconds (a 3s request returns a 3.008s MP4 per ffprobe), rejects
out-of-range values with explicit minimum/maximum errors, and defaults
to 10s when omitted.

Omni's duration is now typed number with availableDurations() =
{ kind: 'range', min: 3, max: 10, unit: 'seconds' } and snapDuration
clamping into it; the adapter maps the generateVideo duration option
onto response_format.duration.
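
As a rough illustration of the range descriptor and clamping described in this commit, consider the sketch below. The `DurationRange` type and the `snapDuration` signature are assumptions; only the descriptor's field values and the idea of clamping into the range come from the message above.

```typescript
// Hypothetical shape matching the descriptor quoted in the commit message.
interface DurationRange {
  kind: 'range'
  min: number
  max: number
  unit: 'seconds'
}

const OMNI_DURATIONS: DurationRange = { kind: 'range', min: 3, max: 10, unit: 'seconds' }

// Clamp a requested duration into the supported range.
// Fractional values inside the range are returned unchanged.
function snapDuration(seconds: number, range: DurationRange = OMNI_DURATIONS): number {
  return Math.min(range.max, Math.max(range.min, seconds))
}

console.log(snapDuration(2), snapDuration(4.5), snapDuration(12))
```

A continuous range fits Omni better than an enumerated list because the API accepts fractional seconds, so there is no finite set of values to snap to.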

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

- Reject out-of-range Omni durations at job creation with a clear local
  error instead of silently passing them to the live API
- Map requires_action interactions to a failed status so polling can't
  spin until timeout (reachable via previous_interaction_id chaining)
- Surface failed job statuses in the ts-react-media example instead of
  polling forever on a pending spinner
- Add a compile-time regression test guarding the generateVideo
  VideoAdapter generic-arity fix, plus unit tests for duration
  rejection, fractional pass-through, and requires_action mapping
- Fix stale doc/comment claims: Veo 2/3 model lists, "fixed 10s" clips,
  "clamped" duration wording, and content-block ordering (images, then
  videos, then text)
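
The first two items above could look something like the following. This is a sketch under assumptions: the `InteractionStatus` and `VideoJobStatus` unions, the function names, and the error wording are hypothetical, and only the requires_action-to-failed mapping and the 3-10s bounds come from this PR.

```typescript
// Assumed status values; only 'requires_action' is named in the commit message.
type InteractionStatus = 'in_progress' | 'requires_action' | 'completed' | 'failed' | 'cancelled'
type VideoJobStatus = 'pending' | 'completed' | 'failed'

// Map an interaction status onto a job status so polling always terminates.
function toJobStatus(status: InteractionStatus): VideoJobStatus {
  switch (status) {
    case 'completed':
      return 'completed'
    case 'in_progress':
      return 'pending'
    default:
      // requires_action, failed, cancelled: polling alone will never finish these.
      return 'failed'
  }
}

// Reject unsupported durations locally instead of forwarding them to the API.
function assertOmniDuration(seconds: number): void {
  if (!Number.isFinite(seconds) || seconds < 3 || seconds > 10) {
    throw new RangeError(`Omni duration must be between 3 and 10 seconds, got ${seconds}`)
  }
}

console.log(toJobStatus('requires_action'))
```

Treating requires_action as terminal trades a possible recovery path for a guaranteed end to polling, which matches the stated goal of not spinning until timeout.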

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Support Gemini Omni Flash (gemini-omni-flash-preview) via the Interactions API

1 participant