Summary
Add support for Gemini Omni Flash (gemini-omni-flash-preview) — Google's new multimodal video-generation model — to the @tanstack/ai-gemini adapter.
Split out from #870 (which now ships Nano Banana 2 Lite only). Omni Flash needs a genuinely new request path, not just a model-meta entry, so it's tracked separately.
Why this is not a Veo/model-meta change
Verified against the live Gemini API (2026-07-01):
- Omni Flash rejects generateContent: 400 "This model only supports Interactions API."
- It is not a predictLongRunning (Veo) model either, so the existing video adapter's client.models.generateVideos() path does not apply.
- The model's advertised supportedGenerationMethods is generateContent, countTokens, but in practice it only serves the Interactions API.
So Omni Flash cannot simply be added to GEMINI_VIDEO_MODELS to reuse the Veo flow. It needs its own Interactions-based job path.
Verified working flow (Interactions API)
POST /v1beta/interactions
{
  "model": "gemini-omni-flash-preview",
  "input": "<prompt string | structured content>",
  "response_modalities": ["video"],
  "background": true
}
→ { "id": "v1_…", "status": "in_progress", "object": "interaction", "model": … }

GET /v1beta/interactions/{id}   # poll (~24s for a 10s clip)
→ {
  "status": "completed",
  "usage": { "output_tokens_by_modality": [{ "modality": "video", "tokens": 57920 }], … },
  "steps": [
    { "type": "user_input", "content": [{ "type": "text", "text": "…" }] },
    { "type": "thought", "signature": "…" },
    { "type": "model_output", "content": [
      { "type": "video", "mime_type": "video/mp4", "data": "<base64>" }
    ]}
  ]
}
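The flow above can be sketched in TypeScript as a small create-then-poll helper. This is a minimal sketch, not the adapter's implementation: createVideoInteraction, pollInteraction, buildCreateBody, the HttpFn type, and the injected sleep are hypothetical names, and the host plus the x-goog-api-key header are assumptions. Only the paths, the request fields, and the in_progress / completed statuses come from the verified flow; treating any other status as a failure is also an assumption.

```typescript
// Minimal HTTP function shape so the sketch can run against a fake transport.
type HttpResponse = { ok: boolean; status: number; json(): Promise<unknown> };
type HttpFn = (
  url: string,
  init: { method: string; headers: Record<string, string>; body?: string },
) => Promise<HttpResponse>;

interface Interaction {
  id: string;
  status: string; // "in_progress" and "completed" were observed; others are unknown
  steps?: unknown[];
}

// Assumption: the standard Gemini API host and API-key header.
const BASE_URL = "https://generativelanguage.googleapis.com/v1beta";

// Field names mirror the verified request shape.
function buildCreateBody(prompt: string): string {
  return JSON.stringify({
    model: "gemini-omni-flash-preview",
    input: prompt,
    response_modalities: ["video"],
    background: true,
  });
}

async function createVideoInteraction(
  http: HttpFn,
  apiKey: string,
  prompt: string,
): Promise<Interaction> {
  const res = await http(`${BASE_URL}/interactions`, {
    method: "POST",
    headers: { "content-type": "application/json", "x-goog-api-key": apiKey },
    body: buildCreateBody(prompt),
  });
  if (!res.ok) throw new Error(`create failed with HTTP ${res.status}`);
  return (await res.json()) as Interaction;
}

async function pollInteraction(
  http: HttpFn,
  apiKey: string,
  id: string,
  sleep: (ms: number) => Promise<void>, // injected so callers control timing
  intervalMs = 3000,
  maxAttempts = 40,
): Promise<Interaction> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await http(`${BASE_URL}/interactions/${id}`, {
      method: "GET",
      headers: { "x-goog-api-key": apiKey },
    });
    if (!res.ok) throw new Error(`poll failed with HTTP ${res.status}`);
    const interaction = (await res.json()) as Interaction;
    if (interaction.status === "completed") return interaction;
    // Assumption: anything other than in_progress/completed is terminal.
    if (interaction.status !== "in_progress") {
      throw new Error(`interaction ended with status ${interaction.status}`);
    }
    await sleep(intervalMs);
  }
  throw new Error("timed out waiting for the interaction to complete");
}
```

With the observed ~24s turnaround for a 10s clip, a 3s interval and 40 attempts (about two minutes) leave generous headroom. In runtimes that provide a global fetch, it can be passed as the http argument.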
Key differences from Veo:
- Output video is returned as inline base64 in steps[].content[] (a model_output step), not a Veo-style file URI.
- Usage is reported as output_tokens_by_modality (video tokens), not per-second in the response body.
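Because the clip arrives inline rather than as a file URI, the adapter has to walk steps to find it. A minimal sketch, assuming the response shape shown in the verified flow; extractVideos and decodedByteLength are hypothetical helper names, not existing adapter functions, and the video/mp4 fallback is an assumption for parts that omit mime_type.

```typescript
interface ContentPart {
  type: string;
  mime_type?: string;
  data?: string;
  text?: string;
}

interface Step {
  type: string;
  signature?: string;
  content?: ContentPart[];
}

interface InlineVideo {
  mimeType: string;
  base64: string;
}

// Collect every inline video part from model_output steps,
// skipping user_input and thought steps.
function extractVideos(steps: Step[]): InlineVideo[] {
  const videos: InlineVideo[] = [];
  for (const step of steps) {
    if (step.type !== "model_output") continue;
    for (const part of step.content ?? []) {
      if (part.type === "video" && typeof part.data === "string") {
        // Assumption: fall back to video/mp4 when mime_type is absent.
        videos.push({ mimeType: part.mime_type ?? "video/mp4", base64: part.data });
      }
    }
  }
  return videos;
}

// Size of the decoded payload in bytes, computed without decoding.
// Assumes standard padded base64 with no embedded whitespace.
function decodedByteLength(base64: string): number {
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  return (base64.length / 4) * 3 - padding;
}
```

Knowing the decoded size up front is useful here because the whole clip sits in the JSON response body, so the adapter may want to check it before materializing a buffer.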
SDK / dependency notes
- The installed @google/genai@2.10.0 already exposes the Interactions API surface (client.interactions.create/get/cancel, GeminiNextGenInteractions, plus interaction.completed / video.generated webhook events).
- packages/ai-gemini/package.json currently declares "@google/genai": "^2.8.0". If we build on client.interactions, bump the floor to ^2.10.0 so consumers are guaranteed to have it.
- Caveat found during verification: the SDK's typed interactions.create() wrapper returned a bare 400 for this shape, while the raw REST call succeeded. The adapter may need to call the REST endpoint directly (or match the SDK param shape exactly). Worth reconciling during implementation.
- There is already an experimental interactions adapter to model this on: packages/ai-gemini/src/experimental/text-interactions/adapter.ts.
Model facts (from Google docs + live API)
- Model id: gemini-omni-flash-preview
- Inputs: text, image, video (audio references + video refs >3s / scene extension not yet supported in the API)
- Output: MP4 video with audio
- Clip length: 10 seconds (fixed today; longer "coming soon")
- Resolution: 720p
- Aspect ratios: 16:9 (default), 9:16
- Pricing: $0.10 per second of video output
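These facts imply a simple cost model: with a fixed 10-second clip at $0.10 per second, each generation costs $1.00. The sketch below encodes that under those assumptions; OMNI_FLASH_FACTS, estimateCostUsd, and videoTokens are hypothetical names, and the object shape is illustrative rather than the adapter's actual model-meta format.

```typescript
// Facts copied from the list above; the shape itself is hypothetical.
const OMNI_FLASH_FACTS = {
  id: "gemini-omni-flash-preview",
  clipSeconds: 10,
  resolution: "720p",
  aspectRatios: ["16:9", "9:16"],
  usdPerSecond: 0.1,
} as const;

// Estimated cost in USD for a number of fixed-length clips.
// Works in integer cents to avoid floating-point drift.
function estimateCostUsd(clips: number): number {
  const centsPerSecond = Math.round(OMNI_FLASH_FACTS.usdPerSecond * 100);
  const centsPerClip = centsPerSecond * OMNI_FLASH_FACTS.clipSeconds;
  return (centsPerClip * clips) / 100;
}

interface Usage {
  output_tokens_by_modality?: { modality: string; tokens: number }[];
}

// Sum the video tokens reported in an interaction's usage block.
function videoTokens(usage: Usage): number {
  let total = 0;
  for (const entry of usage.output_tokens_by_modality ?? []) {
    if (entry.modality === "video") total += entry.tokens;
  }
  return total;
}
```

In the verified run, 57920 video tokens for a 10s clip works out to 5792 tokens per second of output, though a single sample is not enough to treat that ratio as a constant.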
Scope / tasks
- … BaseVideoAdapter's job model, or a dedicated experimental adapter.
- … output_tokens_by_modality.
- Add a model-meta.ts entry (fixed 10s duration, 720p, 16:9/9:16, $0.10/sec) once wired to the right path.
- Bump the @google/genai floor to ^2.10.0 if using client.interactions.
- … media-generation skill updates.