Write the scene
● Live generator
Type a prompt and Gemini Omni Flash returns a cinematic clip in seconds. No setup, no waitlist.
Describe a scene in plain English. Gemini Omni Flash turns it into a cinematic clip with real-world physics, consistent characters, and details that look researched rather than hallucinated.
No Google subscription, no waitlist. Sign up with email.
Multimodal input · Conversational refinement · Real-world physics

● How it works
Gomni sends your prompt to Gemini Omni Flash and streams the result back. Four steps from idea to clip.
Describe the subject, action, setting, and mood. Add direction for camera, lighting, or pace if you want — Gemini Omni Flash respects detailed prompts and ignores empty filler.
Hit generate. The model produces a cinematic clip grounded in Gemini's real-world knowledge — physical forces look right, recognizable places look familiar.
Don't like the lighting? Reply "warmer light, slower camera" — the model preserves the rest of the scene and applies just the edit. Iterate without losing context.
Download as MP4 in landscape, portrait, square, or ultrawide. No visible watermark, and commercial rights are included. SynthID provenance is embedded as an invisible watermark.
● Benefits
Google's Gemini Omni Flash is the strongest text-to-video model launched to date. Gomni gives you the cleanest path to using it.
What you get when you generate text-to-video clips through Gomni.
Gemini Omni Flash reads complex prompts the way a director reads a brief. Subject, setting, camera language, lighting, mood — all parsed and reflected in the output.
Output in 16:9 (landscape), 9:16 (vertical for Reels / Shorts / TikTok), 1:1 (square), 21:9 (ultrawide), or a custom ratio. Each generation yields one platform-ready file.
Flash-tier clips run up to 10 seconds at launch — enough for ads, social posts, b-roll, and product shots. Stitch generations together for longer narratives.
Photorealistic, cinematic, documentary, anime, watercolor, claymation: name the style in the prompt and the model follows it.
Generate the same character across multiple clips. Identity holds from clip to clip, which is useful for series content, recurring characters, and brand mascots.
Refine generations by replying in natural language. The scene state is preserved across turns — no re-prompting from scratch.
● FAQ
Common questions about generating video from text on Gomni.
Sign up with email, get starter credits, and try Gemini Omni Flash in under a minute.