An All-in-One Generative AI Platform for Automated Multimedia Advertisement Creation
A Unified Monorepo for Building Generative AI Multimedia Agents
Izumi is a comprehensive, all-in-one Generative AI platform designed as a deployable solution for automated multimedia advertisement creation. It serves as a powerful Google Cloud Best Practice & Reference Architecture, showcasing the spectrum of Google's state-of-the-art generative AI models on Vertex AI.
Built for creators, marketers, and developers, this application provides a hands-on, interactive experience with cutting-edge multi-agent orchestration, serving as a blueprint for enterprise-grade AI agent deployment.
This is not an officially supported Google product. This project is not eligible for the Google Open Source Software Vulnerability Rewards Program.
Tip
The Izumi Solution: Creative Workbench, Multi-Agent Orchestrator, and Ad Generator All-in-One! Izumi provides a complete full-stack workbench for building cinematic advertisements and interactive creative sandboxes using Google GenAI / Vertex AI. Simply provide a brief or a few assets, and Izumi handles the reasoning, orchestration, and generation end-to-end.
- Key Features
- The Specialized Agents
- Architecture
- Technology Stack
- Quick Start
- Code Styling & Guidelines
- Contributing
- Responsible Use & Disclaimer
|
Independent Python FastAPI processes powered by |
Modern React + Vite SPA providing a visual campaign canvas for creators. |
Leverages |
Built-in Terraform IaC suites for enterprise-grade Cloud Run deployments. |
Izumi isn't a single monolithic script; it's a distributed suite of specialized AI agents. This repository contains four discrete AI workspaces, each tuned for a specific creative workflow:
Ads-X (Flagship)
- Our most powerful enterprise orchestrator.
- Gives brands granular control over the timing, pacing, and visual progression of an ad.
- AI Director Mode: Autonomously devises narrative pacing while enforcing brand guidelines.
- Template Mode: Adheres strictly to a pre-defined JSON skeleton (dictating exact clip lengths and transitions).
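To make Template Mode concrete, the sketch below parses a JSON skeleton and checks that the clip lengths add up to the target duration. The schema here (`name`, `target_duration_s`, `clips`, `duration_s`, `transition`) is hypothetical and chosen only for illustration; the template format Ads-X actually uses may differ.

```python
import json
import math
from dataclasses import dataclass

# Hypothetical template: field names are illustrative, not Izumi's actual schema.
TEMPLATE_JSON = """
{
  "name": "example_fast_pace",
  "target_duration_s": 15,
  "clips": [
    {"duration_s": 4, "transition": "cut"},
    {"duration_s": 6, "transition": "crossfade"},
    {"duration_s": 5, "transition": "fade_out"}
  ]
}
"""


@dataclass(frozen=True)
class Clip:
    duration_s: float
    transition: str


@dataclass(frozen=True)
class AdTemplate:
    name: str
    target_duration_s: float
    clips: tuple[Clip, ...]


def load_template(raw: str) -> AdTemplate:
    """Parse a template and reject it if clip lengths do not match the target."""
    data = json.loads(raw)
    clips = tuple(Clip(**clip) for clip in data["clips"])
    if not clips:
        raise ValueError("template must define at least one clip")
    total = sum(clip.duration_s for clip in clips)
    if not math.isclose(total, data["target_duration_s"]):
        raise ValueError(
            f"clip lengths sum to {total}s, expected {data['target_duration_s']}s"
        )
    return AdTemplate(data["name"], data["target_duration_s"], clips)


template = load_template(TEMPLATE_JSON)
for clip in template.clips:
    print(f"{clip.duration_s}s -> {clip.transition}")
```

Validating the skeleton up front means a mismatch between clip lengths and the target duration fails fast, before any generation cost is incurred.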
Ads Co-Director
- A research-driven video orchestrator that treats production as an optimization problem.
- Uses a Multi-Armed Bandit (MAB) to autonomously explore and optimize creative strategies, narrative modes, and aesthetic archetypes.
- Features agentic self-refinement and a factored reward system to maximize campaign efficacy.
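The README does not say which bandit algorithm Ads Co-Director uses, so the sketch below substitutes a plain epsilon-greedy bandit to show the general idea: each creative strategy is an arm, and a factored reward combines several scores into one number. The arm names, score names, and weights are all hypothetical.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit: explore a random arm with probability epsilon."""

    def __init__(self, arms, epsilon=0.1, seed=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = {arm: 0 for arm in self.arms}
        self.values = {arm: 0.0 for arm in self.arms}  # running mean reward
        self._rng = random.Random(seed)

    def select(self):
        if self._rng.random() < self.epsilon:
            return self._rng.choice(self.arms)  # explore
        return max(self.arms, key=lambda arm: self.values[arm])  # exploit

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental mean: new = old + (reward - old) / n
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def factored_reward(scores, weights):
    """Weighted average of per-factor scores (one simple way to factor a reward)."""
    total_weight = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total_weight


# Hypothetical strategies and reward factors, for illustration only.
weights = {"brand_fit": 0.5, "pacing": 0.3, "visual_quality": 0.2}
bandit = EpsilonGreedyBandit(["humor", "testimonial", "product_demo"], seed=7)

arm = bandit.select()
# In practice these scores would come from evaluating the generated ad.
reward = factored_reward(
    {"brand_fit": 0.9, "pacing": 0.6, "visual_quality": 0.8}, weights
)
bandit.update(arm, reward)
```

Repeating the select, generate, score, update loop gradually shifts selection toward the strategies with the highest average reward, while the epsilon term keeps some exploration alive.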
Elements to Video
- A specialized narrative chain workflow built explicitly to solve the "character consistency" problem in AI video.
- Anchors generation around persistent subjects (like a mascot or hero product) and carries them consistently through multiple generated clips and actions.
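One way to picture a narrative chain is as a series of clip prompts that all repeat the same subject description and each refer back to the previous clip. The sketch below builds such prompts as plain strings; it makes no model calls, and the `Subject` type, function name, and prompt wording are hypothetical rather than taken from the Elements to Video implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    """A persistent subject (mascot, hero product) reused in every clip."""

    name: str
    description: str


def build_clip_prompts(subject: Subject, actions: list[str]) -> list[str]:
    """Return one prompt per action, each anchored on the same subject."""
    prompts = []
    previous_action = None
    for index, action in enumerate(actions, start=1):
        parts = [
            f"Clip {index}.",
            f"Subject: {subject.name}, {subject.description}.",
            f"Action: {action}.",
        ]
        if previous_action is not None:
            # Link to the prior clip so the sequence reads as one story.
            parts.append(
                f"Continue from the previous clip, where the subject was: {previous_action}."
            )
        prompts.append(" ".join(parts))
        previous_action = action
    return prompts


mascot = Subject("Example Mascot", "a small red fox wearing a blue scarf")
for prompt in build_clip_prompts(mascot, ["waking up", "running through a market"]):
    print(prompt)
```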
Creative Toolbox
- An unstructured, conversational sandbox.
- Deploy the Creative Toolbox to chat naturally with the suite of Vertex AI models to generate one-off concept art, temporary voiceovers, or standalone Veo animations.
Here are some examples of cinematic advertisements generated by Izumi.
Click on the thumbnails to view the videos.
| Case 1: Scented Candle | Case 2: Luxury High Heels | Case 3: Plant-Based Meat |
|---|---|---|
| BeggyBay_Director.mp4 | GraultHeel_Director.mp4 | Undefood_Director.mp4 |
| Case 4: Facial Cleanser | Case 5: Resort Sandals | Case 6: Zen Garden Rake |
| BazQuxLLC_Director.mp4 | QuxCorge_Director.mp4 | AmetTools_Director.mp4 |
| Case 7: Savory Snacks | Case 8: Pet Care | Case 9: Home Comfort |
| SED_SNACKS_Director.mp4 | CONSECTETUR_Pet.mp4 | LoremBaz_Home.mp4 |
Expand each case to see the input assets and full prompt.
Case 1: Scented Candle
These images were used as input to guide the agent:
Generate a video advertisement. Beggy Bay packs Beggy-Soy (a candle poured into a lidded metal tin), targeting Female travelers aged 25-34 in Marrakech, Morocco who are interested in travel comfort in hotel rooms.
Case 2: Luxury High Heels
These images were used as input to guide the agent:
Generate a video advertisement. Grault Design elevates with Grault-Heel (high-heeled footwear intended for performance or social events), targeting Female partygoers aged 18-24 in Ibiza, Spain who are interested in partying and glamour in nightclubs.
Case 3: Plant-Based Meat
These images were used as input to guide the agent:
Generate a video advertisement. Undefood Inc cooks Undefood-Ground (loose, crumbled plant protein resembling ground meat), targeting Female chefs aged 25-34 in Lima, Peru who are interested in meal prepping on Taco Tuesdays.
Case 4: Facial Cleanser
These images were used as input to guide the agent:
Generate a video advertisement. BazQux LLC highlights BazQux-Wash (a facial cleansing gel dispensed from a tube container), targeting Female spa lovers aged 35-50 in Stockholm, Sweden who are interested in deep pore cleansing and detox in luxury hotel spas.
Case 5: Resort Sandals
These images were used as input to guide the agent:
Generate a video advertisement. QuxCorge relaxes with QC-Sandal (open footwear consisting of a sole held to the foot by straps), targeting Female vacationers aged 25-34 in Ubud, Indonesia who are interested in poolside fashion at resorts.
Case 6: Zen Garden Rake
These images were used as input to guide the agent:
Generate a video advertisement. Amet Tools rakes with Amet-Rake (toothed tool designed for gathering loose debris or smoothing soil), targeting Female mindfulness practitioners aged 25-34 in Kyoto, Japan who are interested in raking sand patterns in zen gardens.
Case 7: Savory Snacks
These images were used as input to guide the agent:
Campaign Name: The Perfect Shake Product/Service: Premium savory snack line (Potato Chips, Pretzels, Popcorn, and Mixed Nuts) Target Duration: 15s Format: Portrait
Strategic Context
- Campaign Theme: Focusing on the precision and satisfaction of perfectly seasoned snacks.
- Campaign Tone: Snappy, clean, and rhythmic. Think high-energy cuts synced to a crisp beat.
- Primary Hook: The "Just Right" seasoning.
- Target Audience: The Aesthetic Snacker. Young professionals and Gen Z creators who want snacks that look as good on their desks as they taste.
- Brand Voice: Minimalist, confident, and slightly cheeky.
Visual & Narrative Style
- Visual Style: Studio Minimalist. High-brightness, clean white backgrounds (matching your provided images) with sharp shadows and vivid product colors (like that metallic red bag).
- Key Message: SED SNACKS: Seasoned to stand out.
- Setting: A bright, modern home office and a high-end minimalist kitchen.
Case 8: Pet Care
Please help me create a 16:9 vertical video ad for CONSECTETUR CO. Use the 'Pet Companion fast pace' template.
Case 9: Home Comfort
Please help me create a 16:9 vertical video ad for LoremBaz Home. Use the 'Home Comfort (Fast) ' template.
Izumi is a multi-agent video framework that enables automated multi-shot video generation while ensuring character and scene consistency.
| Layer | Components |
|---|---|
| Input Layer | Campaign Briefs, User Assets, Metadata |
| Central Orchestration (FastAPI) | Parameters Extraction, Storyboard Routing, State Management |
| Specialized Agents | ads_x, creative_toolbox, elements_to_video |
| Tooling & Utilities | GCS, Vertex AI, Image Gen, Video Gen |
| Output Layer | Cinematic Videos, Structured Storyboards, Execution Logs |
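The layered flow above (input, central orchestration, specialized agents, output) can be sketched as a small registry-and-dispatch pattern. This is a dependency-free illustration rather than the actual FastAPI service: the `register` decorator, the handler functions, and the request and response shapes are hypothetical; only the agent names come from the architecture description.

```python
from typing import Callable

Handler = Callable[[dict], dict]

# Registry mapping agent names to handler functions.
AGENTS: dict[str, Handler] = {}


def register(name: str) -> Callable[[Handler], Handler]:
    """Decorator that adds a handler to the agent registry."""

    def decorator(handler: Handler) -> Handler:
        AGENTS[name] = handler
        return handler

    return decorator


@register("ads_x")
def run_ads_x(params: dict) -> dict:
    # Placeholder: a real agent would call generation tools here.
    return {"agent": "ads_x", "storyboard": [params["brief"]], "logs": ["ads_x ran"]}


@register("creative_toolbox")
def run_creative_toolbox(params: dict) -> dict:
    return {"agent": "creative_toolbox", "storyboard": [], "logs": ["toolbox ran"]}


def orchestrate(request: dict) -> dict:
    """Extract parameters from the request and route them to one agent."""
    agent_name = request.get("agent", "creative_toolbox")
    if agent_name not in AGENTS:
        raise KeyError(f"unknown agent: {agent_name}")
    params = {"brief": request["brief"], "assets": request.get("assets", [])}
    return AGENTS[agent_name](params)


result = orchestrate({"agent": "ads_x", "brief": "15s ad for a scented candle"})
print(result["agent"], result["logs"])
```

Keeping the routing table separate from the handlers means a new agent can be added by registering one function, without touching the dispatch logic.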
The backend uses a standardized toolchain from the mediagent_kit wrapper, which gives uniform access to Google's top-tier models:
| Category | Technology / Service |
|---|---|
| Frontend | React, TypeScript, Material UI (MUI), Vite |
| Backend | Python 3.12, FastAPI, Pydantic |
| Orchestration | Google Vertex AI Agent Development Kit (ADK) |
| Reasoning & Copy | Gemini: Copywriting, reasoning, and orchestration |
| Visual Gen | Imagen & Gemini: First-frame generation and storyboard composition |
| Video Gen | Veo: Cinematic video generation |
| Audio Gen | Lyria: Dynamic background music composition |
| Voiceover | Google Cloud TTS: Studio-grade voiceovers |
| Database | Google Cloud Firestore (Local Emulator supported) |
Note
Centralized Model Configuration: All AI models used by the services are centrally configured in mediagent_config.json located in the project root. This file allows you to map specific models to specific tasks (e.g., default, enrichment, repair) and lists compatible models for reference.
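As a rough illustration of task-to-model mapping, the sketch below resolves a model for a task and falls back to the `default` entry when the task has no specific mapping. The JSON layout, category names, and model names are placeholders invented for this example; consult the real mediagent_config.json for the actual structure.

```python
import json

# Hypothetical config layout; the real mediagent_config.json may be organized differently.
EXAMPLE_CONFIG = """
{
  "text": {
    "default": "example-text-model",
    "enrichment": "example-text-model-large",
    "compatible_models": ["example-text-model", "example-text-model-large"]
  },
  "video": {
    "default": "example-video-model",
    "compatible_models": ["example-video-model"]
  }
}
"""


def resolve_model(config: dict, category: str, task: str = "default") -> str:
    """Return the model mapped to a task, or the category default if unmapped."""
    section = config[category]
    return section.get(task, section["default"])


config = json.loads(EXAMPLE_CONFIG)
print(resolve_model(config, "text", "enrichment"))
print(resolve_model(config, "text", "repair"))  # no "repair" entry, so default is used
```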
You must install uv to manage dependencies and run the workspace.
# 1. Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# 2. Add it to your PATH immediately for this session
export PATH="$HOME/.local/bin:$PATH"
- Authentication: Configure your local environment with Application Default Credentials:
  gcloud auth login
  gcloud auth application-default login
- Sync Workspace Dependencies: Resolve and install all project packages:
  uv sync
- Setup Environment Manifests: Provision the standard .env.local bindings for GCP:
  uv run scripts/setup_gcp_project.py --app_env local
Starting with the database emulator runs a local mock database. It isolates your work and prevents conflicts with shared cloud data.
./scripts/start-all.sh --with-db-emulator
- Frontend: http://localhost:5173
- Backend API: http://localhost:8000
- API Docs (Swagger): http://localhost:8000/docs
Starting without the emulator flag connects your local server directly to your live Google Cloud Firestore instance.
./scripts/start-all.sh
To connect your local environment to your GCP resources, you can manually verify or update the generated .env.local file in demos/backend/. Here is an example of what it should contain:
# Common environment variables
FRONTEND_URL="http://localhost:5173"
ENVIRONMENT="local"
LOG_LEVEL="INFO"
# Google Cloud Project configuration
GOOGLE_CLOUD_PROJECT="your-gcp-project-id"
PROJECT_ID="your-gcp-project-id"
# Firestore configuration (leave blank for default)
FIRESTORE_DATABASE_ID=""
# GCS Buckets
GCS_BUCKET_NAME="your-sandbox-bucket-name"
For the frontend in demos/frontend/.env:
VITE_API_BASE_URL=http://localhost:8000
If you encounter an error like "a bytes-like object is required, not 'NoneType'" originating from mtls.py, or other mTLS connection failures:
- Open demos/backend/.env.local.
- Add the following lines to bypass the Mutual TLS check:
  GOOGLE_API_USE_MTLS=never
  GOOGLE_API_USE_CLIENT_CERTIFICATE=false
- Restart the application.
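Before restarting, it can help to confirm that demos/backend/.env.local defines the keys shown in the example above. The sketch below is a minimal, dependency-free check; treating `GOOGLE_CLOUD_PROJECT`, `PROJECT_ID`, and `GCS_BUCKET_NAME` as required is an assumption based on that example, and the helper names are hypothetical.

```python
# Assumed-required keys, taken from the example .env.local above.
REQUIRED_KEYS = ("GOOGLE_CLOUD_PROJECT", "PROJECT_ID", "GCS_BUCKET_NAME")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def missing_keys(env: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return required keys that are absent or set to an empty value."""
    return [key for key in required if not env.get(key)]


sample = """
# Google Cloud Project configuration
GOOGLE_CLOUD_PROJECT="your-gcp-project-id"
PROJECT_ID="your-gcp-project-id"
FIRESTORE_DATABASE_ID=""
"""
print(missing_keys(parse_env(sample)))
```

To check a real file, read it with `pathlib.Path("demos/backend/.env.local").read_text()` and pass the result to `parse_env`.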
To maintain code quality and consistency:
- Python (Backend): We adhere to high standards of formatting and linting.
  - We use black for code formatting.
  - We use pylint for static code analysis.
  - The repository maintains an 80% coverage threshold for core modules.
- Frontend (TypeScript): Follows standard React and TypeScript guidelines, using Vite for development.
To check for linting issues or format code:
# Run pytest with coverage
uv run pytest tests/demos/ --cov=demos/backend --cov-report=term-missing
# Run pre-commit hooks (runs black, pylint, etc.)
uv run pre-commit run --all-files
We welcome contributions to Izumi! Whether it's new templates, features, bug fixes, or documentation improvements, your help is valued.
Please create feature branches and submit pull requests. Ensure your commits follow standard conventions and pass all pre-commit hooks.
Building and deploying generative AI agents requires a commitment to responsible development practices. Izumi provides you the tools to build agents, but you must also provide the commitment to ethical and fair use of these agents. We encourage you to:
- Start with a Risk Assessment: Before deploying your agent, identify potential risks related to bias, privacy, safety, and accuracy.
- Document Your Process: Maintain detailed records of your development process, including data sources, models, configurations, and mitigation strategies.
Important
This is not an officially supported Google product. This repository is provided "as is" without any warranties, express or implied. Users assume all responsibility for the deployment, costs, and content generated by the AI agents. This project is not eligible for the Google Open Source Software Vulnerability Rewards Program.
Copyright 2026 Google LLC. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.
To deploy these headless agents to Google Cloud Run in high-compliance environments (IAP protection, Terraform auditing), refer to the Deployment Guide.
Stay tuned for exciting future updates! We are planning a deep integration with GCC Creative Studio to merge the power of multi-agent video orchestration with Creative Studio's comprehensive GenAI platform.
Meet us at Cloud Next 2026! Our team will be attending Cloud Next 2026 and hosting a booth alongside Creative Studio. Feel free to come by to learn more! Connect with us on LinkedIn for updates.











