GoogleCloudPlatform/genmedia-izumi-agent

Izumi: Agentic Multimedia Ecosystem

An All-in-One Generative AI Platform for Automated Multimedia Advertisement Creation

MIT License

A Unified Monorepo for Building Generative AI Multimedia Agents


Izumi is a comprehensive, all-in-one Generative AI platform designed as a deployable solution for automated multimedia advertisement creation. It serves as a powerful Google Cloud Best Practice & Reference Architecture, showcasing the spectrum of Google's state-of-the-art generative AI models on Vertex AI.

Built for creators, marketers, and developers, this application provides a hands-on, interactive experience with cutting-edge multi-agent orchestration, serving as a blueprint for enterprise-grade AI agent deployment.

This is not an officially supported Google product. This project is not eligible for the Google Open Source Software Vulnerability Rewards Program.

Tip

💡 The Izumi Solution

🎬 Creative Workbench, Multi-Agent Orchestrator, and Ad Generator All-in-One! Izumi provides a complete full-stack workbench for building cinematic advertisements and interactive creative sandboxes using Google GenAI / Vertex AI. Simply provide a brief or a few assets, and Izumi handles the reasoning, orchestration, and generation end-to-end. 🚀



💡 Key Features

🔌 Headless Agents

Independent Python FastAPI processes powered by mediagent-kit handling reasoning and orchestration.

🎨 Studio UI

Modern React + Vite SPA providing a visual campaign canvas for creators.

⚡ Fast Tooling

Uses uv for fast, deterministic environment resolution.

☁️ Cloud Ready

Built-in Terraform IaC suites for enterprise-grade Cloud Run deployments.


🔮 The Specialized Agents

Izumi isn't a single monolithic script; it's a distributed suite of specialized AI agents. This repository contains four discrete AI workspaces, each tuned for a specific creative workflow:

🎬 Ads-X (Flagship)

  • Our most powerful enterprise orchestrator.
  • Gives brands granular control over the timing, pacing, and visual progression of an ad.
  • AI Director Mode: Autonomously devises narrative pacing while enforcing brand guidelines.
  • Template Mode: Adheres strictly to a pre-defined JSON skeleton (dictating exact clip lengths and transitions).
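
As a rough illustration of Template Mode, the sketch below models a template as an ordered list of clip slots with fixed durations and transitions. The field names (`clips`, `name`, `duration_s`, `transition`) and the example values are hypothetical; the actual Ads-X template schema is not documented in this README.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipSlot:
    """One fixed slot in a hypothetical ad template."""
    name: str
    duration_s: float
    transition: str  # transition applied when leaving this clip


def load_template(raw: str) -> list[ClipSlot]:
    """Parse a JSON skeleton into clip slots, rejecting non-positive durations."""
    slots = [ClipSlot(**clip) for clip in json.loads(raw)["clips"]]
    for slot in slots:
        if slot.duration_s <= 0:
            raise ValueError(f"clip {slot.name!r} must have a positive duration")
    return slots


def total_duration(slots: list[ClipSlot]) -> float:
    """Sum the slot durations to get the length of the finished ad."""
    return sum(slot.duration_s for slot in slots)


# Hypothetical 15-second template; the values are illustrative only.
EXAMPLE_TEMPLATE = """
{
  "clips": [
    {"name": "hook", "duration_s": 3, "transition": "cut"},
    {"name": "product", "duration_s": 8, "transition": "crossfade"},
    {"name": "call_to_action", "duration_s": 4, "transition": "fade_out"}
  ]
}
"""

slots = load_template(EXAMPLE_TEMPLATE)
```

Checking `total_duration(slots)` against the requested ad length before any generation starts is one cheap way to catch a malformed template early.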

🎬 Ads Co-Director

  • A research-driven video orchestrator that treats production as an optimization problem.
  • Uses a Multi-Armed Bandit (MAB) to autonomously explore and optimize creative strategies, narrative modes, and aesthetic archetypes.
  • Features agentic self-refinement and a factored reward system to maximize campaign efficacy.
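
The README does not say which bandit algorithm or reward factors the Co-Director uses. As a minimal sketch of the general idea, the code below uses an epsilon-greedy bandit and a weighted-average "factored" reward; the arm names, factor names, and weights a caller would pass in are all assumptions.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit over a fixed set of named arms."""

    def __init__(self, arms: list[str], epsilon: float = 0.1, seed: int = 0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = {arm: 0 for arm in self.arms}
        self.values = {arm: 0.0 for arm in self.arms}  # running mean reward per arm
        self._rng = random.Random(seed)

    def select(self) -> str:
        # Explore with probability epsilon, otherwise exploit the best arm so far.
        if self._rng.random() < self.epsilon:
            return self._rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.values[arm])

    def update(self, arm: str, reward: float) -> None:
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def factored_reward(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Combine per-factor scores into one scalar reward (weighted average)."""
    total_weight = sum(weights.values())
    return sum(weights[k] * scores[k] for k in weights) / total_weight
```

In such a setup each arm would stand for one creative strategy (for example a narrative mode), and each generated ad would be scored per factor before `update` is called with the combined reward.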

🧬 Elements to Video

  • A specialized narrative chain workflow built explicitly to solve the "character consistency" problem in AI video.
  • Anchors generation around persistent subjects (like a mascot or hero product) and carries them consistently through multiple generated clips and actions.
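
One way to picture such a chain: every clip request reuses the same subject reference, and each clip after the first also points back to its predecessor. The sketch below only builds those requests as plain data. The field names and the example asset name are hypothetical, and no generation API is called.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClipRequest:
    index: int
    action: str
    subject_ref: str              # same reference asset for every clip
    previous_clip: Optional[int]  # index of the clip this one continues from


def build_chain(subject_ref: str, actions: list[str]) -> list[ClipRequest]:
    """Link clips so each one is anchored to the subject and to its predecessor."""
    return [
        ClipRequest(
            index=i,
            action=action,
            subject_ref=subject_ref,
            previous_clip=i - 1 if i > 0 else None,
        )
        for i, action in enumerate(actions)
    ]


# Hypothetical mascot asset and actions, for illustration only.
chain = build_chain("mascot.png", ["waves hello", "opens the box", "holds up the product"])
```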

🎨 Creative Toolbox

  • An unstructured, conversational sandbox.
  • Deploy the Creative Toolbox to chat naturally with the suite of Vertex AI models to generate one-off concept art, temporary voiceovers, or standalone Veo animations.


🎬 Video Showcase

Here are some examples of cinematic advertisements generated by Izumi.

📺 Video Gallery

Click on the thumbnails to view the videos.

| Case | Product | Video |
|---|---|---|
| Case 1 | Scented Candle | BeggyBay_Director.mp4 |
| Case 2 | Luxury High Heels | GraultHeel_Director.mp4 |
| Case 3 | Plant-Based Meat | Undefood_Director.mp4 |
| Case 4 | Facial Cleanser | BazQuxLLC_Director.mp4 |
| Case 5 | Resort Sandals | QuxCorge_Director.mp4 |
| Case 6 | Zen Garden Rake | AmetTools_Director.mp4 |
| Case 7 | Savory Snacks | SED_SNACKS_Director.mp4 |
| Case 8 | Pet Care | CONSECTETUR_Pet.mp4 |
| Case 9 | Home Comfort | LoremBaz_Home.mp4 |

🔍 Case Details

Expand each case to see the input assets and full prompt.

Case 1: Scented Candle

Input Assets

These images were used as input to guide the agent:

Logo Product with Logo

Full Prompt

Generate a video advertisement. Beggy Bay packs Beggy-Soy (a candle poured into a lidded metal tin), targeting Female travelers aged 25-34 in Marrakech, Morocco who are interested in travel comfort in hotel rooms.

Case 2: Luxury High Heels

Input Assets

These images were used as input to guide the agent:

Logo Product with Logo

Full Prompt

Generate a video advertisement. Grault Design elevates with Grault-Heel (high-heeled footwear intended for performance or social events), targeting Female partygoers aged 18-24 in Ibiza, Spain who are interested in partying and glamour in nightclubs.

Case 3: Plant-Based Meat

Input Assets

These images were used as input to guide the agent:

Logo Product with Logo

Full Prompt

Generate a video advertisement. Undefood Inc cooks Undefood-Ground (loose, crumbled plant protein resembling ground meat), targeting Female chefs aged 25-34 in Lima, Peru who are interested in meal prepping on Taco Tuesdays.

Case 4: Facial Cleanser

Input Assets

These images were used as input to guide the agent:

Logo Product with Logo

Full Prompt

Generate a video advertisement. BazQux LLC highlights BazQux-Wash (a facial cleansing gel dispensed from a tube container), targeting Female spa lovers aged 35-50 in Stockholm, Sweden who are interested in deep pore cleansing and detox in luxury hotel spas.

Case 5: Resort Sandals

Input Assets

These images were used as input to guide the agent:

Logo Product with Logo

Full Prompt

Generate a video advertisement. QuxCorge relaxes with QC-Sandal (open footwear consisting of a sole held to the foot by straps), targeting Female vacationers aged 25-34 in Ubud, Indonesia who are interested in poolside fashion at resorts.

Case 6: Zen Garden Rake

Input Assets

These images were used as input to guide the agent:

Logo Product with Logo

Full Prompt

Generate a video advertisement. Amet Tools rakes with Amet-Rake (toothed tool designed for gathering loose debris or smoothing soil), targeting Female mindfulness practitioners aged 25-34 in Kyoto, Japan who are interested in raking sand patterns in zen gardens.

Case 7: Savory Snacks

Input Assets

These images were used as input to guide the agent:

Image 1 Image 2 Image 3 Image 4 Image 5

Full Prompt

Campaign Name: The Perfect Shake
Product/Service: Premium savory snack line (Potato Chips, Pretzels, Popcorn, and Mixed Nuts)
Target Duration: 15s
Format: Portrait

Strategic Context

  • Campaign Theme: Focusing on the precision and satisfaction of perfectly seasoned snacks.
  • Campaign Tone: Snappy, clean, and rhythmic. Think high-energy cuts synced to a crisp beat.
  • Primary Hook: The "Just Right" seasoning.
  • Target Audience: The Aesthetic Snacker. Young professionals and Gen Z creators who want snacks that look as good on their desks as they taste.
  • Brand Voice: Minimalist, confident, and slightly cheeky.

Visual & Narrative Style

  • Visual Style: Studio Minimalist. High-brightness, clean white backgrounds (matching your provided images) with sharp shadows and vivid product colors (like that metallic red bag).
  • Key Message: SED SNACKS: Seasoned to stand out.
  • Setting: A bright, modern home office and a high-end minimalist kitchen.

Case 8: Pet Care

Input Assets

Image 1 Image 2 Image 3 Image 4 Image 5

Full Prompt

Please help me create a 16:9 vertical video ad for CONSECTETUR CO. Use the 'Pet Companion fast pace' template.

Case 9: Home Comfort

Input Assets

Image 1 Image 2 Image 3 Image 4 Image 5

Full Prompt

Please help me create a 16:9 vertical video ad for LoremBaz Home. Use the 'Home Comfort (Fast) ' template.


🏗️ Architecture

📊 System Overview

Izumi is a multi-agent video framework that enables automated multi-shot video generation while ensuring character and scene consistency.

The system is organized into five layers, from input to output:

  • 🧠 INPUT LAYER: 📝 Campaign Briefs, 🖼️ User Assets, 🧩 Metadata
  • 🧭 CENTRAL ORCHESTRATION (FastAPI): Parameter Extraction, Storyboard Routing, State Management
  • 🤖 Specialized Agents: ads_x, creative_toolbox, elements_to_video
  • 🛠️ Tooling & Utilities: GCS, Vertex AI, Image Gen, Video Gen
  • 🚀 OUTPUT LAYER: 🎞️ Cinematic Videos, 📊 Structured Storyboards, 📜 Execution Logs

🛠️ Technology Stack

The backend uses a standardized toolchain provided by the mediagent_kit wrapper, which gives uniform access to Google's generative models:

| Category | Technology / Service |
|---|---|
| Frontend | React, TypeScript, Material UI (MUI), Vite |
| Backend | Python 3.12, FastAPI, Pydantic |
| Orchestration | Google Vertex AI Agent Development Kit (ADK) |
| Reasoning & Copy | Gemini: Copywriting, reasoning, and orchestration |
| Visual Gen | Imagen & Gemini: First-frame generation and storyboard composition |
| Video Gen | Veo: Cinematic video generation |
| Audio Gen | Lyria: Dynamic background music composition |
| Voiceover | Google Cloud TTS: Studio-grade voiceovers |
| Database | Google Cloud Firestore (Local Emulator supported) |

Note

Centralized Model Configuration: All AI models used by the services are centrally configured in mediagent_config.json located in the project root. This file allows you to map specific models to specific tasks (e.g., default, enrichment, repair) and lists compatible models for reference.
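
The task-to-model mapping can be pictured as a lookup that falls back to `default` when a task has no dedicated entry. This is only a sketch: the JSON layout, the `text_models` group, and the model names below are placeholders, and the real schema of mediagent_config.json may differ.

```python
import json

# Placeholder layout and model names, for illustration only.
EXAMPLE_CONFIG = """
{
  "text_models": {
    "default": "model-a",
    "enrichment": "model-b"
  }
}
"""


def resolve_model(config: dict, group: str, task: str) -> str:
    """Return the model mapped to `task`, falling back to the group's default."""
    models = config[group]
    if task in models:
        return models[task]
    if "default" in models:
        return models["default"]
    raise KeyError(f"no model configured for task {task!r} in group {group!r}")


config = json.loads(EXAMPLE_CONFIG)
```

Here `resolve_model(config, "text_models", "repair")` would fall back to the `default` entry because no `repair` mapping is present in the placeholder config.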


🚀 Quick Start

🖥️ 1. Mandatory Prerequisites

You must install uv to manage dependencies and run the workspace.

# 1. Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# 2. Add it to your PATH immediately for this session
export PATH="$HOME/.local/bin:$PATH"

🖥️ 2. Setup Development Workspace

  1. Authentication: Configure your local environment with Application Default Credentials:

    gcloud auth login
    gcloud auth application-default login
  2. Sync Workspace Dependencies: Resolve and install all project packages:

    uv sync
  3. Setup Environment Manifests: Provision the standard .env.local bindings for GCP:

    uv run scripts/setup_gcp_project.py --app_env local

🎯 Launching the Workspace

🔴 Path A: Local Emulator (Recommended)

This starts a local database emulator, which isolates your work and prevents conflicts with shared cloud data.

./scripts/start-all.sh --with-db-emulator

☁️ Path B: Live Google Cloud

Connects your local server directly to your live Google Cloud Firestore instance.

./scripts/start-all.sh

📝 Environment Configuration Example

To connect your local environment to your GCP resources, you can manually verify or update the generated .env.local file in demos/backend/. Here is an example of what it should contain:

# Common environment variables
FRONTEND_URL="http://localhost:5173"
ENVIRONMENT="local"
LOG_LEVEL="INFO"

# Google Cloud Project configuration
GOOGLE_CLOUD_PROJECT="your-gcp-project-id"
PROJECT_ID="your-gcp-project-id"

# Firestore configuration (leave blank for default)
FIRESTORE_DATABASE_ID=""

# GCS Buckets
GCS_BUCKET_NAME="your-sandbox-bucket-name"

For the frontend in demos/frontend/.env:

VITE_API_BASE_URL=http://localhost:8000
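
Before starting the backend, it can help to confirm that the generated file actually defines the project and bucket values. The sketch below is a small standalone check, not part of the repository: the parser handles only simple KEY=VALUE lines, and treating these three keys as required is an assumption based on the example above.

```python
# Keys assumed to be required, based on the example .env.local above.
REQUIRED_KEYS = ("GOOGLE_CLOUD_PROJECT", "PROJECT_ID", "GCS_BUCKET_NAME")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def missing_keys(env: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```

For example, `missing_keys(parse_env(open("demos/backend/.env.local").read()))` returns an empty list when all three values are set.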

💡 Troubleshooting

mTLS / Certificate Errors

If you encounter an error such as "a bytes-like object is required, not 'NoneType'" originating from mtls.py, or other mTLS connection failures:

  1. Open demos/backend/.env.local.
  2. Add the following lines to bypass the Mutual TLS check:
    GOOGLE_API_USE_MTLS=never
    GOOGLE_API_USE_CLIENT_CERTIFICATE=false
  3. Restart the application.

🧬 Code Styling & Guidelines

To maintain code quality and consistency:

  • Python (Backend): We adhere to high standards of formatting and linting.
    • We use black for code formatting.
    • We use pylint for static code analysis.
    • The repository maintains an 80% coverage threshold for core modules.
  • Frontend (TypeScript): Follows standard React and TypeScript guidelines, using Vite for development.

Running Checks Manually

To run the tests with coverage, or to lint and format the code:

# Run pytest with coverage
uv run pytest tests/demos/ --cov=demos/backend --cov-report=term-missing

# Run pre-commit hooks (runs black, pylint, etc.)
uv run pre-commit run --all-files

🤝 Contributing

We welcome contributions to Izumi! Whether it's new templates, features, bug fixes, or documentation improvements, your help is valued.

Branching Model

Please create feature branches and submit pull requests. Ensure your commits follow standard conventions and pass all pre-commit hooks.


⚖️ Responsible Use & Disclaimer

Building and deploying generative AI agents requires a commitment to responsible development practices. Izumi provides you the tools to build agents, but you must also provide the commitment to ethical and fair use of these agents. We encourage you to:

  • Start with a Risk Assessment: Before deploying your agent, identify potential risks related to bias, privacy, safety, and accuracy.
  • Document Your Process: Maintain detailed records of your development process, including data sources, models, configurations, and mitigation strategies.

⚖️ Disclaimer

Important

This is not an officially supported Google product. This repository is provided "as is" without any warranties, express or implied. Users assume all responsibility for the deployment, costs, and content generated by the AI agents. This project is not eligible for the Google Open Source Software Vulnerability Rewards Program.

Copyright 2026 Google LLC. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.


☁️ Cloud Deployments

To deploy these headless agents to serverless Google Cloud Run in high-compliance environments (IAP protection, Terraform auditing), refer to the Deployment Guide.


☄️ More to Come

Stay tuned for exciting future updates! We are planning a deep integration with GCC Creative Studio to merge the power of multi-agent video orchestration with Creative Studio's comprehensive GenAI platform.

🎉 Meet us at Cloud Next 2026! Our team will be attending Cloud Next 2026 and hosting a booth alongside Creative Studio. Feel free to come by to learn more! Connect with us on LinkedIn for updates.
