
Cosmos-Transfer1 is a world-to-world transfer model designed to bridge the perceptual divide between simulated and real-world environments.


nvidia-cosmos/cosmos-transfer1


NVIDIA Cosmos is a developer-first world foundation model platform designed to help Physical AI developers build their systems better and faster. Cosmos contains:

  1. Pre-trained models (available via Hugging Face) under the NVIDIA Open Model License, which permits free commercial use of the models.
  2. Training scripts under the Apache 2 License for post-training the models for various downstream Physical AI applications.

Key Features

Cosmos-Transfer1 is a pre-trained, diffusion-based conditional world model designed for multimodal, controllable world generation. It creates world simulations based on multiple spatial control inputs across various modalities, such as segmentation, depth, and edge maps. Cosmos-Transfer1 offers the flexibility to weight different conditional inputs differently at varying spatial locations and temporal instances, enabling highly customizable world generation. This capability is particularly useful for various world-to-world transfer applications, including Sim2Real.

The model is available via Hugging Face. The post-training scripts will be released soon!

Examples

The following command runs Cosmos-Transfer1 inference on a single GPU:

# Run inference on a single GPU (device 0).
export CUDA_VISIBLE_DEVICES=0
# Use ./checkpoints unless CHECKPOINT_DIR is already set.
export CHECKPOINT_DIR="${CHECKPOINT_DIR:=./checkpoints}"
CUDA_HOME="$CONDA_PREFIX" PYTHONPATH="$(pwd)" python cosmos_transfer1/diffusion/inference/transfer.py \
    --checkpoint_dir "$CHECKPOINT_DIR" \
    --video_save_folder outputs/robot_sample \
    --controlnet_specs assets/robot_sample_spec.json \
    --offload_text_encoder_model
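
Because the command reads a checkpoint directory and a control spec file, it can help to confirm both exist before launching. The following is a minimal sketch, not part of the repository: `preflight_check` is a hypothetical helper, and the paths are the ones used in the command above.

```shell
# Hypothetical helper (not part of the repository): fail early if an input
# that the inference command needs is missing.
preflight_check() {
    local checkpoint_dir="$1"
    local spec_file="$2"
    if [ ! -d "$checkpoint_dir" ]; then
        echo "Checkpoint directory not found: $checkpoint_dir" >&2
        return 1
    fi
    if [ ! -f "$spec_file" ]; then
        echo "Control spec not found: $spec_file" >&2
        return 1
    fi
    return 0
}

# Usage, with the paths from the single-GPU command:
if preflight_check "${CHECKPOINT_DIR:-./checkpoints}" assets/robot_sample_spec.json; then
    echo "Inputs found; ready to run transfer.py"
else
    echo "Fix the paths above before launching inference" >&2
fi
```

The check only verifies that the paths exist; it does not validate the contents of the checkpoints or the spec file.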

You can also run Cosmos-Transfer1 on multiple GPUs as follows:

# Use GPUs 0-3 unless CUDA_VISIBLE_DEVICES is already set.
export CUDA_VISIBLE_DEVICES="${CUDA_VISIBLE_DEVICES:=0,1,2,3}"
export CHECKPOINT_DIR="${CHECKPOINT_DIR:=./checkpoints}"
# NUM_GPU should match the number of devices listed in CUDA_VISIBLE_DEVICES.
export NUM_GPU="${NUM_GPU:=4}"
CUDA_HOME="$CONDA_PREFIX" PYTHONPATH="$(pwd)" torchrun --nproc_per_node="$NUM_GPU" --nnodes=1 --node_rank=0 cosmos_transfer1/diffusion/inference/transfer.py \
    --checkpoint_dir "$CHECKPOINT_DIR" \
    --video_save_folder outputs/example2_uniform_weights \
    --controlnet_specs assets/inference_cosmos_transfer1_uniform_weights.json \
    --offload_text_encoder_model \
    --num_gpus "$NUM_GPU"
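
The multi-GPU command sets `CUDA_VISIBLE_DEVICES` and `NUM_GPU` separately, so the two can drift apart. One way to keep them consistent is to derive the count from the device list. This is a sketch under that assumption; `count_visible_gpus` is a hypothetical helper, not something the repository provides.

```shell
# Hypothetical helper (not part of the repository): count the entries in a
# comma-separated device list such as "0,1,2,3".
count_visible_gpus() {
    local devices="${1:-}"
    if [ -z "$devices" ]; then
        echo 0
        return
    fi
    # Split on commas and count the resulting fields.
    local IFS=','
    set -- $devices
    echo "$#"
}

# Derive NUM_GPU from the device list instead of setting it by hand.
export CUDA_VISIBLE_DEVICES="${CUDA_VISIBLE_DEVICES:=0,1,2,3}"
export NUM_GPU="$(count_visible_gpus "$CUDA_VISIBLE_DEVICES")"
echo "Launching on $NUM_GPU GPU(s): $CUDA_VISIBLE_DEVICES"
```

With `NUM_GPU` set this way, the `torchrun` command can be used unchanged.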

Example videos: robot_sample_input.mp4 (input) and robot_sample_output.mp4 (output).

Model Family

| Model name | Description | Try it out | Supported Hardware |
|---|---|---|---|
| Cosmos-Transfer1-7B | World Generation with Adaptive Multimodal Control | Inference | 80GB H100 |
| Cosmos-Transfer1-7B-Sample-AV | Cosmos-Transfer1 for autonomous vehicle tasks | Inference | 80GB H100 |

License and Contact

This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.

NVIDIA Cosmos source code is released under the Apache 2 License.

NVIDIA Cosmos models are released under the NVIDIA Open Model License. For a custom license, please contact [email protected].
