StyleMaster: Stylize Your Video with Artistic Generation and Translation
Zixuan Ye1 †, Huijuan Huang2✉, Xintao Wang2, Pengfei Wan2, Di Zhang2, Wenhan Luo1✉
1 Hong Kong University of Science and Technology
2 Kuaishou Technology
† Intern at KwaiVGI, Kuaishou Technology
✉ Corresponding Author
- Implementation on videotuna
- Evaluation code and results
- Code of style extraction module
- Illusion dataset generation
- [2025.2] StyleMaster has been accepted by CVPR 2025!
- [2024.10] arXiv preprint is available.
Welcome to StyleMaster! StyleMaster focuses on style control, i.e., generating or translating a video to match the style of a given reference image. StyleMaster preserves local textures and enhances global style representations. Additionally, a motion adapter and a gray tile ControlNet are employed to improve motion quality and provide precise content guidance.
- Local Patch Selection: Overcomes content leakage in style transfer by selecting patches with less similarity to text prompts.
- Global Style Extraction: Uses a projection module after CLIP supervised by illusion datasets.
- Motion Adapter: Improves motion quality during inference and helps strengthen the degree of stylization.
- Gray Tile ControlNet: Provides accessible yet precise content guidance for video style transfer.
- High-Quality Video Generation: Generates videos with high style similarity to the reference image and achieves ideal translation results.
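The local patch selection idea in the list above can be illustrated with a minimal sketch. It is not the repository's implementation: the function names, the `keep_ratio` parameter, and the use of plain cosine similarity are assumptions made for illustration. It assumes each patch of the reference image and the text prompt have already been embedded into a shared image-text space, and it keeps the patches that are least similar to the prompt, since those are less likely to carry content.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_style_patches(patch_embeddings, text_embedding, keep_ratio=0.5):
    """Return indices of the patches least similar to the text prompt.

    Hypothetical helper: `keep_ratio` (the fraction of patches kept) is an
    illustrative parameter, not a value taken from StyleMaster.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if not patch_embeddings:
        return []
    sims = [cosine(p, text_embedding) for p in patch_embeddings]
    num_keep = max(1, math.ceil(len(sims) * keep_ratio))
    # Sort patch indices by ascending similarity and keep the lowest ones.
    order = sorted(range(len(sims)), key=lambda i: sims[i])
    # Return the kept indices in their original spatial order.
    return sorted(order[:num_keep])


# Toy 2-D embeddings: patches 0 and 2 align with the prompt, 1 and 3 do not.
patches = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
prompt = [1.0, 0.0]
kept = select_style_patches(patches, prompt, keep_ratio=0.5)
print(kept)
```

In this toy case the two patches pointing away from the prompt direction are kept, while the patches aligned with the prompt are dropped as likely content carriers.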
For illusion dataset generation, please refer to visual_anagrams/readme.md for details.
For the style extraction module, please refer to style_extraction for details. To run it:
cd style_extraction
python style_extraction_module.py
The complete results generated by our method and the other baselines are available on Google Drive.
@article{ye2024stylemaster,
title={StyleMaster: Stylize Your Video with Artistic Generation and Translation},
author={Ye, Zixuan and Huang, Huijuan and Wang, Xintao and Wan, Pengfei and Zhang, Di and Luo, Wenhan},
journal={arXiv preprint arXiv:2412.07744},
year={2024}
}