
OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering

Update

April 30: The model weights are released. The dataset is also available on Google Drive; see below for details.

April 4: The preprocessed dataset is released; please see the Data preparation section. Some missing files have also been uploaded.

Get started

Environment Setup

git clone git@github.com:theEricMa/OTAvatar.git
cd OTAvatar
conda env create -f environment.yml
conda activate otavatar
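
Optionally, verify that the environment resolved a CUDA-enabled PyTorch build before going further:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"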

Pre-trained Models

Download the EG3D FFHQ model ffhqrebalanced512-64.pth [Baidu Netdisk][Google Drive] and copy it to the pretrained directory. It is the ffhqrebalanced512-64.pkl file obtained from the EG3D webpage, converted to .pth format using the pkl2pth script.
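
For reference, the conversion can be invoked roughly as below. This is a hypothetical command line: the actual script name and flags may differ, so check the pkl2pth script in this repo for the real interface.

# Hypothetical invocation; the script name and flags are assumptions.
python pkl2pth.py \
    --input ffhqrebalanced512-64.pkl \
    --output pretrained/ffhqrebalanced512-64.pth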

Download arcface_resnet18.pth and save it to the pretrained directory.
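
After both downloads, the pretrained directory should contain the two checkpoints used above:

mkdir -p pretrained
# Expected contents:
#   pretrained/ffhqrebalanced512-64.pth
#   pretrained/arcface_resnet18.pth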

Data preparation

We upload the processed dataset hdtf_lmdb_inv to [Baidu Netdisk][Google Drive]. In the repository root, run:

mkdir datasets
mv <your hdtf_lmdb_inv path> datasets/
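
As a quick sanity check, the LMDB data should now be visible under datasets/:

ls datasets/hdtf_lmdb_inv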

The processing scripts are largely a mixture of those in PIRenderer and ADNeRF. We plan to open a separate repository for our revised preprocessing scripts.

Face Animation

Create the folder result/otavatar if it does not exist. Place the model downloaded from [Baidu Netdisk][Google Drive] under this directory. Then run:

export CUDA_VISIBLE_DEVICES=0
python -m torch.distributed.launch --nproc_per_node=1 --master_port 12345 inference_refine_1D_cam.py \
--config ./config/otavatar.yaml \
--name otavatar \
--no_resume \
--which_iter 2000 \
--image_size 512 \
--ws_plus \
--cross_id \
--cross_id_target WRA_EricCantor_000 \
--output_dir ./result/otavatar/evaluation/cross_ws_plus_WRA_EricCantor_000

This animates each identity with the motion from WRA_EricCantor_000.
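
The rendered results are written to the --output_dir given above:

ls result/otavatar/evaluation/cross_ws_plus_WRA_EricCantor_000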

Or simply run:

sh scripts/inference.sh

Start Training

Run:

export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m torch.distributed.launch --nproc_per_node=4 --master_port 12346 train_inversion.py \
--config ./config/otavatar.yaml \
--name otavatar
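
The command above assumes four visible GPUs. To train on fewer, keep CUDA_VISIBLE_DEVICES and --nproc_per_node in sync; a single-GPU sketch (untested; assumes one process suffices, as in the inference command):

# Single-GPU variant; assumes --nproc_per_node only needs to match the visible GPU count.
export CUDA_VISIBLE_DEVICES=0
python -m torch.distributed.launch --nproc_per_node=1 --master_port 12346 train_inversion.py \
    --config ./config/otavatar.yaml \
    --name otavatar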

Or simply run:

sh scripts/train.sh

Acknowledgement

We appreciate the models and code from EG3D, PIRenderer, StyleHEAT, and EG3D-projector.

Citation

If you find this work helpful, please cite:

@article{ma2023otavatar,
  title={OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering},
  author={Ma, Zhiyuan and Zhu, Xiangyu and Qi, Guojun and Lei, Zhen and Zhang, Lei},
  journal={arXiv preprint arXiv:2303.14662},
  year={2023}
}