GeneFace v1.1.0, pretrained models and binarized datasets

@yerfor yerfor released this 16 Mar 06:42
· 39 commits to main since this release

Welcome to GeneFace v1.1.0

We have made GeneFace more practical for industrial use!

What's new in this release:

  1. We implement a RAD-NeRF renderer, which can infer in real time and be trained in about 10 hours.
  2. We switch to a PyTorch-based deep3d_recon module to extract 3DMM coefficients; it is easier to install and about 8x faster than the previous TensorFlow-based version.
  3. We provide a pitch-aware audio2motion module, which generates more accurately lip-synced landmarks.
  4. We fix several bugs that caused excessive memory usage.
  5. We will upload a paper describing this release soon.

We release the pre-trained models of GeneFace:

  • lrs3.zip includes the models trained on the LRS3-TED dataset (a lm3d_vae_sync that performs the audio-to-motion transform and a syncnet that measures lip-sync quality). These models are generic across all target-person videos.
  • May.zip includes the models trained on the May.mp4 target-person video (a lm3d_postnet_sync for refining the predicted 3D landmarks, a lm3d_radnerf for rendering the head image, and a lm3d_radnerf_torso for rendering the torso). You need to train these three models for each target-person video.
  • How to use the pretrained models: unzip lrs3.zip and May.zip into the checkpoints directory, then follow the inference command lines in README.md.
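The setup step above can be sketched as a few shell commands. This is a minimal sketch, not an official script: it assumes the two archives have been downloaded into the current directory and that each one unpacks into its own subfolder under checkpoints/ (the exact internal layout may differ).

```shell
# Create the checkpoints directory the inference commands expect.
mkdir -p checkpoints

# Unzip each downloaded release asset into checkpoints/.
# The existence check lets the script run even if an archive
# has not been downloaded yet.
for f in lrs3.zip May.zip; do
  if [ -f "$f" ]; then
    unzip -o "$f" -d checkpoints/
  fi
done
```

After this, run the inference command lines from README.md, which look up the model files under checkpoints/.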

🔥 We also release the binarized datasets:

  • LRS3-TED: We provide the processed LRS3 dataset on Google Drive. Download links: Part1, Part2.
  • Baiduyun Disk mirror for LRS3-TED: link, password lrs3