A modular implementation of "One Shot Face Swapping on Megapixels (CVPR 2021)" with enhanced debugging capabilities, comprehensive configuration management, and improved maintainability.
- Reference: zyainfal/One-Shot-Face-Swapping-on-Megapixels
- Paper: One Shot Face Swapping on Megapixels (arXiv:2105.04932)
- NVIDIA GPU (RTX 30xx/40xx or A100 recommended) with CUDA 11.x or 12.x drivers
- Python 3.10+
- Git
1. Clone the repository:

   ```bash
   git clone https://github.com/n01r1r/MegaFS.git
   cd MegaFS
   ```

2. Create a virtual environment (highly recommended):

   ```bash
   # Create the virtual environment
   python -m venv venv

   # Activate it
   # Windows:
   .\venv\Scripts\activate
   # macOS/Linux:
   source venv/bin/activate
   ```
3. Install dependencies:

   For CUDA 12.x:

   ```bash
   pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu121
   ```

   For CUDA 11.x:

   ```bash
   pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu118
   ```

   For CPU-only installation:

   ```bash
   pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cpu
   ```

   Why `--extra-index-url`? It makes pip pull PyTorch packages from the official PyTorch server, which resolves dependency conflicts with NumPy and other packages.

4. Verify the installation:

   ```bash
   python -c "import torch; print(f'PyTorch: {torch.__version__}'); print(f'CUDA available: {torch.cuda.is_available()}')"
   ```
```
MegaFS/
├── models/                   # Core model implementations
│   ├── megafs.py             # Main MegaFS class (with gradient support)
│   ├── hierfe.py             # Hierarchical Region Feature Encoder
│   ├── face_transfer.py      # Face transfer modules (FTM, Injection, LCR)
│   ├── stylegan2.py          # StyleGAN2 generator
│   ├── model_factory.py      # Model creation factory
│   └── weight_loaders.py     # Weight loading utilities
├── data/                     # Dataset integration (NEW)
│   ├── __init__.py
│   └── face_swap_dataset.py  # PyTorch Dataset with train/val/test splits
├── training/                 # Training infrastructure (NEW)
│   ├── __init__.py
│   ├── trainer.py            # Base trainer for experiments
│   └── example_experiment.py # Example experiment template
├── configs/                  # Configuration files (NEW)
│   ├── default.yaml          # Default configuration
│   └── experiment.yaml       # Experiment configuration
├── utils/                    # Utility modules
│   ├── data_utils.py         # Data management and mapping
│   ├── image_utils.py        # Image processing utilities
│   ├── debug_utils.py        # Debugging and profiling tools
│   └── metrics.py            # Image similarity evaluation metrics
├── config.py                 # Configuration management (with YAML support)
├── create_datamap.py         # Dataset mapping utility
├── MegaFS.ipynb              # Interactive Colab notebook
├── MegaFS_Evaluation.ipynb   # Image similarity evaluation notebook
└── requirements.txt          # Python dependencies
```
1. Open the notebook: `MegaFS.ipynb`
2. Upload your dataset to Google Drive:
   - Upload `celeba_mask_hq.zip` to `/content/drive/MyDrive/Datasets/`
3. Upload weight files to Google Drive:
   - Place all weight files in `/content/drive/MyDrive/Datasets/weights/`
4. Run the notebook: everything is set up automatically
```bash
# Basic usage with default settings
python run_local.py

# Run with custom IDs
python run_local.py --src-id 100 --tgt-id 200

# Use a different swap method
python run_local.py --swap-type injection

# Custom dataset and weights paths
python run_local.py --dataset-root ./my_dataset --weights-dir ./my_weights

# Enable gradients for experiments
python run_local.py --enable-grads

# See all options
python run_local.py --help
```

Key options:

- `--src-id`, `--tgt-id`: Source and target image IDs
- `--swap-type`: Swap method (`ftm`, `injection`, `lcr`)
- `--dataset-root`: Path to dataset (default: `./dataset/CelebAMask-HQ`)
- `--weights-dir`: Path to weights (default: `./weights`)
- `--output-dir`: Output directory (default: `./outputs`)
- `--no-refine`: Faster processing without refinement
- `--enable-grads`: Enable gradients for experiments
```python
from config import DEFAULT_CONFIGS
from models.megafs import MegaFS

# Use a predefined configuration
config = DEFAULT_CONFIGS["local"]  # or "colab" for the Colab environment

# Initialize MegaFS with the configuration
megafs = MegaFS(
    config=config,
    debug=True  # Enable debug logging
)

# Run face swap
result_path, result_image = megafs.run(
    src_idx=100,
    tgt_idx=200,
    refine=True,
    save_path="result.jpg"
)
```

The framework includes comprehensive image similarity evaluation capabilities with multiple metrics:
- LPIPS: Learned Perceptual Image Patch Similarity (lower is better)
- PSNR: Peak Signal-to-Noise Ratio (higher is better)
- SSIM: Structural Similarity Index (higher is better, range [0,1])
- MSE: Mean Squared Error (lower is better)
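For intuition about the pixel-level metrics, here is a small self-contained NumPy sketch of MSE and the PSNR derived from it (LPIPS and SSIM require dedicated libraries such as `lpips` and `scikit-image`, so they are omitted; the repository's own implementations live in `utils/metrics.py`):

```python
import numpy as np

def mse(a: np.ndarray, b: np.ndarray) -> float:
    """Mean Squared Error between two images (lower is better)."""
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

def psnr(a: np.ndarray, b: np.ndarray, max_val: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB, derived from MSE (higher is better)."""
    err = mse(a, b)
    if err == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / err))

# A uniform offset of 16 gray levels gives MSE = 16^2 = 256
img = np.zeros((8, 8), dtype=np.uint8)
noisy = img + 16
print(mse(img, noisy))   # 256.0
print(psnr(img, noisy))  # ≈ 24.05 dB
```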
Use `MegaFS_Evaluation.ipynb` for comprehensive evaluation:
- Open the evaluation notebook in Google Colab
- Upload your dataset and weight files to Google Drive
- Configure evaluation parameters:
- Evaluation size (number of image pairs)
- Swap methods to compare (FTM, Injection, LCR)
- Refinement settings
- Run evaluation - automatically processes all methods
- View results - statistical analysis and visualizations
```python
from utils.metrics import ImageMetrics, FaceSwapEvaluator
from models.megafs import MegaFS

# Initialize evaluator
evaluator = FaceSwapEvaluator(use_gpu=True)

# Run face swap evaluation
results = evaluator.evaluate_pair(source_img, target_img, swapped_img, refined_img)

# Calculate statistics across multiple results
stats = evaluator.calculate_statistics(all_results)
```

```python
# Evaluate multiple image pairs
batch_results = run_batch_evaluation(
    handler_instance=megafs_handler,
    id_pairs=[(100, 200), (300, 400), (500, 600)],
    refine=True,
    max_pairs=50
)

# Generate comprehensive statistics
statistics = evaluator.calculate_statistics(batch_results)
```

The framework now supports gradient-based experiments for research purposes, including adversarial attacks with improved visual quality.
The adversarial attack implementation uses three loss components for imperceptible perturbations:
- L_ID (Identity Destruction): Minimizes cosine similarity to destroy identity in face region
- L_SIM (Similarity Preservation): Maintains visual similarity using LPIPS perceptual loss (recommended) or MSE/L1
- L_TV (Total Variation): Encourages smooth perturbations by minimizing adjacent pixel differences
Total Loss = λ_id × L_ID + λ_sim × L_SIM + λ_tv × L_TV
Note: L_SEM (Semantic Collapse) has been removed to improve visual quality while maintaining attack effectiveness.
- LPIPS Support: Perceptual similarity loss for better visual quality preservation
- Total Variation Loss: Smooth perturbations without high-frequency noise
- Early Stopping: Automatic termination when L_ID < 0.2 threshold is reached
- Optimized Hyperparameters: Reduced iterations (300) and epsilon (8.0) for faster, cleaner attacks
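The loss combination above can be sketched in PyTorch as follows. This is a minimal illustration, not the repository's API: the function and argument names are invented here, and the similarity term (LPIPS or MSE/L1) is assumed to be precomputed and passed in.

```python
import torch
import torch.nn.functional as F

def total_variation(delta: torch.Tensor) -> torch.Tensor:
    """L_TV: mean absolute difference between adjacent pixels, shape (B, C, H, W)."""
    tv_h = (delta[:, :, 1:, :] - delta[:, :, :-1, :]).abs().mean()
    tv_w = (delta[:, :, :, 1:] - delta[:, :, :, :-1]).abs().mean()
    return tv_h + tv_w

def attack_loss(id_orig, id_adv, sim_loss, delta,
                lam_id=1.0, lam_sim=1.0, lam_tv=1.0):
    """Weighted sum lam_id*L_ID + lam_sim*L_SIM + lam_tv*L_TV.

    id_orig / id_adv are identity embeddings; sim_loss is the
    precomputed LPIPS (or MSE/L1) term; delta is the perturbation
    being optimized.
    """
    # L_ID: cosine similarity of the embeddings -- minimizing it
    # pushes the adversarial identity away from the original
    l_id = F.cosine_similarity(id_orig, id_adv, dim=-1).mean()
    return lam_id * l_id + lam_sim * sim_loss + lam_tv * total_variation(delta)
```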
```python
from config import Config
from models.megafs import MegaFS
import torch

# Load configuration from YAML
config = Config.from_yaml('configs/experiment.yaml')

# Initialize with device and gradients
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = MegaFS(config=config, enable_grads=True, device=device)

# Use standard PyTorch methods
model.train()  # Enable gradients
model.eval()   # Disable gradients
```
```python
from data import create_dataloaders
from models.megafs import MegaFS
import torch

# Create dataloaders with train/val/test splits
# Option 1: Use data_map.json (recommended for CelebA-HQ)
dataloaders = create_dataloaders(
    dataset_root='./dataset/CelebAMask-HQ',
    data_map_path='./data_map.json',
    batch_size=8,
    num_workers=4
)

# Option 2: Auto-discover from the folder structure (for custom datasets)
dataloaders_custom = create_dataloaders(
    dataset_root='./my_custom_images',
    use_data_map=False,  # No data_map.json needed
    batch_size=8
)

# Use in experiments
device = 'cuda' if torch.cuda.is_available() else 'cpu'
for batch in dataloaders['train']:
    source = batch['source'].to(device).requires_grad_(True)
    target = batch['target'].to(device)

    # Forward pass with gradients
    output = model.forward(source, target)

    # Compute loss and backpropagate
    loss = your_loss_function(output, target)
    loss.backward()
```
```python
from training import BaseTrainer
from training.example_experiment import ExampleExperiment

# Extend BaseTrainer for custom experiments
class MyExperiment(BaseTrainer):
    def compute_loss(self, source, target, batch):
        output = self.model.forward(source, target)
        return your_custom_loss(output, target)

# Create trainer
trainer = MyExperiment(
    model=model,
    dataloaders=dataloaders,
    device='cuda'
)

# Run training
trainer.fit(num_epochs=10)

# Test
test_metrics = trainer.test()
```
Create `configs/experiment.yaml`:

```yaml
# Experiment configuration
swap_type: ftm
dataset_root: ./dataset/CelebAMask-HQ
checkpoint_dir: ./weights

experiment:
  enable_grads: true  # Enable gradients
  batch_size: 4
  num_workers: 2
  seed: 42

data_split:
  train: 0.7
  val: 0.15
  test: 0.15
```

Load the configuration:

```python
config = Config.from_yaml('configs/experiment.yaml')
```

The modular configuration system supports multiple environments:
```python
from config import Config, DEFAULT_CONFIGS

# Use predefined configurations
config = DEFAULT_CONFIGS["local"]  # Local development
config = DEFAULT_CONFIGS["colab"]  # Google Colab

# Or create a custom configuration
config = Config(
    swap_type="ftm",  # "ftm", "injection", or "lcr"
    dataset_root="./CelebAMask-HQ",
    img_root="./CelebAMask-HQ/CelebA-HQ-img",
    mask_root="./CelebAMask-HQ/CelebAMask-HQ-mask-anno",
    checkpoint_dir="./weights"
)
```

- CelebA-HQ: High-quality face images
  - Structure: `CelebA-HQ-img/<id>.jpg`
- CelebAMask-HQ: Segmentation masks
  - Structure: `CelebAMask-HQ-mask-anno/*/<id>_*.png`
The codebase uses a data mapping system for robust path resolution. The `DataMapManager` class handles automatic path resolution for images and masks:

```python
from utils.data_utils import DataMapManager

# Initialize data manager
data_manager = DataMapManager("data_map.json")

# Resolve paths for specific IDs
image_path, mask_path = data_manager.resolve_paths_for_id(100, dataset_root)
```

Generate the dataset mapping:

```bash
# Run from the dataset root directory
python create_datamap.py
```

This creates `data_map.json` with automatic path mapping that the MegaFS class uses internally.
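For intuition, here is a minimal sketch of the kind of ID-to-path mapping such a file holds. This is a hypothetical simplification: the real `create_datamap.py` also indexes the per-region mask files, and its exact JSON layout may differ.

```python
import json
import os
import tempfile

def build_data_map(img_root: str) -> dict:
    """Map each numeric image ID to a relative image path.

    Hypothetical simplification: the real script also indexes the
    mask files under CelebAMask-HQ-mask-anno.
    """
    data_map = {}
    for name in sorted(os.listdir(img_root)):
        stem, ext = os.path.splitext(name)
        if ext.lower() == ".jpg" and stem.isdigit():
            data_map[int(stem)] = os.path.join("CelebA-HQ-img", name)
    return data_map

# Demonstrate on a throwaway directory
with tempfile.TemporaryDirectory() as d:
    for i in (0, 1, 7):
        open(os.path.join(d, f"{i}.jpg"), "w").close()
    open(os.path.join(d, "notes.txt"), "w").close()  # non-image, ignored
    m = build_data_map(d)
    print(json.dumps({str(k): v for k, v in sorted(m.items())}, indent=2))
```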
Place the following weight files in the `weights/` directory:

- MegaFS checkpoints (`{swap_type}_final.pth`):
  - `ftm_final.pth`
  - `injection_final.pth`
  - `lcr_final.pth`
- StyleGAN2 generator: `stylegan2-ffhq-config-f.pth`
Note: Weight files are not included. Obtain from official sources or train your own models.
1. HieRFE (Hierarchical Region Feature Encoder)
   - ResNet50 backbone with FPN
   - Multi-scale feature extraction
   - StyleMapping layers for latent generation
2. FaceTransferModule
   - FTM: Transfer Cell with multiple blocks
   - Injection: ID injection with normalization
   - LCR: Latent Code Regularization
3. StyleGAN2 Generator
   - High-resolution face synthesis
   - 1024x1024 output resolution
   - 18-layer W+ latent space (18 x 512)
- Preprocessing: Load and resize images to 256x256
- Encoding: Extract hierarchical features with HieRFE
- Transfer: Apply face transfer using selected method
- Generation: Synthesize high-resolution result with StyleGAN2
- Postprocessing: Apply mask blending and refinement
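The five stages above can be sketched as one function, with each stage as a pluggable callable. The names below are illustrative of the data flow only, not the repository's actual API:

```python
from typing import Any, Callable

def face_swap_pipeline(src: Any, tgt: Any,
                       preprocess: Callable, encode: Callable,
                       transfer: Callable, generate: Callable,
                       postprocess: Callable) -> Any:
    """Schematic five-stage flow; each stage is a pluggable callable."""
    src, tgt = preprocess(src), preprocess(tgt)  # 1. load/resize to 256x256
    z_src, z_tgt = encode(src), encode(tgt)      # 2. HieRFE latents
    z_swap = transfer(z_src, z_tgt)              # 3. FTM / Injection / LCR
    raw = generate(z_swap)                       # 4. StyleGAN2 synthesis
    return postprocess(raw, tgt)                 # 5. mask blending / refinement

# Toy run with identity stand-ins to show the data flow
result = face_swap_pipeline(
    "src", "tgt",
    preprocess=lambda x: x,
    encode=lambda x: x,
    transfer=lambda a, b: (a, b),
    generate=lambda z: z,
    postprocess=lambda r, t: (r, t),
)
print(result)  # (('src', 'tgt'), 'tgt')
```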
The modular design includes comprehensive debugging tools:
```python
# Enable debug logging
megafs = MegaFS(config=config, debug=True)

# Access debug utilities
megafs.debug_logger.log("Custom message")
megafs.profiler.start_timer("operation")
# ... perform operation ...
duration = megafs.profiler.end_timer("operation")
```
```python
# Basic face swap
result_path, result_image = megafs.run(
    src_idx=100,   # Source image ID
    tgt_idx=200,   # Target image ID
    refine=True,   # Apply refinement
    save_path="swap_result.jpg"
)
```
```python
# Process multiple pairs
pairs = [(100, 200), (300, 400), (500, 600)]
for src_id, tgt_id in pairs:
    result_path, result_image = megafs.run(
        src_idx=src_id,
        tgt_idx=tgt_id,
        refine=True
    )
```
```python
# Advanced configuration
from config import Config

config = Config(
    swap_type="injection",
    dataset_root="/path/to/dataset",
    img_root="/path/to/images",
    mask_root="/path/to/masks",
    checkpoint_dir="/path/to/weights"
)

megafs = MegaFS(config=config, debug=True)
```

This is an unofficial implementation focused on modularity and maintainability. Contributions are welcome for:
- Bug fixes and improvements
- Additional swap methods
- Performance optimizations
- Documentation enhancements
- Method: Based on CVPR 2021 paper "One Shot Face Swapping on Megapixels"
- Datasets: CelebA-HQ is non-commercial; follow original licenses
- Usage: Research and educational purposes only
- Compliance: Ensure adherence to original dataset and model licenses
- Original paper authors and the reference implementation
- StyleGAN2 authors for the generator architecture
- CelebA-HQ and CelebAMask-HQ dataset creators