πŸ•΅πŸ»β€β™‚οΈ DLA DEEPFAKE DETECTION 2024/25 - UNICA

Deepfake examples

Deepfake Detection Project using the OpenForensics dataset


📑 Summary

  1. 🧑🏻‍🎓 Students
  2. 📌 Description
  3. 📥 Download the Dataset
  4. 📄 Documentation
  5. 🚀 Installation
  6. 🛠️ Test the DataLoader
  7. 🎯 Train the Model
  8. 📊 Evaluate the Model
  9. 📂 Project Structure
  10. 📊 Project Goals
  11. 🖥️ Hardware and Limitations
  12. 🤝 Contributions
  13. ❓ How to Cite

πŸ§‘πŸ»β€πŸŽ“ Students

Francesco Congiu

Student ID: 60/73/65300

E-Mail: [email protected]

Simone Giuffrida

Student ID: 60/73/65301

E-Mail: [email protected]

Fabio Littera

Student ID: 60/73/65310

E-Mail: [email protected]


📌 Description

This repository contains the code for training and evaluating deepfake detection models using the OpenForensics dataset. The project follows two approaches:

  1. Transfer Learning with pre-trained models (e.g., MobileNet, Xception).
  2. Training from Scratch with a custom neural network.
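
As a quick illustration of the transfer-learning approach, here is a minimal PyTorch sketch (assuming torchvision's pretrained MobileNetV2; the actual layer choices in scripts/train.py may differ):

import torch.nn as nn
from torchvision import models

# Transfer learning: reuse the pretrained ImageNet backbone and
# replace the classifier head with a binary (real/fake) output.
model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
for param in model.features.parameters():
    param.requires_grad = False  # freeze the convolutional backbone
model.classifier[1] = nn.Linear(model.last_channel, 2)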

📥 Download the Dataset

The OpenForensics dataset required for the project can be downloaded from the following link:
🔗 OpenForensics Dataset - Zenodo


📄 Documentation

Below are links to the full project documentation:

📂 Folder with all the documentation


🚀 Installation

To run the project locally, follow these steps:

1️⃣ Clone the Repository

Open the terminal and run:

git clone git@github.com:wakaflocka17/DLA_DEEPFAKEDETECTION.git
cd DLA_DEEPFAKEDETECTION

(Or, if using HTTPS)

git clone https://github.com/wakaflocka17/DLA_DEEPFAKEDETECTION.git
cd DLA_DEEPFAKEDETECTION

2️⃣ Create and Activate a Virtual Environment

It is recommended to create a virtual environment to isolate dependencies:

python3 -m venv openforensics_env
source openforensics_env/bin/activate  # macOS/Linux

(On Windows, use: openforensics_env\Scripts\activate)

3️⃣ Install Dependencies

Install all necessary libraries:

pip install -r requirements.txt

4️⃣ Set Up the Project Structure

First, make the script executable:

chmod +x setup_folders.sh

Then run it to create the required folders:

./setup_folders.sh

This will create:

DLA_DEEPFAKEDETECTION/
│── data/
│   ├── Train/
│   ├── Val/
│   ├── Test-Dev/
│   ├── Test-Challenge/
│   ├── dataset/
│
│── processed_data/
│   ├── Train/
│   │   ├── real/
│   │   ├── fake/
│   ├── Val/
│   │   ├── real/
│   │   ├── fake/
│   ├── Test-Dev/
│   │   ├── real/
│   │   ├── fake/
│   ├── Test-Challenge/
│   │   ├── real/
│   │   ├── fake/

5️⃣ Download the Dataset

To automatically download the OpenForensics dataset, use the provided script:

python3 scripts/download_dataset.py

💡 Ensure you have a stable internet connection, as the dataset is large (60GB+).
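
For a rough idea of what such a download script does, here is a minimal sketch (the Zenodo record ID is a placeholder, not the real one; scripts/download_dataset.py handles the actual URLs):

import urllib.request
from pathlib import Path

# Hypothetical file list; the real script resolves the actual
# Zenodo URLs of the OpenForensics archives.
FILES = ["https://zenodo.org/record/<RECORD_ID>/files/Train.zip"]

dest = Path("data/dataset")
dest.mkdir(parents=True, exist_ok=True)
for url in FILES:
    target = dest / url.rsplit("/", 1)[-1]
    print(f"Downloading {url} -> {target}")
    urllib.request.urlretrieve(url, target)  # large files: expect a long wait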

6️⃣ Move Images and JSON Files to Their Correct Directories

Now that all files have been extracted, we need to organize them into the correct dataset folders (Train, Val, Test-Dev, Test-Challenge). Run:

python3 scripts/extract_dataset.py

💡 This will:

  • Move training images to data/Train/Train/ and the corresponding Train_poly.json to data/Train/.
  • Move validation images to data/Val/Val/ and Val_poly.json to data/Val/.
  • Move test-dev images to data/Test-Dev/Test-Dev/ and Test-Dev_poly.json to data/Test-Dev/.
  • Move test-challenge images to data/Test-Challenge/Test-Challenge/ and Test-Challenge_poly.json to data/Test-Challenge/.

7️⃣ Delete Unnecessary ZIP Files

After extraction and organization, the original .zip files are no longer needed. Delete them using:

python3 scripts/delete_all_zips.py

💡 This will clean up the dataset directory, saving storage space.
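
For reference, the cleanup can be as simple as this sketch (assuming the archives live under data/dataset/):

from pathlib import Path

# Remove every leftover .zip under the dataset directory.
for zip_path in Path("data/dataset").rglob("*.zip"):
    print(f"Deleting {zip_path}")
    zip_path.unlink()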

8️⃣ Verify Installation

To check if everything works correctly, run:

python3 -c "import torch; print(torch.__version__)"
python3 -c "import cv2; print(cv2.__version__)"

If no errors appear, the setup is complete! 🎯


πŸ› οΈ Test the DataLoader

Before training, verify that the dataset is correctly loaded:

python3 scripts/dataloader.py --dataset Train --batch_size 32

πŸ’‘ This should display a batch of images and labels.
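
Under the hood, a loader over the processed real/fake folders can be built with torchvision's ImageFolder; a minimal sketch (the transforms in scripts/dataloader.py may differ):

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),  # a common input size for MobileNet-style models
    transforms.ToTensor(),
])

# processed_data/Train/ contains real/ and fake/ subfolders,
# so ImageFolder assigns class labels automatically.
dataset = datasets.ImageFolder("processed_data/Train", transform=transform)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

images, labels = next(iter(loader))
print(images.shape, labels[:8])  # e.g. torch.Size([32, 3, 224, 224])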

🎯 Train the Model

Train the model using either MobileNet or Xception:

✅ Train with MobileNet:

python3 scripts/train.py --model mobilenet

✅ Train with Xception:

python3 scripts/train.py --model xception

✅ Train with the custom network:

python3 scripts/train.py --model custom

💡 The trained model will be saved in the models/ directory.
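
For orientation, a compact version of such a training loop (a sketch building on the model and loader from the earlier snippets; the real scripts/train.py also handles logging, checkpointing, and model selection):

import torch
import torch.nn as nn

# Prefer Apple's MPS backend on the hardware listed below, else CPU.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
model = model.to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(10):  # epoch count is illustrative
    model.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

torch.save(model.state_dict(), "models/mobilenet.pth")  # filename is illustrative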

📊 Evaluate the Model

After training, evaluate the model on Test-Dev and Test-Challenge:

✅ Evaluate MobileNet on Test-Dev:

python3 scripts/evaluate.py --model mobilenet --dataset Test-Dev

✅ Evaluate MobileNet on Test-Challenge:

python3 scripts/evaluate.py --model mobilenet --dataset Test-Challenge

✅ Evaluate Xception on Test-Dev:

python3 scripts/evaluate.py --model xception --dataset Test-Dev

✅ Evaluate Xception on Test-Challenge:

python3 scripts/evaluate.py --model xception --dataset Test-Challenge

✅ Evaluate the custom network on Test-Dev:

python3 scripts/evaluate.py --model custom --dataset Test-Dev

✅ Evaluate the custom network on Test-Challenge:

python3 scripts/evaluate.py --model custom --dataset Test-Challenge

💡 The script will print Accuracy, Precision, Recall, and F1-score.
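
The metrics can be computed with scikit-learn; a minimal sketch of the evaluation step (assuming the model and device from the training sketch, and a test_loader built like the training loader):

import torch
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

model.eval()
y_true, y_pred = [], []
with torch.no_grad():
    for images, labels in test_loader:  # a DataLoader over Test-Dev or Test-Challenge
        outputs = model(images.to(device))
        y_pred.extend(outputs.argmax(dim=1).cpu().tolist())
        y_true.extend(labels.tolist())

print(f"Accuracy : {accuracy_score(y_true, y_pred):.4f}")
print(f"Precision: {precision_score(y_true, y_pred):.4f}")
print(f"Recall   : {recall_score(y_true, y_pred):.4f}")
print(f"F1-score : {f1_score(y_true, y_pred):.4f}")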


📂 Project Structure

DLA_DEEPFAKEDETECTION/
│── .github/            # Dependabot configuration
│── data/               # OpenForensics dataset (original, unmodified)
│   ├── Train/          # Training data
│   ├── Val/            # Evaluation data
│   ├── Test-Dev/       # Test-Dev data
│   ├── Test-Challenge/ # Test-Challenge data
│   ├── dataset/        # Where the original dataset archives are stored
│
│── processed_data/     # Preprocessing output (cropped faces)
│   ├── Train/
│   │   ├── real/       # Real faces extracted from the training set
│   │   ├── fake/       # Fake faces extracted from the training set
│   ├── Val/
│   │   ├── real/       # Real faces extracted for evaluation
│   │   ├── fake/       # Fake faces extracted for evaluation
│   ├── Test-Dev/
│   │   ├── real/       # Real faces extracted for Test-Dev
│   │   ├── fake/       # Fake faces extracted for Test-Dev
│   ├── Test-Challenge/
│   │   ├── real/       # Real faces extracted for Test-Challenge
│   │   ├── fake/       # Fake faces extracted for Test-Challenge
│
│── documentation/      # Documentation, reports, extra material
│── logs/               # Training logs (evaluation accuracy and training loss)
│── models/             # Saved models (e.g., .pth files)
│── scripts/            # Scripts (training, preprocessing, etc.)
│── notebooks/          # Jupyter notebooks for debugging and testing
│── utils/              # Generic utilities and support functions
│── requirements.txt    # Project dependencies
│── setup_folders.sh    # Script for automatic folder creation
│── README.md           # Project documentation

📊 Project Goals

✅ Face extraction from images using bounding boxes.
✅ Binary classification (fake/real) of extracted faces.
✅ Training with transfer learning using MobileNet or Xception.
✅ Development of a custom CNN for classification.
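
For the first goal, face crops are cut from the source images using the bounding boxes in the *_poly.json annotations; a rough sketch (the COCO-like field names used here are assumptions, check the dataset's JSON for the actual schema):

import json
from pathlib import Path
from PIL import Image

# Assumed COCO-like schema: each annotation carries an image id,
# an [x, y, width, height] bbox, and a real/fake category.
data = json.loads(Path("data/Train/Train_poly.json").read_text())
images = {img["id"]: img["file_name"] for img in data["images"]}

for ann in data["annotations"][:10]:  # first few crops as a smoke test
    x, y, w, h = ann["bbox"]
    img = Image.open(Path("data/Train/Train") / images[ann["image_id"]])
    face = img.crop((x, y, x + w, y + h))
    label = "fake" if ann["category_id"] == 1 else "real"  # mapping assumed
    face.save(f"processed_data/Train/{label}/{ann['id']}.jpg")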

🖥️ Hardware and Limitations

Note

The experiments were performed on a MacBook Pro (2024) with the following specifications:

  • Operating system: macOS Sonoma;
  • Processor: Apple M4 Pro;
  • GPU: Apple integrated GPU (M4 Pro);
  • RAM: 32 GB (unified memory);

Warning

Due to the size and computational complexity of the dataset, some experiments may run slowly or be difficult to execute on systems with fewer resources or less capable hardware.


🤝 Contributions

Feel free to contribute to the project! 💡

📌 How to Contribute

  1. Fork the repository.
  2. Create a new branch:
     git checkout -b new-feature
  3. Commit your changes:
     git commit -m "Add new feature"
  4. Push your changes:
     git push origin new-feature
  5. Open a Pull Request on GitHub.

❓ How to Cite

If you use this repository (or part of its code) for your research, a scholarly publication, or a project, please cite us. You can use the following BibTeX entry:

@misc{Deepfake-Project,
  author       = {Congiu, Francesco and Giuffrida, Simone and Littera, Fabio},
  title        = {Deepfake Detection Project using the OpenForensics dataset},
  howpublished = {\url{https://github.com/wakaflocka17/DLA_DEEPFAKEDETECTION}},
  year         = {2025}
}

Or, if you prefer not to use BibTeX, feel free to mention the authors and the link to the repository in the acknowledgments or bibliography of your paper.
