This is a clean, opinionated and minimalist implementation of the training and evaluation protocols for a fancy new
self-supervised learning (SSL) method called Maximum Manifold Capacity Representations (MMCR). MMCR was introduced
in the paper Efficient Coding of Natural Images using Maximum Manifold Capacity Representations.
This repository is not a strict reimplementation of the experiments from the paper, but rather a playground
for trying out this new SSL approach. If you want to check out the official implementation, please
see this repository.
Currently only the `resnet18` model and the `cifar10` dataset are supported, but it should be rather easy to add
more models and datasets.
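At its core, MMCR encourages the augmented views of each image to collapse onto a compact manifold while spreading the per-image centroids apart, which is done by maximizing the nuclear norm of the centroid matrix. Below is a minimal sketch of that objective as I understand it from the paper; the function and tensor names are mine, not this repository's API.

```python
import torch
import torch.nn.functional as F


def mmcr_loss(embeddings: torch.Tensor) -> torch.Tensor:
    # embeddings: [batch_size, num_views, projection_dim] output of the projection head.
    # Project every view embedding onto the unit sphere.
    embeddings = F.normalize(embeddings, dim=-1)
    # Centroid of each image's manifold of augmented views: [batch_size, projection_dim].
    centroids = embeddings.mean(dim=1)
    # Maximizing the nuclear norm of the centroid matrix spreads the centroids apart,
    # so the loss is its negative.
    return -torch.linalg.matrix_norm(centroids, ord="nuc")
```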
You will need at least Python 3.10.
```bash
# Download repository
git clone git@github.com:bartoszzuk/mmcr.git
cd mmcr

# [Optional] Setup virtual environment
python3 -m venv venv
source venv/bin/activate

# Install requirements
pip install -r requirements.txt
```
Just use the `pretrain.py` script as shown below.
```bash
python pretrain.py \
    --dataset cifar10        # path to the dataset root
    --batch-size 64          # number of images per step
    --num-views 16           # number of augmented views per image
    --max-epochs 500         # number of training epochs
    --learning-rate 0.001    # learning rate for Adam optimizer
    --warmup-duration 0.1    # warmup duration for cosine scheduler
    --projection-dim 128     # dimension for output of projection head
    --num-neighbours 200     # number of neighbours to use for KNN evaluation
    --temperature 0.5        # temperature to use for KNN evaluation
```
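The `--num-neighbours` and `--temperature` flags control the KNN evaluation run on top of the frozen features during pretraining. A temperature-weighted KNN probe of the kind commonly used in SSL repositories is sketched below; the names and the exact weighting are my assumptions, not necessarily what `pretrain.py` does.

```python
import torch
import torch.nn.functional as F


def knn_predict(query: torch.Tensor,        # [num_queries, dim] test embeddings
                bank: torch.Tensor,         # [num_train, dim] train embeddings
                bank_labels: torch.Tensor,  # [num_train] integer class labels
                num_classes: int,
                num_neighbours: int = 200,
                temperature: float = 0.5) -> torch.Tensor:
    # Cosine similarity between every query and every bank embedding.
    query, bank = F.normalize(query, dim=1), F.normalize(bank, dim=1)
    similarity = query @ bank.T                                # [num_queries, num_train]
    weights, indices = similarity.topk(num_neighbours, dim=1)  # nearest neighbours
    weights = (weights / temperature).exp()                    # temperature-scaled weights
    labels = bank_labels[indices]                              # [num_queries, num_neighbours]
    # Weighted vote: accumulate neighbour weights per class and pick the best one.
    votes = torch.zeros(query.size(0), num_classes, device=query.device)
    votes.scatter_add_(1, labels, weights)
    return votes.argmax(dim=1)
```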
Some details worth noting:
- The effective batch size is computed as `batch_size * num_views`.
- We use a cosine scheduler with warmup instead of keeping the learning rate constant like in the original paper.
- We use `GaussianBlur` and `RandomSolarize` as additional augmentation methods (a sketch of such a pipeline follows this list).
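For reference, a multi-view augmentation pipeline with `GaussianBlur` and `RandomSolarize` for 32x32 CIFAR-10 images could look roughly like the sketch below. The exact augmentation parameters are guesses, not the values used in this repository.

```python
import torch
from torchvision import transforms

# SimCLR-style augmentations plus GaussianBlur and RandomSolarize.
# All parameters here are illustrative guesses.
augment = transforms.Compose([
    transforms.RandomResizedCrop(32, scale=(0.2, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.RandomApply([transforms.GaussianBlur(kernel_size=3)], p=0.5),
    transforms.RandomSolarize(threshold=128, p=0.2),  # applied to PIL images, hence 0-255 threshold
    transforms.ToTensor(),
])


class MultiViewTransform:
    """Return `num_views` independently augmented views of a single image."""

    def __init__(self, transform, num_views: int = 16) -> None:
        self.transform = transform
        self.num_views = num_views

    def __call__(self, image) -> torch.Tensor:
        return torch.stack([self.transform(image) for _ in range(self.num_views)])
```

A batch produced from such a transform has shape `[batch_size, num_views, 3, 32, 32]`, which is where the `batch_size * num_views` effective batch size comes from.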
Just use the `evaluate.py` script as shown below. It will train a single linear layer on top of the frozen pretrained encoder.
```bash
python evaluate.py \
    --checkpoint [YOUR_PRETRAINED_MODEL_CHECKPOINT]
    --dataset cifar10        # path to the dataset root
    --batch-size 1024        # number of images per step
    --max-epochs 50          # number of training epochs
    --learning-rate 0.01     # learning rate for Adam optimizer
    --warmup-duration 0.1    # warmup duration for cosine scheduler
```
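Conceptually, the linear evaluation boils down to something like the sketch below: freeze the encoder, attach a single linear layer, and train only that layer with cross-entropy. `linear_probe` is an illustrative function, not this repository's API; it assumes the encoder returns 512-dimensional ResNet18 features.

```python
import torch
from torch import nn


def linear_probe(encoder: nn.Module, train_loader, num_classes: int = 10) -> nn.Linear:
    # Freeze the pretrained backbone; only the linear head is trained.
    encoder.eval()
    for parameter in encoder.parameters():
        parameter.requires_grad = False

    linear = nn.Linear(512, num_classes)     # 512 = ResNet18 feature dimension
    optimizer = torch.optim.Adam(linear.parameters(), lr=0.01)
    criterion = nn.CrossEntropyLoss()

    for images, labels in train_loader:
        with torch.no_grad():
            features = encoder(images)       # frozen features, no gradients
        loss = criterion(linear(features), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    return linear
```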
Some details worth noting:
- We use a cosine scheduler with warmup instead of just a cosine scheduler like in the original paper (a sketch of this schedule follows this list).
- We use `GaussianBlur` and `RandomSolarize` as additional augmentation methods.
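The warmup-then-cosine schedule mentioned above can be built with a plain `LambdaLR`. Below is a minimal sketch, assuming `--warmup-duration` is interpreted as a fraction of the total number of training steps (my assumption, not necessarily how the scripts define it).

```python
import math

import torch


def warmup_cosine_schedule(optimizer: torch.optim.Optimizer,
                           total_steps: int,
                           warmup_duration: float = 0.1) -> torch.optim.lr_scheduler.LambdaLR:
    # Linear warmup for the first fraction of training, then cosine decay to zero.
    warmup_steps = int(total_steps * warmup_duration)

    def factor(step: int) -> float:
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        return 0.5 * (1.0 + math.cos(math.pi * progress))

    return torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=factor)
```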
I ran both the `pretrain.py` and `evaluate.py` scripts using the values shown in the examples above. I did not
conduct a proper hyperparameter optimization, I just used the values that felt good 😬.
| Stage | Model | Dataset | Top1 Accuracy | Top5 Accuracy |
|---|---|---|---|---|
| Pretrain | ResNet18 | CIFAR-10 | 86.32 | 99.39 |
| Linear Evaluate (Ours) | ResNet18 | CIFAR-10 | 89.19 | 99.61 |
| Linear Evaluate (Paper) | ResNet50 | CIFAR-10 | 93.53 | - |
The results from the paper are shown just for reference. Since we are using a smaller model and different hyperparameters, they are not directly comparable.