Merge pull request #6 from boomb0om/dev
Installation fix
Showing 2 changed files with 53 additions and 14 deletions.
**README.md**
@@ -3,7 +3,7 @@
This project aims to unify the evaluation of generative text-to-image models and provide the ability to quickly and easily calculate the most popular metrics.

Goals of this benchmark:
-- **Unified** metrics and datasets for all models
+- **Unified** metrics and datasets for all text-to-image models
- **Reproducible** results
- **User-friendly** interface for the most popular metrics: FID and CLIP-score
@@ -17,6 +17,7 @@ Goals of this benchmark:
- [Examples](#examples)
- [Documentation](#documentation)
- [Contribution](#contribution)
+- [TO-DO](#to-do)
- [Contacts](#contacts)
- [Citing](#citing)
- [Acknowledgments](#acknowledgments)
@@ -25,8 +26,8 @@ Goals of this benchmark:
Generative text-to-image models have become a popular and widely used tool.
There are many articles on text-to-image generation that present new, more advanced models.
-However, there is still no uniform way to measure the quality of such models.
-To address this issue, we provide an implementation of metrics to compare the quality of generative models.
+**However, there is still no uniform way to measure the quality of such models.**
+To address this issue, we provide an implementation of metrics and a dataset to compare the quality of generative models.

We propose using the MS-COCO FID-30K metric together with OpenAI's CLIP score, which has already become a standard for measuring the quality of text-to-image models.
We provide the MS-COCO validation subset and precalculated metrics for it.
@@ -38,18 +39,19 @@ You can easily contribute your model into benchmark and make FID results reprodu
- Standardized FID calculation: fixed image preprocessing and InceptionV3 model.
- FID-30K on the MS-COCO validation set: we provide the dataset on [huggingface🤗](https://huggingface.co/datasets/stasstaf/MS-COCO-validation), [precomputed FID stats](https://github.com/boomb0om/text2image-benchmark/releases/download/v0.0.1/MS-COCO_val2014_fid_stats.npz), and the fixed [30000 captions from MS-COCO](https://github.com/boomb0om/text2image-benchmark/releases/download/v0.0.1/MS-COCO_val2014_30k_captions.csv) that should be used to generate images
- Implementations of different popular text-to-image models to make metrics **reproducible**
- CLIP-score calculation
- User-friendly metrics calculation (check out [Getting started](#getting-started))
## Installation

```bash
pip install git+https://github.com/openai/CLIP.git
pip install git+https://github.com/boomb0om/text2image-benchmark
```

## Getting started

### Metrics: FID

Calculate FID for two sets of images:
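A minimal sketch of this usage, assuming `calculate_fid` accepts two local image directories and returns the FID value plus auxiliary data, mirroring the `calculate_fid(path, stats)` call removed in the diff below (both paths are placeholders):

```python
from T2IBenchmark import calculate_fid

# Assumption: calculate_fid can compare two image folders directly
# and returns (fid_value, extra_data).
fid, fid_data = calculate_fid(
    'path/to/real/images/',       # placeholder path
    'path/to/generated/images/'   # placeholder path
)
print(f"FID: {fid}")
```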
@@ -80,15 +82,28 @@ pip install -r T2IBenchmark/models/kandinsky21/requirements.txt
```diff
-from T2IBenchmark import calculate_fid
-from T2IBenchmark.datasets import get_coco_fid_stats
+from T2IBenchmark import calculate_coco_fid
+from T2IBenchmark.models.kandinsky21 import Kandinsky21Wrapper

-fid, _ = calculate_fid(
-    'path/to/your/generations/',
-    get_coco_fid_stats()
+fid, fid_data = calculate_coco_fid(
+    Kandinsky21Wrapper,
+    device='cuda:0',
+    save_generations_dir='coco_generations/'
 )
```
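Here `calculate_coco_fid` takes a model wrapper class, generates images on the given device for the fixed set of 30000 MS-COCO captions, saves them to `save_generations_dir`, and computes FID against the precomputed MS-COCO statistics, which makes the resulting FID-30K reproducible.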
### Metrics: CLIP-score

Example of calculating the CLIP-score for a set of images and a fixed prompt:
```python
from glob import glob

from T2IBenchmark import calculate_clip_score

cat_paths = glob('../assets/images/cats/*.jpg')
captions_mapping = {path: "a cat" for path in cat_paths}
clip_score = calculate_clip_score(cat_paths, captions_mapping=captions_mapping)
```
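Since `captions_mapping` pairs each image path with a caption, per-image prompts can be scored the same way by mapping every path to its own text.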
## Project Structure

@@ -98,25 +113,50 @@ fid, _ = calculate_fid(
- `feature_extractors/` - Implementations of the different neural nets used to extract features from images
- `metrics/` - Implementations of metrics
- `utils/` - Some utils
- `tests/` - Tests
- `docs/` - Documentation
-- `examples/` - Usage examples
-- `experiments/` - Experiments
+- `examples/` - Benchmark usage examples
+- `experiments/` - Experiments with metrics
- `assets/` - Assets

## Examples

Usage examples are listed below in the recommended order of study:
- [Basic FID usage](examples/FID_basic.ipynb)
- [Advanced FID usage](examples/FID_advanced.ipynb)
- [CLIP score](examples/CLIP_score_usage.ipynb)
- [FID calculation on MS-COCO](examples/FID-30k_on_MS-COCO.ipynb)
- [Using ModelWrapper to measure MS-COCO FID-30k](examples/ModelWrapper_FID-30k.ipynb)

## Documentation

- [FID.md](docs/FID.md) - Explanation of the different parameters that affect FID calculation
## Contribution

If you want to contribute your model to this benchmark and publish its metrics, follow these steps:

1) Create a fork of this repository
2) Create a wrapper for your model that inherits from the `T2IModelWrapper` class; a rough sketch is shown below
3) Generate images and calculate metrics using `calculate_coco_fid`. For more information, see [this example](examples/ModelWrapper_FID-30k.ipynb)
4) Create a pull request with your model
5) Congrats!
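As a rough illustration of step 2, a wrapper might look like the sketch below. The method names `load_model` and `generate`, and the import path of `T2IModelWrapper`, are assumptions here; the authoritative interface is defined in the repository and demonstrated in the linked example. The blank image stands in for a real sampling call.

```python
from PIL import Image

from T2IBenchmark import calculate_coco_fid
from T2IBenchmark.models import T2IModelWrapper  # assumed import path


class MyModelWrapper(T2IModelWrapper):
    """Hypothetical wrapper for a custom text-to-image model."""

    def load_model(self, device: str) -> None:
        # Assumed hook: load your model weights onto the target device here.
        self.device = device

    def generate(self, caption: str) -> Image.Image:
        # Assumed hook: replace this placeholder with a real sampling call.
        return Image.new('RGB', (512, 512))


# Same entry point as in the Kandinsky 2.1 example above.
fid, fid_data = calculate_coco_fid(
    MyModelWrapper,
    device='cuda:0',
    save_generations_dir='my_model_generations/'
)
```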
## TO-DO

- [ ] Implementation of Inception Score (IS) and Kernel Inception Distance (KID)
- [ ] FID-CLIPscore metric and plots
- [ ] Implementation and FIDs for [Kandinsky 2.X](https://github.com/ai-forever/Kandinsky-2) models with the help of Sber AI
- [ ] Implementation and FIDs for popular models from [diffusers](https://github.com/huggingface/diffusers): Stable Diffusion, IF
## Contacts

Authors:
- Pavlov Igor, [github](https://github.com/boomb0om)
- Artyom Ivanov, [github](https://github.com/UsefulTornado)
- Stanislav Stafievskiy, [github](https://github.com/stasstaf)

If you have any questions, please email `[email protected]`.

## Citing
**requirements.txt**
@@ -8,5 +8,4 @@ pillow
 datasets
 opencv-python
 ftfy
-regex
-git+https://github.com/openai/CLIP.git
+regex
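Removing the CLIP dependency from `requirements.txt` matches the README's installation instructions, which install CLIP from GitHub as an explicit separate step; this is likely the installation fix referenced in the pull request title.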