
Commit 674236b

stephenleo and monatis authored
adding in sub-headings for CV, NLP and multi-modal (#6)
* Add subheadings for CV, NLP and multi-modal.
* Add ANN, some papers and new dataset.
* Fix language style and some wordings.

Co-authored-by: M. Yusuf Sarıgöz <[email protected]>
1 parent 3d554c5 commit 674236b

File tree

1 file changed: +62 -0 lines changed


README.md

Lines changed: 62 additions & 0 deletions
@@ -120,7 +120,34 @@ recommender algorithms.</summary>
> It supports incorporating user and item features into the traditional matrix factorization. It represents users and items as a sum of the latent representations of their features, thus achieving a better generalization.
</details>

## Approximate Nearest Neighbors ⚡
<details>
<summary><a href="https://github.com/erikbern/ann-benchmarks">ANN Benchmarks</a> - Benchmarking various ANN implementations for different metrics.</summary>

> It provides benchmarks of 20+ ANN algorithms on nine standard datasets, with support for bringing your own dataset. ([Medium Post](https://medium.com/towards-artificial-intelligence/how-to-choose-the-best-nearest-neighbors-algorithm-8d75d42b16ab?sk=889bc0006f5ff773e3a30fa283d91ee7))
</details>

<details>
<summary><a href="https://github.com/facebookresearch/faiss">FAISS</a> - Efficient similarity search and clustering of dense vectors that possibly do not fit in RAM.</summary>

> It is not the fastest ANN algorithm, but it achieves memory efficiency thanks to various quantization and indexing methods such as IVF, PQ, and IVF-PQ. ([Tutorial](https://www.pinecone.io/learn/faiss-tutorial/))
</details>
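
A minimal sketch of an IVF-PQ index in FAISS (the array shapes and the `nlist`, `m`, and `nbits` values below are illustrative, not tuned recommendations):

```python
import numpy as np
import faiss

d = 128                                   # vector dimensionality
xb = np.random.random((10_000, d)).astype("float32")  # database vectors
xq = np.random.random((5, d)).astype("float32")       # query vectors

nlist, m, nbits = 100, 8, 8               # IVF cells, PQ sub-quantizers, bits per code
quantizer = faiss.IndexFlatL2(d)          # coarse quantizer for the IVF structure
index = faiss.IndexIVFPQ(quantizer, d, nlist, m, nbits)

index.train(xb)                           # learn IVF centroids and PQ codebooks
index.add(xb)                             # compress and store the database vectors
index.nprobe = 10                         # IVF cells to visit per query (recall/speed knob)
distances, ids = index.search(xq, 5)      # approximate 5-nearest-neighbor search
```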

<details>
<summary><a href="https://github.com/nmslib/hnswlib">HNSW</a> - Hierarchical Navigable Small World graphs</summary>

> It is still one of the fastest ANN algorithms out there, though it requires relatively high memory usage. (Paper: [Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs](https://arxiv.org/abs/1603.09320))
</details>
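
A minimal sketch with the reference hnswlib implementation (parameter values are illustrative; `M` and `ef_construction` trade memory and build time for recall):

```python
import numpy as np
import hnswlib

dim = 128
data = np.random.random((10_000, dim)).astype("float32")

index = hnswlib.Index(space="cosine", dim=dim)   # 'l2' and 'ip' also supported
index.init_index(max_elements=10_000, ef_construction=200, M=16)
index.add_items(data, np.arange(10_000))         # second argument: integer ids

index.set_ef(50)                                  # query-time recall/speed knob (ef >= k)
labels, distances = index.knn_query(data[:5], k=3)
```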

<details>
<summary><a href="https://github.com/google-research/google-research/tree/master/scann">Google's SCANN</a> - The technology behind vector search at Google</summary>

> Paper: [Accelerating Large-Scale Inference with Anisotropic Vector Quantization](https://arxiv.org/abs/1908.10396)
</details>
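
A sketch along the lines of the usage example in the ScaNN repository (leaf counts and training sample size are illustrative; the paper's anisotropic quantization is the `score_ah` stage):

```python
import numpy as np
import scann

# ScaNN's dot-product search here assumes unit-normalized float32 embeddings.
dataset = np.random.random((10_000, 128)).astype("float32")
dataset /= np.linalg.norm(dataset, axis=1, keepdims=True)

searcher = (
    scann.scann_ops_pybind.builder(dataset, 10, "dot_product")
    .tree(num_leaves=100, num_leaves_to_search=10, training_sample_size=10_000)
    .score_ah(2, anisotropic_quantization_threshold=0.2)  # anisotropic vector quantization
    .reorder(100)                                         # exact re-scoring of top candidates
    .build()
)

neighbors, distances = searcher.search_batched(dataset[:5])
```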


## Papers 🔬
### Loss Functions
<details>
<summary><a href="http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf">Dimensionality Reduction by
Learning an Invariant Mapping</a> - First appearance of Contrastive Loss.</summary>
@@ -176,6 +203,35 @@ Self-Supervised Learning</a> - Better regularization for high-dimensional embedd

</details>

### Computer Vision
<details>
<summary><a href="http://arxiv.org/abs/2002.05709">SimCLR: A Simple Framework for Contrastive Learning of Visual Representations</a> - Self-supervised method comparing two differently augmented versions of the same image with Contrastive Loss</summary>

> It demonstrates, among other things, that
> - the composition of data augmentations plays a critical role: random crop plus random color distortion provides the best downstream classifier accuracy,
> - introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations,
> - and contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning.
</details>
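
A minimal PyTorch sketch of SimCLR's NT-Xent (normalized temperature-scaled cross-entropy) loss, assuming `z1` and `z2` are projection-head outputs for two augmented views of the same batch:

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5):
    """Contrastive loss over 2N views; each view's positive is its sibling view."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # (2N, d) unit-norm embeddings
    sim = z @ z.t() / temperature                 # (2N, 2N) scaled cosine similarities
    sim.fill_diagonal_(float("-inf"))             # a view must not match itself
    n = z1.size(0)
    # Row i's positive sits at i+n (first half) or i-n (second half).
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)

z1, z2 = torch.randn(8, 128), torch.randn(8, 128)  # stand-ins for projected views
loss = nt_xent_loss(z1, z2)
```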

### Natural Language Processing
<details>
<summary><a href="https://aclanthology.org/2021.emnlp-main.552">SimCSE: Simple Contrastive Learning of Sentence Embeddings</a> - An unsupervised approach that takes an input sentence and predicts itself under a contrastive objective, with only standard dropout used as noise.
</summary>

> They also incorporate annotated pairs from natural language inference datasets into their contrastive learning framework in a supervised setting, showing that the contrastive learning objective regularizes pre-trained embeddings' anisotropic space to be more uniform and better aligns positive pairs when supervised signals are available.
</details>
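
The unsupervised objective is easy to sketch in PyTorch: encode the same batch twice with dropout active, so the two dropout masks act as the only "augmentation" (`encoder` below is a toy stand-in for any dropout-containing sentence encoder; 0.05 is the temperature reported in the paper):

```python
import torch
import torch.nn.functional as F

def unsup_simcse_loss(encoder, batch, temperature=0.05):
    encoder.train()                               # keep dropout enabled
    z1 = F.normalize(encoder(batch), dim=1)       # first pass, one dropout mask
    z2 = F.normalize(encoder(batch), dim=1)       # second pass, a different mask
    sim = z1 @ z2.t() / temperature               # (N, N); other sentences are negatives
    targets = torch.arange(z1.size(0), device=z1.device)  # positives on the diagonal
    return F.cross_entropy(sim, targets)

# Toy stand-in for a sentence encoder with dropout (e.g., a BERT [CLS] head).
encoder = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.Dropout(0.1))
loss = unsup_simcse_loss(encoder, torch.randn(8, 32))
```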

### Multi-Modal
<details>
<summary><a href="http://arxiv.org/abs/2103.00020">Learning Transferable Visual Models From Natural Language Supervision</a> - The paper that introduced CLIP: Training a unified vector embedding for image and text.
</summary>
</details>
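
CLIP trains the two encoders with a symmetric contrastive loss over the matched image-text pairs in a batch; a PyTorch sketch of that objective (encoders omitted; CLIP actually learns the temperature rather than fixing it):

```python
import torch
import torch.nn.functional as F

def clip_style_loss(image_emb, text_emb, temperature=0.07):
    img = F.normalize(image_emb, dim=1)           # outputs of an image encoder
    txt = F.normalize(text_emb, dim=1)            # outputs of a text encoder
    logits = img @ txt.t() / temperature          # (N, N) pairwise similarities
    targets = torch.arange(img.size(0), device=img.device)  # matched pairs on diagonal
    # Cross-entropy in both directions: image-to-text and text-to-image retrieval.
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2

loss = clip_style_loss(torch.randn(8, 512), torch.randn(8, 512))
```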

<details>
<summary><a href="http://arxiv.org/abs/2102.05918">Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision</a> - Google's answer to CLIP: Training a unified vector embedding for image and text, but using noisy text instead of a carefully curated dataset.
</summary>
</details>


## Datasets ℹ️
> Practitioners can use any labeled or unlabeled data for metric learning with an appropriately chosen method. However, some datasets are particularly important in the literature for benchmarking or other purposes, and we list them in this section.
@@ -210,3 +266,9 @@ serving as a useful benchmark.</summary>

> The dataset is published along with the ["Deep Metric Learning via Lifted Structured Feature Embedding"](https://github.com/rksltnl/Deep-Metric-Learning-CVPR16) paper.
</details>

<details>
<summary><a href="https://www.drivendata.org/competitions/79/">MetaAI's 2021 Image Similarity Dataset and Challenge</a> - The dataset comprises a 1M reference image set, a 1M training image set, a 50K dev query image set, and a 50K test query image set.</summary>

> The dataset is published along with the paper ["The 2021 Image Similarity Dataset and Challenge"](http://arxiv.org/abs/2106.09672).
</details>
