
about make Triplet dataset #1

Open
Kim-yonguk opened this issue Dec 11, 2019 · 2 comments

Comments

@Kim-yonguk

Is this a way of making hard triplets online, or is it offline?

@tamerthamoqa
Owner

tamerthamoqa commented Dec 11, 2019

I would say it is online, since you are only selecting the triplets within a batch that pass the hard-negative triplet selection condition, rather than pre-computing the set of qualifying triplets you want to train on by doing a full pass over the training set at the start of each epoch.
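
For illustration, here is a rough PyTorch-style sketch of that kind of in-batch filtering. The function name, the use of plain L2 distances, and the "hard" condition d(a, n) < d(a, p) are my own assumptions for the example, not necessarily the exact code in this repository:

```python
import torch

def select_hard_triplets(anchor_emb, pos_emb, neg_emb):
    # L2 distances per triplet in the batch
    d_ap = torch.norm(anchor_emb - pos_emb, p=2, dim=1)  # anchor-positive
    d_an = torch.norm(anchor_emb - neg_emb, p=2, dim=1)  # anchor-negative
    # "Hard" negatives: the negative is closer to the anchor than the positive
    hard = d_an < d_ap
    # Keep only the triplets that pass the condition and train on those
    return anchor_emb[hard], pos_emb[hard], neg_emb[hard]
```

Only the surviving triplets contribute to the loss for that batch, which is what makes the selection "online".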

Please keep in mind that my understanding may be wrong. I think the triplet generation step before training would not count as offline mining, since it only generates triplets at random and does not pre-compute any embeddings to check against the triplet selection condition. I used the triplet selection method from tbmoon's 'facenet' repository and edited it to write the generated triplets to a numpy file, to provide some 'reproducibility' across experiments; but the general way I know of generating triplets is to randomly pick anchors, positives, and negatives on the fly to prevent selection bias.
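
As a sketch of what "randomly picking triplets on the fly" could look like, assuming a hypothetical dict mapping each identity to its list of image paths (a structure I made up for the example; it also assumes every identity has at least two images):

```python
import random

def sample_random_triplet(images_by_identity):
    # images_by_identity: dict mapping identity -> list of image paths
    ids = list(images_by_identity.keys())
    anchor_id = random.choice(ids)
    negative_id = random.choice([i for i in ids if i != anchor_id])
    # Anchor and positive come from the same identity, negative from another;
    # no embeddings are computed here, so nothing is "mined" yet
    anchor, positive = random.sample(images_by_identity[anchor_id], 2)
    negative = random.choice(images_by_identity[negative_id])
    return anchor, positive, negative
```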

It seems you will need a large batch size to get good performance with the triplet loss method, so you will need a GPU with a lot of VRAM (preferably 24 GB or more) or multiple GPUs in parallel. I think the original FaceNet paper used a batch size of 1800, enforced a certain number of face images per identity in each batch (around 40 per identity), trained on a dataset containing hundreds of millions of face images, and used a semi-hard negative triplet selection method.

It seems that plain cross-entropy classification on the VGGFace2 dataset with an Inception-ResNet-V1 architecture, as in David Sandberg's 'facenet' repository, yields better results with less instability during training, so giving that a shot wouldn't hurt.

If you find any more information please let me know.

tamerthamoqa pushed a commit that referenced this issue Jan 2, 2021
@AGenchev
Contributor

Before we compute the embeddings, it is not known whether the negative in a selected triplet is hard, semi-hard, or easy. The random generation before a pass might yield many "easy" triplets. When these are fed into a large mini-batch, evaluated during training, and only the hard/semi-hard ones are kept, we call the selection "online".
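
To make the three categories concrete, here is a small sketch of the usual distance-based definitions; the margin value 0.2 and the tensor formulation are my own illustration:

```python
import torch

def classify_triplets(d_ap, d_an, margin=0.2):
    # d_ap, d_an: per-triplet anchor-positive / anchor-negative L2 distances
    easy = d_an > d_ap + margin           # triplet loss is already zero
    hard = d_an < d_ap                    # negative closer than the positive
    semi_hard = (d_an >= d_ap) & (d_an <= d_ap + margin)  # inside the margin
    return easy, semi_hard, hard
```

Only the hard and semi-hard triplets produce a non-zero triplet loss, so dropping the easy ones from each mini-batch is exactly the online selection described above.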
