What is contrastive loss and triplet loss? When would you use metric learning approaches over cross-entropy classification?
formulate your answer first, then read on.
tldr
Contrastive loss: pull similar pairs together, push dissimilar pairs apart until they clear a margin. Triplet loss: the anchor must be closer to the positive than to the negative by a margin m, so it learns relative distances rather than absolute ones. Both saturate on easy pairs or triplets (zero loss and zero gradient once the margin is satisfied), so hard/semi-hard mining is essential. Modern self-supervised methods (SimCLR, CLIP) use InfoNCE, which treats the rest of the batch as negatives and avoids explicit mining. Use metric learning when classes are open-set or you need embedding geometry; use cross-entropy for a fixed class set.
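A minimal PyTorch sketch of the three losses above. Function names, the margins, and the temperature value are illustrative defaults, not a reference implementation:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, same, margin=1.0):
    """Pairwise contrastive loss.

    z1, z2: (B, D) embeddings; same: (B,) float, 1 if the pair is similar, 0 otherwise.
    Similar pairs are pulled together; dissimilar pairs are pushed past `margin`.
    """
    d = F.pairwise_distance(z1, z2)
    pos = same * d.pow(2)                       # similar: penalize any distance
    neg = (1 - same) * F.relu(margin - d).pow(2)  # dissimilar: penalize only inside the margin
    return (pos + neg).mean()

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Anchor must be closer to the positive than to the negative by `margin`.

    Easy triplets (already satisfying the margin) give zero gradient,
    which is why hard/semi-hard mining matters in practice.
    """
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

def info_nce(z1, z2, tau=0.07):
    """InfoNCE over a batch: z1[i] and z2[i] are two views of the same item;
    every other row in the batch acts as a negative, so no explicit mining.
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau           # (B, B) cosine similarities scaled by temperature
    targets = torch.arange(z1.size(0))   # the matching pair sits on the diagonal
    return F.cross_entropy(logits, targets)

# toy usage with random embeddings
B, D = 8, 16
a, p, n = torch.randn(B, D), torch.randn(B, D), torch.randn(B, D)
print(contrastive_loss(a, p, torch.ones(B)))
print(triplet_loss(a, p, n))
print(info_nce(a, p))
```

Note this `info_nce` scores only one direction; SimCLR's NT-Xent symmetrizes over both views, and CLIP averages the image-to-text and text-to-image cross-entropies.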
follow-up
- What is the role of the temperature τ in InfoNCE/NT-Xent and how does it affect what the model learns?
- How does CLIP's training objective differ from standard supervised contrastive learning?
- If you had to build a product image similarity system at scale, would you use triplet loss or something else? Walk me through your design.