For a search or recommendation ranker, how do pointwise, pairwise, and listwise losses differ? When would you use each?
formulate your answer, then —
tldr
Pointwise losses predict each item independently, pairwise losses learn relative order, and listwise losses optimize whole ranked lists. Choose based on the product metric, label quality, and whether scores need calibration. Ranking loss choice does not remove position or exposure bias.
follow-up
- How would you sample negatives for a pairwise recommender loss?
- Why is NDCG hard to optimize directly?
- When is pointwise log loss the right choice for a ranking system?