Compare bi-encoders and cross-encoders for search or recommendation. Why not use the more accurate cross-encoder everywhere?
tldr
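Bi-encoders encode the query and each item independently, so item embeddings can be precomputed and retrieval reduces to a fast nearest-neighbor lookup. The cost is that query and item meet only in the final similarity score, so rich query-item interactions are missed. Cross-encoders are typically more accurate because they encode the query and item jointly, but nothing can be precomputed: every query-item pair needs its own model forward pass at query time, which is too expensive for full-corpus retrieval. Production systems usually combine the two: bi-encoder retrieval produces a shortlist of candidates, and a cross-encoder reranks only that shortlist.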
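The retrieve-then-rerank pattern can be sketched in a few lines. This is a toy illustration, not a real model: `embed` (a normalised bag-of-words vector) stands in for a bi-encoder tower, `pair_score` (shared word bigrams) stands in for a cross-encoder, and all function names, parameters, and the `ITEMS` corpus are hypothetical. The point is only the shape of the pipeline: item vectors are built once offline, and the pairwise scorer runs on the shortlist alone.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a bi-encoder tower: L2-normalised bag-of-words.
    It sees one text at a time, so it cannot model word order across the pair."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {tok: c / norm for tok, c in counts.items()}

def dot(u, v):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(w * v.get(tok, 0.0) for tok, w in u.items())

def pair_score(query, item):
    """Toy stand-in for a cross-encoder: it sees both texts together and
    rewards shared word bigrams, with a small bonus for shared words."""
    q, d = query.lower().split(), item.lower().split()
    shared_bigrams = set(zip(q, q[1:])) & set(zip(d, d[1:]))
    return len(shared_bigrams) + 0.1 * len(set(q) & set(d))

def retrieve(query, item_vecs, k):
    """Stage 1: one query embedding, then cheap dot products against
    precomputed item vectors. Returns the indices of the top-k items."""
    q_vec = embed(query)
    order = sorted(range(len(item_vecs)),
                   key=lambda i: dot(q_vec, item_vecs[i]), reverse=True)
    return order[:k]

def rerank(query, items, candidates):
    """Stage 2: the expensive pairwise scorer runs on the shortlist only."""
    return sorted(candidates,
                  key=lambda i: pair_score(query, items[i]), reverse=True)

# Hypothetical four-item corpus; vectors are computed once, ahead of any query.
ITEMS = [
    "man bites dog",
    "dog bites man in park",
    "cat sleeps on sofa",
    "stock market falls",
]
ITEM_VECS = [embed(t) for t in ITEMS]

query = "dog bites man"
shortlist = retrieve(query, ITEM_VECS, k=2)
final = [ITEMS[i] for i in rerank(query, ITEMS, shortlist)]
print([ITEMS[i] for i in shortlist])  # stage-1 order
print(final)                          # order after reranking
```

In this toy setup the first stage ranks "man bites dog" on top because the bag-of-words vectors match exactly, while the pairwise scorer prefers the item whose word order agrees with the query. With `k=2` the scorer is called twice per query instead of once per corpus item, which is the cost argument in miniature.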
follow-up
- How would you debug a search system where the reranker looks strong but final NDCG is poor?
- What labels would you use to train a cross-encoder reranker?
- How do you choose how many candidates to pass from retrieval to reranking?