mlprep
ML Breadth · hard · 12 min

Your training data comes from user clicks on ranked results. What's wrong with training directly on this data, and how do you correct for position bias?

formulate your answer, then —

tldr

Position bias: users click higher-ranked items more often regardless of relevance, so clicks conflate relevance with position. Training directly on clicks teaches the model to predict position rather than quality, and deploying that model reinforces the same rankings, creating a self-reinforcing feedback loop. Fixes: Inverse Propensity Scoring (IPS), which weights each click by the inverse of its estimated examination probability at that position; the position-as-feature trick, which includes position as an input during training and then sets it to a fixed value (e.g. position 1) at serving time; or randomized exploration to collect unbiased data. IPS is the most principled; position-as-feature is cheap and widely used.
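A minimal sketch of the IPS weighting step (the propensity values and variable names here are illustrative, not from the source; in practice propensities come from a randomization experiment or an examination model):

```python
import numpy as np

# Estimated examination propensities P(examined | position), one per rank.
# Illustrative values only; real ones would be measured, e.g. via
# result randomization or an EM-fit click model.
propensity = np.array([1.0, 0.62, 0.41, 0.30, 0.24])

clicks = np.array([1, 0, 1, 0, 0])      # click label per impression
positions = np.array([0, 1, 2, 3, 4])   # rank each item was shown at

# IPS: up-weight each click by 1 / propensity of its position, so a
# click at a rarely-examined low rank counts for more than one at the top.
ips_weights = clicks / propensity[positions]

# These weights would then multiply the per-example training loss:
#   loss = np.mean(ips_weights * per_example_loss)
```

Note the click at position 2 receives weight 1/0.41 ≈ 2.44, while the top-position click keeps weight 1.0; unclicked impressions get weight 0 under this click-only formulation.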

follow-up

  • What is the position-as-feature trick in detail, and what assumptions does it make about how position affects clicks?
  • How would you estimate click propensity without running a full randomization experiment?
  • Beyond position, what other types of exposure bias exist in recommendation systems?