Explain the bias-variance tradeoff. How does it manifest in practice and what do you do when a model has high bias vs high variance?
Formulate your own answer before reading on.
tldr
Bias = systematic error from wrong or overly simple model assumptions; variance = sensitivity to fluctuations in the training set. Expected test error decomposes as Bias² + Variance + Irreducible Noise. High bias → underfitting: train error is high and the train-test gap is small. High variance → overfitting: train error is low and the gap is large. Reduce bias with more capacity or better features; reduce variance with more data, regularization, or ensembling. Deep learning's double descent complicates the classical tradeoff at scale: test error can fall again past the interpolation threshold.
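The diagnostic above (compare train error against the train-test gap) can be sketched with a toy experiment. This is a minimal illustration, not a canonical recipe: the sine target, noise level, sample size, and polynomial degrees are all assumptions chosen to make underfitting and overfitting visible.

```python
import numpy as np

# Illustrative setup (assumed): noisy samples of sin(3x) on [-1, 1].
rng = np.random.default_rng(0)

def make_data(n=40):
    x = rng.uniform(-1, 1, n)
    y = np.sin(3 * x) + rng.normal(0, 0.2, n)  # noise variance = 0.04
    return x, y

def fit_eval(degree, trials=200):
    """Average train/test MSE of a degree-`degree` polynomial fit
    over many resampled datasets."""
    train_errs, test_errs = [], []
    for _ in range(trials):
        xtr, ytr = make_data()
        xte, yte = make_data()
        coefs = np.polyfit(xtr, ytr, degree)
        train_errs.append(np.mean((np.polyval(coefs, xtr) - ytr) ** 2))
        test_errs.append(np.mean((np.polyval(coefs, xte) - yte) ** 2))
    return np.mean(train_errs), np.mean(test_errs)

# Degree 1: high bias -> both errors high, small train-test gap.
tr1, te1 = fit_eval(degree=1)
# Degree 15: high variance -> low train error, large train-test gap.
tr15, te15 = fit_eval(degree=15)

print(f"deg 1:  train={tr1:.3f}  test={te1:.3f}  gap={te1 - tr1:.3f}")
print(f"deg 15: train={tr15:.3f} test={te15:.3f} gap={te15 - tr15:.3f}")
```

The underfit model's train error stays well above the noise floor (0.04) no matter how much data it sees, while the overfit model drives train error near zero but opens a large gap, which is exactly the signature used to decide between adding capacity and adding regularization or data.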
follow-up
- What is double descent and why does it challenge the classical bias-variance view?
- How does the choice of k in k-fold cross-validation relate to bias and variance in the estimator of generalization performance?
- Bagging reduces variance without increasing bias — why? And why does boosting primarily reduce bias?