ML Breadth
Core ML concepts every engineer should be able to discuss fluently — from transformers to gradient descent.
Deep Learning Fundamentals
Sequence Modeling
- Walk me through how transformers work (medium)
- Explain how RNNs work and where they fit today (medium)
- Explain attention mechanisms beyond transformers — types and trade-offs (hard)
- What is a KV cache, and why does it matter for LLM inference? (hard)
- How do long-context transformers work, and where do they fail? (hard)
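A toy NumPy sketch to anchor the attention and KV-cache answers above. All tensors are made-up examples; real implementations add multi-head projections, masking, and batching.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    # Core transformer op: softmax(Q K^T / sqrt(d)) V
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                     # (n_q, n_k) similarity logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ v                                # weighted sum of values

# KV cache intuition: at decode step t, K and V rows for tokens 0..t-1 are
# already cached, so only the new token's query is computed — each step costs
# O(t) instead of recomputing the full O(t^2) attention.
rng = np.random.default_rng(0)
q = rng.standard_normal((1, 8))    # one new query token
k = rng.standard_normal((5, 8))    # cached keys for 5 previous tokens
v = rng.standard_normal((5, 8))    # cached values
out = scaled_dot_product_attention(q, k, v)
```

A useful sanity check for interviews: a zero query attends uniformly, so the output is just the mean of the values.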
Advanced Architectures
- Walk me through how modern object detection works (medium)
- Explain transfer learning — when to freeze vs fine-tune, and common failure modes (medium)
- Walk me through two-tower models for retrieval — architecture, tradeoffs, and limits (hard)
- How does multi-task learning work for ranking — and what are the failure modes? (hard)
- How do mixture-of-experts models scale, and what can go wrong? (hard)
- When do you use a cross-encoder reranker instead of a bi-encoder? (hard)
- What is DSPy and when should you use it over manual prompting? (hard)
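A minimal sketch of the two-tower scoring step, with hypothetical hand-written embeddings standing in for the tower outputs. The point to be able to articulate: because the score is a dot product, item embeddings can be precomputed and served from an ANN index, which a cross-encoder cannot do.

```python
import numpy as np

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Two-tower retrieval: each tower maps its input (user features, item
# features) into a shared space; relevance = dot product of the embeddings.
user_emb = l2_normalize(np.array([[0.2, 0.9, 0.1]]))    # query-tower output (toy)
item_embs = l2_normalize(np.array([[0.1, 0.8, 0.0],     # candidate-tower outputs (toy)
                                   [0.9, 0.1, 0.2],
                                   [0.0, 0.1, 0.9]]))

scores = (user_emb @ item_embs.T).ravel()   # cosine similarity per item
best = int(np.argmax(scores))               # top-1 retrieved item
```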
Classical ML
- Explain logistic regression and why it's still used in industry (easy)
- Explain ensemble methods — random forests and gradient boosting (medium)
- How do decision trees split? Walk me through Gini impurity vs information gain. (medium)
- GBDT vs neural networks for ranking — when do you pick each? (medium)
- Explain SVMs and kernels from an interviewer's point of view (medium)
- When would you use monotonic constraints in gradient boosted trees? (hard)
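For the tree-splitting question, a short sketch of how CART-style trees score a candidate split with Gini impurity (the labels and masks are toy examples):

```python
import numpy as np

def gini(labels):
    # Gini impurity: 1 - sum_k p_k^2; equals 0 for a pure node.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def split_gain(labels, left_mask):
    # Impurity decrease from splitting a node into left/right children,
    # weighted by child sizes — this is what the tree maximizes over splits.
    left, right = labels[left_mask], labels[~left_mask]
    n = len(labels)
    child = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - child

y = np.array([0, 0, 0, 1, 1, 1])
perfect = split_gain(y, np.array([True, True, True, False, False, False]))
useless = split_gain(y, np.array([True, False, True, False, True, False]))
```

Information gain works the same way with entropy in place of Gini; in practice the two usually pick very similar splits.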
Optimization
- Walk me through gradient descent variants (medium)
- Compare SGD, Adam, and AdamW — when would you use each? (medium)
- Explain learning rate scheduling strategies and how to pick one (medium)
- What is gradient clipping, and when should you use it? (medium)
- How does mixed precision training work, and why can it be unstable? (hard)
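A single AdamW update written out, which is usually enough to answer the SGD/Adam/AdamW comparison: Adam keeps EMAs of the gradient and squared gradient, and AdamW applies weight decay directly to the weights rather than folding L2 into the gradient. Hyperparameters here are the common defaults; the data is toy.

```python
import numpy as np

def adamw_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8, wd=0.01):
    m = b1 * m + (1 - b1) * g          # first moment (EMA of gradients)
    v = b2 * v + (1 - b2) * g ** 2     # second moment (EMA of squared gradients)
    m_hat = m / (1 - b1 ** t)          # bias correction for zero-initialized EMAs
    v_hat = v / (1 - b2 ** t)
    # Decoupled weight decay: wd * w is added outside the adaptive ratio.
    w = w - lr * (m_hat / (np.sqrt(v_hat) + eps) + wd * w)
    return w, m, v

w = np.array([1.0, -2.0])
m, v = np.zeros_like(w), np.zeros_like(w)
w, m, v = adamw_step(w, g=np.array([0.5, -0.5]), m=m, v=v, t=1)
```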
Loss Functions
- Explain cross-entropy loss — binary vs categorical, and why it works for classification (medium)
- Compare MSE, MAE, and Huber loss — when would you use each? (medium)
- What is focal loss and when would you use it over cross-entropy? (hard)
- What is KL divergence and how does it relate to cross-entropy? (medium)
- Explain contrastive and triplet loss — how do they train embedding spaces? (hard)
- Compare pointwise, pairwise, and listwise ranking losses (hard)
- How do label smoothing and knowledge distillation change cross-entropy training? (hard)
- When do you use Huber loss instead of MSE or MAE? (medium)
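For the focal-loss question, a side-by-side with binary cross-entropy on toy probabilities. The (1 - p_t)^gamma factor is the whole trick: it shrinks the loss on easy, confident examples so hard ones dominate the gradient, which is why it helps under extreme class imbalance.

```python
import numpy as np

def binary_ce(p, y):
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def focal_loss(p, y, gamma=2.0):
    # p_t = model's probability for the true class; easy examples have
    # p_t near 1, so (1 - p_t)^gamma down-weights them sharply.
    p_t = np.where(y == 1, p, 1 - p)
    return -((1 - p_t) ** gamma) * np.log(p_t)

easy = focal_loss(np.array([0.95]), np.array([1]))[0]    # confident and correct
hard = focal_loss(np.array([0.10]), np.array([1]))[0]    # confident and wrong
ce_easy = binary_ce(np.array([0.95]), np.array([1]))[0]
```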
Feature Engineering
- Walk me through feature engineering for tabular data (medium)
- How do you encode categorical features? Compare one-hot, target encoding, and embeddings. (medium)
- How do you handle class imbalance in ML? (medium)
- How do you approach feature selection? (medium)
- How do you handle missing data without introducing bias? (hard)
- When do feature crosses matter, and how do you control their complexity? (medium)
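A sketch of smoothed target encoding for the categorical-encoding question, one common variant of the technique with made-up data. The smoothing term pulls rare categories toward the global mean; note that in practice the statistics must come from training folds only, or this encoding becomes a leakage source.

```python
import numpy as np

def target_encode(categories, targets, smoothing=10.0):
    # Encode each category as a count-weighted blend of its mean target
    # and the global mean: (sum_y + smoothing * global) / (n + smoothing).
    global_mean = targets.mean()
    encoding = {}
    for c in np.unique(categories):
        mask = categories == c
        encoding[c] = (targets[mask].sum() + smoothing * global_mean) / (mask.sum() + smoothing)
    return np.array([encoding[c] for c in categories])

cats = np.array(["a", "a", "a", "b"])    # "b" appears only once (rare)
y = np.array([1.0, 1.0, 0.0, 1.0])
enc = target_encode(cats, y)
```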
Data Integrity
- What is data leakage and how do you prevent it? (medium)
- What is train-serve skew and how do you prevent it? (hard)
- How do you handle the cold start problem in recommendation systems? (hard)
- Which features are safe to use when predicting future events — temporal leakage in practice (hard)
- How do delayed labels and censoring affect model training? (hard)
- What are point-in-time joins, and why do they prevent leakage? (hard)
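The point-in-time join question reduces to one rule: for each label event, look up the latest feature value recorded at or before the event time, never a future one. A stdlib sketch with toy (day, value) history:

```python
from bisect import bisect_right

def point_in_time_lookup(feature_history, event_ts):
    # feature_history: list of (timestamp, value) sorted by timestamp.
    # Returns the latest value with timestamp <= event_ts, or None —
    # this is exactly what prevents temporal leakage in training joins.
    ts_list = [ts for ts, _ in feature_history]
    i = bisect_right(ts_list, event_ts)
    return feature_history[i - 1][1] if i > 0 else None

history = [(1, 10.0), (10, 50.0), (20, 200.0)]   # toy cumulative-spend feature
feat_day5 = point_in_time_lookup(history, 5)     # only the day-1 value is visible
feat_day15 = point_in_time_lookup(history, 15)   # day-20 value is in the future
```

Feature stores implement the same semantics at scale (e.g. `pandas.merge_asof` with `direction="backward"` does this for DataFrames).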
Generalization & Regularization
- What is regularization and how do the different techniques compare? (medium)
- Explain the bias-variance tradeoff (medium)
- Compare batch norm, layer norm, and group norm — when do you use each? (hard)
- Why don't huge overparameterized models always overfit? (hard)
- How do you handle domain shift and out-of-distribution inputs? (hard)
- How does data augmentation improve generalization, and when can it hurt? (medium)
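The batch-norm vs layer-norm comparison mostly comes down to which axis carries the statistics. A toy illustration (omitting the learned scale/shift parameters and running averages that real layers add):

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0],
              [4.0, 6.0, 8.0]])   # (batch=2, features=3)

def batch_norm(x, eps=1e-5):
    # Normalize each feature across the batch (axis 0). Statistics depend
    # on batch composition — hence running averages for inference.
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def layer_norm(x, eps=1e-5):
    # Normalize each example across its features (axis 1). Batch-size
    # independent, which is one reason transformers prefer it.
    return (x - x.mean(axis=1, keepdims=True)) / np.sqrt(x.var(axis=1, keepdims=True) + eps)

bn, ln = batch_norm(x), layer_norm(x)
```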
Evaluation & Calibration
- Walk me through classification evaluation metrics — precision, recall, F1, AUC (medium)
- What is model calibration and how do you fix a miscalibrated model? (hard)
- Explain NDCG — why use it over AUC for ranking? (medium)
- Your CTR model's calibration degrades months after launch — what causes it and how do you fix it? (hard)
- How do you evaluate a new ranker using logs from an old policy? (hard)
- How do you put confidence intervals on ML metrics? (hard)
- How do you run A/B tests for ML models in production? (hard)
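For the confidence-interval question, the percentile bootstrap is the standard general-purpose answer: resample examples with replacement, recompute the metric, and read off empirical quantiles. A sketch on toy labels:

```python
import numpy as np

def bootstrap_ci(y_true, y_pred, metric, n_boot=2000, alpha=0.05, seed=0):
    # Percentile bootstrap CI: works for any metric computed per-dataset
    # (accuracy, AUC, NDCG, ...), not just ones with closed-form variance.
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)     # resample rows with replacement
        stats.append(metric(y_true[idx], y_pred[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return lo, hi

accuracy = lambda yt, yp: np.mean(yt == yp)
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 1, 0, 1, 1, 1, 1])   # point estimate: 0.8
lo, hi = bootstrap_ci(y_true, y_pred, accuracy)
```

One caveat worth mentioning in an interview: resample at the unit of independence (e.g. per user or per query, not per impression) or the interval will be too narrow.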
Production ML
- What is position bias in ranking and how do you correct for it? (hard)
- Your offline metrics improved but A/B test shows nothing — how do you debug? (hard)
- Your model accuracy dropped 8% over 2 weeks — walk through your debugging process (hard)
- When is a large foundation model unjustified? Simple models vs LLMs. (hard)
- How do canary and shadow deployments reduce ML launch risk? (hard)
- When should an ML system use human review instead of full automation? (hard)
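For the position-bias question, one standard correction is inverse propensity weighting: divide each click by the probability that its position was examined at all. The propensities below are assumed known (in practice estimated from a randomization experiment or an examination model); all numbers are toy.

```python
import numpy as np

positions = np.array([0, 0, 1, 2, 2, 3])     # rank at which each item was shown
clicks = np.array([1, 1, 1, 0, 1, 0])        # observed clicks
propensity = np.array([1.0, 0.6, 0.4, 0.25]) # assumed P(examined | position)

# Naive click rate conflates relevance with position; IPW reweights each
# impression by 1/propensity so lower positions count for more.
weights = 1.0 / propensity[positions]
ipw_click_rate = (clicks * weights).sum() / weights.sum()
naive_click_rate = clicks.mean()
```

Here the debiased rate comes out below the naive one, since the clicks all landed on heavily-examined top positions.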