mlprep
mlprep / ML Coding · easy · 15 min

Implement gradient descent for linear regression from scratch in NumPy. No sklearn, no autograd. Compute the gradient manually, implement the update rule, and show it converging. Then extend to mini-batch SGD.
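One possible sketch, assuming MSE loss and synthetic data — the `w_true` values, learning rate, step counts, and batch size below are illustrative choices, not part of the prompt:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = X @ w_true + noise (all values illustrative)
n, d = 200, 3
X = rng.normal(size=(n, d))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n)

def mse(X, y, w):
    r = X @ w - y
    return (r @ r) / len(y)

def grad(Xb, yb, w):
    # Gradient of (1/B)||Xb w - yb||^2 is (2/B) Xb^T (Xb w - yb)
    return (2 / len(yb)) * Xb.T @ (Xb @ w - yb)

# Full-batch gradient descent
w = np.zeros(d)
lr = 0.1
for step in range(200):
    w -= lr * grad(X, y, w)
print("batch GD loss:", mse(X, y, w))  # should approach the noise floor (~0.01)

# Mini-batch SGD: noisy gradient estimates from random subsets
w = np.zeros(d)
batch = 32
for step in range(500):
    idx = rng.choice(n, size=batch, replace=False)
    w -= lr * grad(X[idx], y[idx], w)
print("mini-batch SGD loss:", mse(X, y, w))
```

With a constant learning rate, mini-batch SGD doesn't converge exactly — it settles into a noise ball around the optimum whose size scales with the learning rate and the gradient noise.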

formulate your answer, then —

You used a constant learning rate. Implement learning rate decay and momentum — and explain why momentum helps SGD converge faster.
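A sketch of both extensions on the same synthetic setup; the 1/t decay schedule and β = 0.9 below are common illustrative defaults, not the only options:

```python
import numpy as np

rng = np.random.default_rng(1)

# Same synthetic regression setup (illustrative values)
n, d = 200, 3
X = rng.normal(size=(n, d))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n)

def grad(Xb, yb, w):
    # (2/B) Xb^T (Xb w - yb) for a mini-batch of size B
    return (2 / len(yb)) * Xb.T @ (Xb @ w - yb)

w = np.zeros(d)
v = np.zeros(d)  # velocity: exponential accumulation of past gradients
lr0, beta, decay, batch = 0.1, 0.9, 0.01, 32

for step in range(500):
    lr = lr0 / (1 + decay * step)  # 1/t-style decay: big steps early, fine-tuning later
    idx = rng.choice(n, size=batch, replace=False)
    g = grad(X[idx], y[idx], w)
    # Momentum: gradient components that agree step after step accumulate
    # in v, while components that flip sign (mini-batch noise, oscillation
    # across a narrow valley) largely cancel.
    v = beta * v + g
    w -= lr * v
```

The decay schedule shrinks the noise ball over time, and the velocity term lets consistent descent directions build up an effective step size of roughly 1/(1 − β) times the base rate.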

formulate your answer, then —

tldr

Gradient descent for linear regression: gradient is (2/n)Xᵀ(Xw - y). Mini-batch SGD samples random subsets for noisy but faster updates. Batch size trades off gradient noise (helps escape bad minima) vs. stability (smooth convergence). Momentum accumulates velocity across steps, acting as a low-pass filter that amplifies consistent gradient signal and dampens oscillating noise. Learning rate decay lets you take large steps early and fine-tune later.

follow-up

  • Implement Adam optimizer from scratch — what does it do differently from SGD with momentum, and why is it the default for deep learning?
  • Your gradient descent converges to a loss of 0.5 but the closed-form solution achieves 0.3. What went wrong?
  • Explain why feature scaling (normalization) is critical for gradient descent but irrelevant for the closed-form solution.
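For the first follow-up, a minimal sketch of the Adam update (using the defaults from Kingma & Ba's paper); the scalar demo objective is made up purely for illustration:

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update. Unlike SGD with momentum, Adam also tracks a second
    moment v (an EMA of squared gradients) and divides by its square root,
    so each parameter gets its own adaptive step size of roughly lr,
    regardless of the raw gradient scale."""
    m = b1 * m + (1 - b1) * g        # first moment: momentum
    v = b2 * v + (1 - b2) * g * g    # second moment: uncentered variance
    m_hat = m / (1 - b1 ** t)        # bias correction (m, v start at 0)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Toy demo: minimize (w - 3)^2 starting from w = 0
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 1001):             # t starts at 1 for the bias correction
    g = 2 * (w - 3.0)                # gradient of the toy objective
    w, m, v = adam_step(w, g, m, v, t, lr=0.1)
```

The per-parameter scaling is much of why Adam is the common default in deep learning: layers with very different gradient magnitudes all take sensibly sized steps without per-layer learning-rate tuning.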