- bias: stronger model assumptions, underfits the data
- variance: weaker model assumptions, overfits the data
	- you "vary" by shaping around the data
	- ![[CleanShot 2024-06-13 at [email protected]]]
- to reduce variance / overfitting:
	- [[Regularization]]
	- add more training data
	- [[Early Stopping]]: monitor performance on a validation set, & stop training when performance starts deteriorating
	- k-fold cross validation
	- reduce [[Mini Batch Gradient Descent|Mini Batch]] size
- to reduce bias / underfitting:
	- increase model complexity
	- train longer, increase epochs
- you always need to fit the data, but you just don't want to do it so much that you learn weird, anomalous kinks unique to the training data
- high bias / underfitting means bad performance overall: not enough fitting / learning
- high variance means dev performance is worse than train: too much fitting of attributes unique to the training data
- here the optimal error is 0%; if it were 15%, then the second column would be good
	- ![[CleanShot 2024-06-13 at [email protected]|400]]
- below is an example of both high bias & high variance
	- most parts underfit, so performance is just bad in general
	- but one specific region in the middle overfits: it does better on training than test, but is far too specialized & a badly learned behavior
	- ![[CleanShot 2024-06-13 at [email protected]|200]]
- in the current deep learning era there is not really as much of a "tradeoff" between bias & variance
- Andrew Ng: notices that really good ML practitioners have a very sophisticated understanding of bias & variance
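
The train-vs-dev diagnosis above can be sketched as a small helper. This is a rough rule of thumb, not a formal test: the `gap` threshold and the `diagnose` function are my own illustrative choices; "optimal_err" stands in for the optimal (Bayes) error the note mentions.

```python
# Sketch: classify bias/variance from train vs. dev error (illustrative thresholds).
def diagnose(train_err, dev_err, optimal_err=0.0, gap=0.02):
    """Return a list of suspected problems given error rates in [0, 1]."""
    issues = []
    # Train error far above the optimal error -> underfitting (high bias).
    if train_err - optimal_err > gap:
        issues.append("high bias")
    # Dev error far above train error -> overfitting (high variance).
    if dev_err - train_err > gap:
        issues.append("high variance")
    return issues or ["ok"]

print(diagnose(0.15, 0.16))                     # ['high bias']
print(diagnose(0.01, 0.11))                     # ['high variance']
print(diagnose(0.15, 0.30))                     # ['high bias', 'high variance']
print(diagnose(0.15, 0.16, optimal_err=0.15))   # ['ok'], mirrors the 15% optimal-error case
```

The last call shows why the optimal error matters: the same 15% / 16% column reads as high bias when optimal error is 0%, but as fine when it is 15%.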