- bias: stronger model assumptions, underfitting the data
- variance: weaker model assumptions, overfitting the data
- you "vary" by shaping the model around the data
![[CleanShot 2024-06-13 at [email protected]]]
- to reduce variance/overfit:
- [[Regularization]]
- add more training data
- [[Early Stopping]]
- monitor performance on a validation set & stop training when it starts deteriorating (see the sketch after this list)
- k-fold cross validation
- reduce [[Mini Batch Gradient Descent|Mini Batch]] size
- to reduce bias/underfit:
- increase model complexity
- train longer, increase epochs
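- a minimal sketch of the early-stopping loop described above; the `val_losses` numbers are made up for illustration, and in real code each iteration would run one training epoch before evaluating on the dev set:

```python
# Early stopping: track the best validation loss and stop after `patience`
# consecutive epochs without improvement. (val_losses are hypothetical.)
val_losses = [0.90, 0.70, 0.55, 0.48, 0.47, 0.49, 0.50, 0.52, 0.55, 0.60]

best_val_loss = float("inf")
patience, bad_epochs, best_epoch = 3, 0, 0

for epoch, val_loss in enumerate(val_losses):
    # (in real code: train for one epoch here, then measure val_loss)
    if val_loss < best_val_loss:
        best_val_loss, best_epoch = val_loss, epoch
        bad_epochs = 0                    # still improving, keep training
    else:
        bad_epochs += 1                   # validation performance deteriorating
        if bad_epochs >= patience:
            break                         # stop before overfitting worsens

print(f"stopped at epoch {epoch}; best was epoch {best_epoch} "
      f"(val loss {best_val_loss:.2f})")
```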
- you always need to fit the data; you just don't want to fit it so closely that you learn weird, anomalous kinks unique to the training set
- high bias / underfit means bad performance, even on the training set
- not enough fitting / learning
- high variance means dev worse than train
- too much fitting of unique train data attributes
- here the optimal (Bayes) error is 0%; if it were 15%, then the second column would be good (a rough version of this diagnosis is sketched below the image)
![[CleanShot 2024-06-13 at [email protected]|400]]
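- a rough version of this train/dev diagnosis in code; error rates are fractions, `optimal_err` is the Bayes error, and the `tol` threshold is an arbitrary illustrative choice:

```python
def diagnose(train_err, dev_err, optimal_err=0.0, tol=0.02):
    """Rough bias/variance read from train & dev error (illustrative thresholds)."""
    high_bias = (train_err - optimal_err) > tol   # underfitting: train error far above optimal
    high_variance = (dev_err - train_err) > tol   # overfitting: dev much worse than train
    return high_bias, high_variance

# train 15%, dev 16% with optimal error 0% -> high bias
print(diagnose(0.15, 0.16))                    # (True, False)
# same numbers, but optimal error 15% -> actually fine
print(diagnose(0.15, 0.16, optimal_err=0.15))  # (False, False)
```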
- below is an example of both high bias & high variance
	- most regions underfit, so performance is just bad in general
	- but the specific part in the middle overfits: it does better on training than test, yet is way too specialized & a bad learned behavior
![[CleanShot 2024-06-13 at [email protected]|200]]
- in the current deep learning era there is not really as much of a "tradeoff" between bias & variance: a bigger network can reduce bias and more data can reduce variance, largely independently
- Andrew Ng observes that really good ML practitioners have a very sophisticated understanding of bias & variance