- for logistic regression:
  - $w \in \mathbb{R}^{n_x}, b \in \mathbb{R}$
  - $J(w, b) = \frac{1}{m}\sum_{i=1}^{m} L(\hat{y}^{(i)}, y^{(i)}) + \frac{\lambda}{2m} ||w||_1$
  - $||w||_1 = \sum_{j=1}^{n_x} |w_j|$
  - L1 regularization tends to make $w$ sparse: many of its entries end up exactly 0
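
A minimal sketch of the cost above in plain Python may make the terms concrete. The notes do not spell out $L$, so the binary cross-entropy loss is assumed here; the function names (`sigmoid`, `l1_norm`, `l1_regularized_cost`) and the row-per-example layout of `X` are illustrative choices, not taken from the notes.

```python
import math


def sigmoid(z):
    """Logistic function, used to compute y_hat."""
    return 1.0 / (1.0 + math.exp(-z))


def l1_norm(w):
    """||w||_1: sum of the absolute values of the weights."""
    return sum(abs(w_j) for w_j in w)


def l1_regularized_cost(w, b, X, Y, lam):
    """J(w, b) = (1/m) * sum of losses + (lam / (2m)) * ||w||_1.

    X: list of m examples, each a list of n_x features (assumed layout).
    Y: list of m labels in {0, 1}.
    The per-example loss is assumed to be binary cross-entropy.
    """
    m = len(X)
    loss_sum = 0.0
    for x, y in zip(X, Y):
        z = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
        y_hat = sigmoid(z)
        loss_sum += -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))
    return loss_sum / m + lam / (2 * m) * l1_norm(w)


# Tiny usage example with made-up data.
w = [0.5, -1.5, 0.0]
X = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
Y = [1, 0]
print(l1_norm(w))
print(l1_regularized_cost(w, 0.0, X, Y, lam=0.4))
```

Because the penalty grows linearly in each $|w_j|$, its gradient magnitude does not shrink as a weight approaches 0, which is one common explanation for why L1 pushes weights all the way to 0 rather than merely making them small.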