- [[Deep Learning Models]]
- [[Deep Learning Model Components]]
- [[Deep Learning Training]]
- [[Data Preparation]]
## Algorithm
---
1. Define the neural network structure (number of input units, number of hidden units, etc.)
2. Initialize model's parameters
3. Loop:
- Implement forward propagation
- Compute loss
- Implement backward propagation to get gradients
- Update parameters (gradient descent step); see the sketch just below this list
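A minimal NumPy sketch of this loop for a 2-layer network (tanh hidden layer, sigmoid output, binary cross-entropy). This is only an illustrative sketch, not the course's exact implementation: layer sizes, learning rate, and iteration count are made up, and `X` is assumed to hold one training example per column with `Y` of shape `(1, m)`.

```python
import numpy as np

def initialize_parameters(n_x, n_h, n_y, seed=1):
    # step 2: small random weights, zero biases
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.standard_normal((n_h, n_x)) * 0.01,
        "b1": np.zeros((n_h, 1)),
        "W2": rng.standard_normal((n_y, n_h)) * 0.01,
        "b2": np.zeros((n_y, 1)),
    }

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(params, X):
    Z1 = params["W1"] @ X + params["b1"]      # linear step, layer 1
    A1 = np.tanh(Z1)                          # activation, layer 1
    Z2 = params["W2"] @ A1 + params["b2"]     # linear step, layer 2
    A2 = sigmoid(Z2)                          # activation, layer 2 (output)
    return A2, (Z1, A1, Z2, A2)

def compute_loss(A2, Y):
    # binary cross-entropy averaged over the m examples
    m = Y.shape[1]
    return -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m

def backward(params, cache, X, Y):
    m = X.shape[1]
    Z1, A1, Z2, A2 = cache
    dZ2 = A2 - Y                                    # gradient at the output
    dW2 = dZ2 @ A1.T / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = (params["W2"].T @ dZ2) * (1 - A1 ** 2)    # tanh'(Z1) = 1 - A1^2
    dW1 = dZ1 @ X.T / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return {"W1": dW1, "b1": db1, "W2": dW2, "b2": db2}

def train(X, Y, n_h=4, learning_rate=0.5, num_iterations=1000):
    params = initialize_parameters(X.shape[0], n_h, Y.shape[0])
    for i in range(num_iterations):
        A2, cache = forward(params, X)           # forward propagation
        loss = compute_loss(A2, Y)               # compute loss
        grads = backward(params, cache, X, Y)    # backward propagation
        for key in params:                       # update parameters
            params[key] -= learning_rate * grads[key]
        if i % 100 == 0:
            print(f"iteration {i}: loss = {loss:.4f}")
    return params
```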
## General
---
- applied deep learning is a very empirical, evidence-based process
- Anant Sahai says no one really knows why it works; he likens it to ancient alchemy
- when starting a new application, it is almost impossible to guess the right values for the hyperparameters
- number of layers, number of hidden units, learning rate, activation functions, etc.
- a network being big (wide) alone is not enough to perform well; we need it to be **deep**
- from circuit theory:
- there are functions a "small" but deep neural network can compute that a shallower network would need exponentially more hidden units to compute (e.g., the parity/XOR of $n$ inputs)
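A toy illustration of that circuit-theory point: parity (the XOR of $n$ bits) can be computed by a tree of pairwise XORs of depth roughly $\log_2 n$ using only about $n$ nodes, whereas a single hidden layer essentially has to enumerate exponentially many input patterns. The `parity_tree` helper below is a hypothetical counting sketch, not something from the course.

```python
def parity_tree(bits):
    """Compute the XOR of all bits by pairwise reduction (depth ~ log2(n))."""
    level = list(bits)
    depth = 0
    while len(level) > 1:
        # combine neighbours pairwise; a leftover odd element passes through
        level = [level[i] ^ level[i + 1] if i + 1 < len(level) else level[i]
                 for i in range(0, len(level), 2)]
        depth += 1
    return level[0], depth

value, depth = parity_tree([1, 0, 1, 1, 0, 1, 0, 1])
print(value, depth)   # parity of 8 bits computed in depth 3, ~n nodes total
```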
![[CleanShot 2024-06-11 at [email protected]|350]]
![[CleanShot 2024-06-11 at [email protected]|350]]
![[CleanShot 2024-06-10 at [email protected]|400]]
- each node performs 2 computations: the linear combination $z = w^\top x + b$ on the left, then the activation $a = g(z)$ on the right
![[CleanShot 2024-06-10 at [email protected]|300]]
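A tiny sketch of those two computations in a single node, with made-up weights and sigmoid as the example activation:

```python
import numpy as np

w = np.array([0.2, -0.5, 0.1])   # weights into the node (illustrative values)
b = 0.3                          # bias
x = np.array([1.0, 2.0, -1.0])   # inputs to the node

z = np.dot(w, x) + b             # 1) linear combination z = w^T x + b
a = 1.0 / (1.0 + np.exp(-z))     # 2) activation a = g(z), here g = sigmoid
print(z, a)                      # -0.6, ~0.354
```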
## Andrew Ng's Standardized Notation/Setup
---
![[deep-learning-notation.pdf]]
![[ACCE4963-52DB-4618-97F9-FF9A2570C02A_1_105_c.jpeg]]
Notation (Andrew Ng Coursera)
- note that by stacking training examples as the columns of $X$ (rather than as rows), the code vectorizes much more easily
- the superscript in $x^{(i)}, y^{(i)}, z^{(i)}$ refers to the $i$-th training example out of the $m$ total examples
![[CleanShot 2024-06-05 at [email protected]|500]]
- note that square-bracket superscripts $[l]$ refer to the layer, while the parenthesis superscript $(i)$ refers to the index among the $m$ training examples
![[CleanShot 2024-06-10 at [email protected]|300]]
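A small NumPy sketch of the conventions above; shapes and values are made up for illustration. Columns of $X$ are the training examples $x^{(1)}, \dots, x^{(m)}$, and square brackets pick out a layer:

```python
import numpy as np

n_x, n_h, m = 3, 4, 5           # input size, hidden units in layer 1, # of examples
X = np.random.randn(n_x, m)     # X[:, i] is x^(i), the i-th training example

W1 = np.random.randn(n_h, n_x)  # W^[1]: weights of layer 1
b1 = np.zeros((n_h, 1))         # b^[1]: biases of layer 1

Z1 = W1 @ X + b1                # Z^[1], shape (n_h, m): one column z^[1](i) per example
A1 = np.tanh(Z1)                # A^[1]: layer-1 activations for all m examples at once

print(Z1.shape)                 # (4, 5) -- no explicit loop over the m examples
```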