Today we'll be learning about the mathematical foundations of deep learning: stochastic gradient descent (SGD), and the flexibility that comes from layering linear functions with non-linear activation functions. We'll focus particularly on a popular activation function called the rectified linear unit (ReLU); a minimal sketch of both ideas follows below.
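To make the two ideas concrete, here is a minimal PyTorch sketch, not taken from the lesson notebooks: the toy data, layer sizes, learning rate, and step count are all made-up illustration values. It fits a tiny two-layer network, with a ReLU between the linear layers, by taking SGD steps on random minibatches.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(42)

# Hypothetical toy data: learn y = 3x + 2 with a little noise.
x = torch.linspace(-1, 1, 100).unsqueeze(1)
y = 3 * x + 2 + 0.1 * torch.randn_like(x)

# Two linear layers with a non-linear activation (ReLU) between them.
# Without the ReLU, the composition would collapse to a single linear function.
model = torch.nn.Sequential(
    torch.nn.Linear(1, 16),
    torch.nn.ReLU(),
    torch.nn.Linear(16, 1),
)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(200):
    idx = torch.randint(0, len(x), (20,))   # random minibatch: the "stochastic" in SGD
    loss = F.mse_loss(model(x[idx]), y[idx])
    opt.zero_grad()   # clear gradients from the previous step
    loss.backward()   # compute gradients of the loss w.r.t. the weights
    opt.step()        # move the weights a small step downhill

print(f"final minibatch loss: {loss.item():.4f}")
```

Each pass through the loop is one SGD update: measure the loss on a small random sample, backpropagate to get gradients, and nudge every weight a little in the direction that reduces the loss.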
Video
This lesson is based partly on chapter 4 of the book.
Resources
- Notebooks for this lesson
- Other resources for the lesson
- Titanic spreadsheet: see the course repository
- Titanic data (training CSV) can be downloaded from Kaggle
- Solutions to chapter 4 questions from the book