Deep Reinforcement Learning

The smartest combination of Deep Q-Learning, Policy Gradient, Actor Critic, and DDPG

Last updated 2022-01-10 | 4.5

What you'll learn

Q-Learning
Deep Q-Learning
Policy Gradient
Actor-Critic
Deep Deterministic Policy Gradient (DDPG)
Twin-Delayed DDPG (TD3)
The foundational techniques of Deep Reinforcement Learning
How to implement a state-of-the-art AI model that excels at the most challenging virtual applications

Requirements

* Some math basics, such as knowing what differentiation or a gradient is
* A bit of programming knowledge (classes and objects)

Description

Welcome to Deep Reinforcement Learning 2.0!

In this course, we will learn and implement a powerful new AI model called the Twin-Delayed DDPG (TD3), which combines state-of-the-art techniques in Artificial Intelligence, including continuous Double Deep Q-Learning, Policy Gradient, and Actor-Critic. The model is strong enough that, for the first time in our courses, we can solve the most challenging virtual AI applications: training an ant/spider and a half-humanoid to walk and run across a field.
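
To make the "twin" idea a little more concrete, here is a minimal sketch of how the learning target for the two critics is commonly described for TD3: add clipped noise to the target actor's action, evaluate both target critics, and keep the smaller of the two Q-values. This follows the generally published formulation and is not necessarily the code written in the course. The function name td3_target, the callable arguments, and the default hyperparameter values are assumptions chosen for illustration.

```python
import numpy as np


def td3_target(reward, done, next_state, actor_target, critic1_target,
               critic2_target, max_action, gamma=0.99, policy_noise=0.2,
               noise_clip=0.5, rng=None):
    """Return the shared critic target for a batch of transitions.

    actor_target, critic1_target and critic2_target are hypothetical
    callables standing in for the target networks.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Target policy smoothing: perturb the target action with clipped noise.
    next_action = actor_target(next_state)
    noise = rng.normal(0.0, policy_noise, size=next_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -max_action, max_action)

    # Clipped double Q-learning: take the minimum of the twin critics
    # to limit overestimation of the Q-value.
    q1 = critic1_target(next_state, next_action)
    q2 = critic2_target(next_state, next_action)
    min_q = np.minimum(q1, q2)

    # No bootstrapping from terminal states (done == 1).
    return reward + gamma * (1.0 - done) * min_q
```

The "delayed" part of the name refers to updating the actor and the target networks less often than the critics (for example, once every two critic updates), which this sketch does not show.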

To approach this model the right way, we structured the course in three parts:

  • Part 1: Fundamentals
    In this part, we will study the fundamentals of Artificial Intelligence that you need to understand and master the model taught in this course. These include Q-Learning, Deep Q-Learning, Policy Gradient, Actor-Critic, and more.

  • Part 2: The Twin-Delayed DDPG Theory
    We will study the whole theory behind the model in depth. You will see the entire construction and training process of the AI through a series of clear visualization slides. Not only will you learn the theory in detail, but you will also build a strong intuition of how the AI learns and works. The fundamentals in Part 1, combined with the detailed theory of Part 2, will make this highly advanced model accessible to you, and you will eventually be one of the very few people who can master this model.

  • Part 3: The Twin-Delayed DDPG Implementation
    We will implement the model from scratch, step by step, through interactive sessions, a new feature of this course that has you practice on many coding exercises while we implement the model. By doing them, you will follow the course actively rather than passively, which lets you improve your skills effectively. Last but not least, we will do the whole implementation on Colaboratory (Google Colab), a free, browser-based notebook platform that lets you code and train AIs without installing any packages on your machine. In other words, you can be confident that when you press the execute button, the AI will start to train, and you will get the videos of the spider and humanoid running at the end.
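
As a taste of the Part 1 fundamentals, here is a minimal sketch of the tabular Q-Learning update, the temporal-difference rule that the deeper models build on. It is an illustration under stated assumptions, not the course's own code: the function name q_learning_update, the NumPy table layout (one row per state, one column per action), and the default values of alpha and gamma are all hypothetical choices.

```python
import numpy as np


def q_learning_update(q_table, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.99):
    """Apply one Q-Learning update to q_table in place; return the TD error."""
    # Bootstrap from the best next action, unless the episode has ended.
    best_next = 0.0 if done else np.max(q_table[next_state])
    td_target = reward + gamma * best_next

    # Temporal difference: gap between the target and the current estimate.
    td_error = td_target - q_table[state, action]

    # Move the current estimate a fraction alpha toward the target.
    q_table[state, action] += alpha * td_error
    return td_error
```

Calling this function once per observed transition (state, action, reward, next state) gradually fills in the table; Deep Q-Learning replaces the table with a neural network.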

Who this course is for:

  • Data Scientists who want to take their AI skills to the next level
  • AI experts who want to expand their range of applications
  • Engineers who work in technology and automation
  • Business professionals and companies who want to get ahead of the game
  • Students in tech-related programs who want to pursue a career in Data Science, Machine Learning, or Artificial Intelligence
  • Anyone passionate about Artificial Intelligence

Course content

8 sections • 63 lectures

Welcome 15:12

Some resources before we start 00:37

BONUS: Learning Path 00:33

Q-Learning 10:25

Deep Q-Learning 06:54

Policy Gradient 06:35

Actor-Critic 04:05

Taxonomy of AI models 07:48

BONUS: 5 Advantages of DRL 00:45

BONUS: RL Algorithms Map 00:32

Get the materials 00:04

Introduction and Initialization 14:27

The Q-Learning part 18:42

The Policy Learning part 13:39

The whole training process 03:51

The whole code folder of the course with all the implementations 00:18

Beginning 05:36

Implementation - Step 1 15:46

Implementation - Step 2 15:12

Implementation - Step 3 13:55

Implementation - Step 4 14:09

Implementation - Step 5 11:03

Implementation - Step 6 09:43

Implementation - Step 7 04:26

Implementation - Step 8 07:44

Implementation - Step 9 03:55

Implementation - Step 10 04:08

Implementation - Step 11 07:33

Implementation - Step 12 04:06

Implementation - Step 13 05:31

Implementation - Step 14 06:54

Implementation - Step 15 14:20

Implementation - Step 16 08:54

Implementation - Step 17 06:11

Implementation - Step 18 13:30

Implementation - Step 19 11:46

Implementation - Step 20 05:11

Plan of Attack 02:51

The Neuron 16:15

The Activation Function 08:29

How do Neural Networks Work? 12:47

How do Neural Networks Learn? 12:58

Gradient Descent 10:12

Stochastic Gradient Descent 08:44

Backpropagation 05:21

Plan of Attack 04:04

What is Reinforcement Learning? 11:26

The Bellman Equation 18:25

The Plan 02:12

Markov Decision Process 16:27

Policy vs Plan 12:55

Living Penalty 09:47

Q-Learning Intuition 14:45

Temporal Difference 19:27

Q-Learning Visualization 13:31

Plan of Attack 02:17

Deep Q-Learning Intuition - Step 1 15:15

Deep Q-Learning Intuition - Step 2 06:06

Experience Replay 15:45

Action Selection Policies 16:23