Deep Reinforcement Learning and Search

Description

Reinforcement learning deals with learning to make decisions under uncertainty. Recently, it has caught the attention of the public by achieving superhuman performance on Atari games, Go, chess, and the Rubik's Cube. With roots in industrial operations, it is now being used in many different fields, including chemistry, physics, robotics, and quantum computing. Deep learning and search have facilitated this success.

Deep learning is a class of powerful machine learning algorithms based on artificial neural networks. Using deep learning, we can train computers to recognize images, recognize speech, translate languages, solve problems in the natural sciences, and even generate new data. Most of this can be done using raw data, such as pixels. Reinforcement learning leverages deep learning for prediction, control, and learning how to model the world. The combination of deep learning and reinforcement learning is known as deep reinforcement learning.

Search allows one to "think" before making decisions by anticipating the outcomes of actions. Even for problems where the number of possible configurations is greater than what one could explore in a lifetime, we can use search to prioritize more promising configurations over less promising ones. To do this well, deep reinforcement learning has been used to learn which configurations to prioritize. Search has an essential role in many tasks, such as playing Go, solving the Rubik's Cube, designing chemical structures, robotics, cryptography, and the optimization of artificial neural networks.
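As a rough illustration of prioritizing promising configurations, the following is a minimal sketch of best-first search in Python. It is not course material: the toy problem (reach a target number by adding 1 or doubling), the names best_first_search, toy_successors, and toy_priority, and the hand-written priority function are all assumptions made for this example. In deep reinforcement learning, a learned estimate would take the place of the hand-written priority.

```python
import heapq


def best_first_search(start, is_goal, successors, priority):
    """Expand states in order of increasing priority(state).

    Returns the list of states from start to a goal, or None if no goal
    is reachable. Lower priority values are treated as more promising.
    """
    frontier = [(priority(start), 0, start)]
    parents = {start: None}
    counter = 1  # tie-breaker so the heap never compares states directly
    while frontier:
        _, _, state = heapq.heappop(frontier)
        if is_goal(state):
            # Walk back through the parent links to recover the path.
            path = []
            while state is not None:
                path.append(state)
                state = parents[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parents:
                parents[nxt] = state
                heapq.heappush(frontier, (priority(nxt), counter, nxt))
                counter += 1
    return None


TARGET = 21  # hypothetical goal for the toy problem


def toy_successors(n):
    # Toy actions: add 1 or double. The cap keeps the state space finite.
    return [m for m in (n + 1, n * 2) if m <= 2 * TARGET]


def toy_priority(n):
    # Hand-written stand-in for a learned estimate of how close n is to the goal.
    return abs(TARGET - n)


path = best_first_search(1, lambda n: n == TARGET, toy_successors, toy_priority)
print(path)
```

Because the frontier is ordered by the priority function alone, this sketch finds a path but not necessarily the shortest one. Methods such as A* search additionally account for the cost already paid to reach each state.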

Schedule

Week 1: Introduction and Markov Decision Processes

Week 2: Dynamic Programming and Model-Free Prediction

Week 3: Monte Carlo Methods

Week 4: Temporal Difference Methods and Linear Function Approximation

Week 5: Deep Learning

Week 6: Approximate Value Iteration and A* Search

Week 7: Deep Q-Networks and Monte Carlo Tree Search

Week 8: Policy Gradients and Actor-Critic Methods

Week 9: Advanced Policy Gradients and Exploration vs. Exploitation

Week 10: Model-Based Reinforcement Learning and Search

Week 11: Partial Observability

Week 12: Gradient-Free Methods and Meta-Learning

Week 13: Paper Presentations

Week 14: Project Presentations

Week 15: Project Presentations