Description
Reinforcement learning deals with learning to make decisions under uncertainty. Recently, it has caught the attention of the public by achieving superhuman performance on Atari games, Go, chess, and the Rubik's cube. With roots in industrial operations, it is now being used in many different fields, including chemistry, physics, robotics, and quantum computing. Facilitating this success are deep learning and search.
Deep learning is a class of powerful machine learning algorithms based on artificial neural networks. Using deep learning, we can train computers to recognize images, recognize speech, translate languages, solve problems in the natural sciences, and even generate new data. Most of this can be done using raw data, such as pixels. Reinforcement learning leverages deep learning for prediction, control, and learning how to model the world. The combination of deep learning and reinforcement learning is known as deep reinforcement learning.
Search allows one to "think" before making decisions by anticipating the outcomes of actions. Even for problems where the number of possible configurations is greater than what one could explore in a lifetime, we can use search to prioritize more promising configurations over less promising ones. In order to do this well, deep reinforcement learning has been used to learn which configurations to prioritize. Search has an essential role in many tasks, such as playing Go, solving the Rubik's cube, designing chemical structures, robotics, cryptography, and the optimization of artificial neural networks.
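As a rough illustration of prioritizing promising configurations, the sketch below runs a greedy best-first search on a toy problem: reach a goal number from 1 using the moves "add 1" and "double". The toy problem, the names `best_first_search`, `neighbors`, and `priority`, and the hand-written scoring rule are all assumptions made for this example; in deep reinforcement learning the scoring function would typically be learned rather than written by hand.

```python
import heapq

def best_first_search(start, goal, neighbors, priority, max_expansions=10_000):
    """Expand the most promising configuration first.

    `priority` scores a configuration; a lower score means more promising.
    Returns a path from start to goal, or None if the budget runs out.
    """
    frontier = [(priority(start), start)]   # min-heap ordered by score
    parent = {start: None}                  # also serves as the visited set
    expansions = 0
    while frontier and expansions < max_expansions:
        _, state = heapq.heappop(frontier)
        expansions += 1
        if state == goal:
            # Walk parent links back to the start to recover the path.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in neighbors(state):
            if nxt not in parent:
                parent[nxt] = state
                heapq.heappush(frontier, (priority(nxt), nxt))
    return None

GOAL = 37  # hypothetical target for the toy problem

def neighbors(n):
    # Two available moves: add one, or double.
    return [n + 1, n * 2]

def priority(n):
    # Hand-written stand-in for a learned scoring function:
    # configurations closer to the goal are explored first.
    return abs(GOAL - n)

path = best_first_search(1, GOAL, neighbors, priority)
print(path)
```

Greedy best-first search does not guarantee the shortest path; it only shows how a scoring function steers the search toward a small part of a very large space.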
Schedule
Week 1: Introduction and Markov Decision Processes
Week 2: Dynamic Programming and Model-Free Prediction
Week 3: Monte Carlo Methods
Week 4: Temporal Difference Methods and Linear Function Approximation
Week 5: Deep Learning
Week 6: Approximate Value Iteration and A* Search
Week 7: Deep Q-Networks and Monte Carlo Tree Search
Week 8: Policy Gradients and Actor-Critic Methods
Week 9: Advanced Policy Gradients and Exploration vs Exploitation
Week 10: Model-Based Reinforcement Learning and Search
Week 11: Partial Observability
Week 12: Gradient-Free Methods and Meta-Learning
Week 13: Paper Presentations
Week 14: Project Presentations
Week 15: Project Presentations