This lecture is part of the Intelligent Systems course series offered at the University of Tübingen by the MPI for Intelligent Systems.
Dates:
Fri. 12 c.t. - 14: Lecture in A301,
Fri. 14 c.t. - 16: Recitation in A301.
Exam: Friday, February 8th, 2019, 12 c.t. - 16, Informatik/Kriminologie, Hörsaal 1 (F119).
See also the entry in "Campus Verwaltung".
Course description:
The course provides theoretical and practical knowledge of reinforcement learning, a field of machine learning that is well suited to robotic applications. We start with a brief overview of supervised learning, model selection, etc., and spend most of the time on reinforcement learning. The exercises will help you get hands-on experience with the methods and deepen your understanding.
People:
- Instructor: Dr. Georg Martius
- Instructor: Dr. Jia-Jie Zhu
- Graduate Assistant: Sebastian Blaes
Course materials:
- Lecture: Slides-1a, Slides-1b, Background reading: C.M. Bishop, Pattern Recognition and Machine Learning, Chap. 3
- Lecture: Slides-2, Slides-2 4on1, Background reading: C.M. Bishop, Pattern Recognition and Machine Learning, Chap. 5
- Lecture: Slides-3, Slides-3 4on1, Background reading: Sutton & Barto, Reinforcement Learning: An Introduction, for the next few lectures
- Lecture: Slides-4a, Slides-4a 4on1
- Part b: Dynamic programming. Reading: chapter 4 of the Sutton & Barto textbook.
- Lecture: Dynamic programming (continued)
- Lecture: Prediction and Control Slides-7a, Slides-7b (split between this lecture and the next)
- Lecture: Prediction and Control (continued)
- Lecture: Value function approximation.
- Reading: Sections 9.1-9.4, 9.5.1, 9.7, and 9.8 of the Sutton & Barto textbook.
- Supplementary material: DQN paper 1, paper 2, NFQ paper.
- Lecture: DQN and Policy gradient: Slides-8, Slides-9
- Lecture: Actor Critic
- Lecture: Black Box Optimization Slides-11
- Lecture: Monte Carlo Tree Search and AlphaGo
- Lecture: Optimal control and model-based methods: slides
Final Project:
We updated the PDF with additional information about the report, including the minimum/maximum number of pages and the submission deadline for the report and code.
The final project is described here: project.pdf (Version 4.1: 08.01.2019). The document will be updated. The environment code is at github.com/martius-lab/laser-hockey-env (see the usage sketch below).
The client (v1.0: 06.01.2019) for the final tournament can be downloaded from here.
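For orientation, below is a minimal sketch of how an agent might interact with the project environment. The import path `laserhockey`, the class name `HockeyEnv`, and the exact reset/step signatures are assumptions based on a standard Gym-style interface; please check the repository's README for the actual API.

```python
# Minimal interaction sketch for the final-project environment.
# NOTE: the module path and class name below are assumptions; consult
# github.com/martius-lab/laser-hockey-env for the actual names and API.
import numpy as np

from laserhockey import hockey_env  # hypothetical module name

env = hockey_env.HockeyEnv()        # hypothetical class name

obs = env.reset()
done = False
total_reward = 0.0
while not done:
    # Replace the random action with your trained agent's policy.
    action = np.random.uniform(-1.0, 1.0, size=env.action_space.shape)
    obs, reward, done, info = env.step(action)
    total_reward += reward

print("episode return:", total_reward)
env.close()
```

The same loop structure carries over to training: collect transitions (obs, action, reward, next_obs, done) from `env.step` and feed them to whichever RL method you implement for the project.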
Related readings:
- Sutton & Barto, Reinforcement Learning: An Introduction
- Bertsekas, Dynamic Programming and Optimal Control, Vol. 1
- Bishop, Pattern Recognition and Machine Learning