This lecture is part of the Intelligent Systems course series offered at the University of Tübingen by the MPI for Intelligent Systems.
Dates:
Fri. 12 c.t. - 14: Lecture in A301,
Fri. 14 c.t. - 16: Recitation in A301.
Exam: Friday, February 8th, 2019, 12 c.t. - 16, Informatik/Kriminologie, Hörsaal 1 (F119).
See also the entry in "Campus Verwaltung".
Course description:
The course provides theoretical and practical knowledge of reinforcement learning, a field of machine learning that is well suited to robotic applications. We start with a brief overview of supervised learning, model selection, etc., and spend most of the time on reinforcement learning. The exercises will help you get hands-on experience with the methods and deepen your understanding.
People:
- Instructor: Dr. Georg Martius
- Instructor: Dr. Jia-Jie Zhu
- Graduate Assistant: Sebastian Blaes
Course materials:
- Lecture: Slides-1a, Slides-1b, Background reading: C.M. Bishop, Pattern Recognition and Machine Learning, Chap. 3
- Lecture: Slides-2, Slides-2 4on1, Background reading: C.M. Bishop, Pattern Recognition and Machine Learning, Chap. 5
- Lecture: Slides-3, Slides-3 4on1, Background reading: Sutton & Barto, Reinforcement Learning: An Introduction, for the next few lectures
- Lecture: Slides-4a, Slides-4a 4on1
- Part b: Dynamic programming. Reading: chapter 4 of the Sutton & Barto textbook.
- Lecture: Dynamic programming (continued)
- Lecture: Prediction and Control Slides-7a, Slides-7b (split between this lecture and the next)
- Lecture: Prediction and Control (continued)
- Lecture: Value function approximation.
- Reading: Sections 9.1-9.4, 9.5.1, 9.7, and 9.8 of the Sutton & Barto textbook.
- Supplementary material: DQN paper 1, paper 2, NFQ paper.
- Lecture: DQN and Policy gradient: Slides-8, Slides-9
- Lecture: Actor Critic
- Lecture: Black Box Optimization Slides-11
- Lecture: Monte Carlo Tree Search and AlphaGo
- Lecture: Optimal control and model-based methods: slides
Final Project:
We updated the PDF with additional information about the report, including the minimum/maximum number of pages and the submission deadline for the report and code.
The final project is described here: project.pdf (Version 4.1: 08.01.2019). The document will be updated. The environment code is at github.com/martius-lab/laser-hockey-env (see the usage sketch below).
The client (v1.0: 06.01.2019) for the final tournament can be downloaded from here.
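For orientation, below is a minimal sketch of how an agent might interact with the project environment. The import path `laserhockey`, the class name `HockeyEnv`, and the exact reset/step signatures are assumptions based on a standard Gym-style interface; please check the repository's README for the actual API.

```python
# Minimal interaction sketch for the final-project environment.
# NOTE: the module path and class name below are assumptions; consult
# github.com/martius-lab/laser-hockey-env for the actual names and API.
import numpy as np

from laserhockey import hockey_env  # hypothetical module name

env = hockey_env.HockeyEnv()        # hypothetical class name

obs = env.reset()
done = False
total_reward = 0.0
while not done:
    # Replace the random action with your trained agent's policy.
    action = np.random.uniform(-1.0, 1.0, size=env.action_space.shape)
    obs, reward, done, info = env.step(action)
    total_reward += reward

print("episode return:", total_reward)
env.close()
```

The same loop structure carries over to training: collect transitions (obs, action, reward, next_obs, done) from `env.step` and feed them to whichever RL method you implement for the project.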
Related readings:
- Sutton & Barto, Reinforcement Learning: An Introduction
- Bertsekas, Dynamic Programming and Optimal Control, Vol. 1
- Bishop, Pattern Recognition and Machine Learning