Section 1: Reinforcement Learning
Commentary
Section Goals
- To introduce the basic concepts and methods of reinforcement learning.
- To discuss several algorithms and applications of reinforcement learning.
Learning Objectives
Learning Objective 1
- Outline the problem description and representation of reinforcement learning.
- Explain how reinforcement learning makes use of and calculates rewards and utilities.
- Describe the TD, ADP, and Q-learning algorithms.
- Discuss possible applications of reinforcement learning.
- Explain the following concepts or terms:
- Reinforcement learning
- Active learning
- Passive learning
- Q-learning
- Direct utility estimation
- Adaptive dynamic programming (ADP)
- Temporal difference (TD)
- Exploration function
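As a concrete anchor for the temporal difference (TD) term listed above, here is a minimal sketch of a single passive TD update of a utility estimate. It assumes the common form U(s) <- U(s) + alpha * (R(s) + gamma * U(s') - U(s)). The function name `td_update`, the dictionary representation of utilities, and the values of `alpha` and `gamma` are illustrative assumptions, not code from the textbook.

```python
def td_update(U, s, r, s_next, alpha=0.1, gamma=0.9):
    """Apply one TD update to the utility estimate of state s.

    U       dict mapping states to utility estimates (unseen states count as 0)
    r       reward observed in state s
    s_next  state observed after the transition
    alpha   learning rate (assumed constant here)
    gamma   discount factor
    """
    old = U.get(s, 0.0)
    # Move the estimate toward the one-step sample r + gamma * U(s_next).
    U[s] = old + alpha * (r + gamma * U.get(s_next, 0.0) - old)
    return U


# Hypothetical single observation: reward -0.04 in state "A", then state "B".
U = {}
td_update(U, "A", -0.04, "B")
```

Unlike ADP, this update needs no transition model: it adjusts the estimate using only the observed successor state.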
Objective Readings
Required readings:
Reading topics:
Reinforcement Learning (see Chapter 21 of AIMA3ed)
Supplemental Readings
Sutton, R. S., and Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
Price, B., and Boutilier, C. (2003). Accelerating reinforcement learning through implicit imitation. Journal of Artificial Intelligence Research, 19, 569-629.
Objective Questions
- What are the main differences between the ADP and TD approaches, with respect to algorithms and performance?
Objective Activities
- Search the Internet for three papers about reinforcement learning that are at your level of knowledge, and discuss them in the online course conference.
- Explore the source code on the textbook's website for the following reinforcement learning algorithms covered in this section:
- Passive-ADP-Agent
- Passive-TD-Agent
- Q-Learning-Agent
- Complete Exercise 21.2 of AIMA3ed.
- Complete Exercise 21.10 of AIMA3ed.
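Before reading the textbook's Q-Learning-Agent code, it may help to see the idea in a small self-contained form. The sketch below is not the textbook's implementation: it assumes rewards are received on transitions, uses a constant learning rate, and breaks ties at random. The chain environment, the names `exploration_f`, `q_learning`, and `chain_step`, and all parameter values are hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def exploration_f(u, n, r_plus=1.0, n_e=5):
    """Return the optimistic value r_plus until a state-action pair has been
    tried n_e times, and the current estimate u afterwards."""
    return r_plus if n < n_e else u


def q_learning(step, actions, start, is_terminal,
               episodes=200, alpha=0.1, gamma=0.9, max_steps=100, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)  # Q[(state, action)] -> estimated action value
    N = defaultdict(int)    # N[(state, action)] -> number of times tried
    for _ in range(episodes):
        s = start
        for _ in range(max_steps):
            if is_terminal(s):
                break
            # Pick the action that looks best under the exploration function,
            # breaking ties at random.
            scores = [exploration_f(Q[(s, b)], N[(s, b)]) for b in actions]
            best = max(scores)
            a = rng.choice([b for b, sc in zip(actions, scores) if sc == best])
            N[(s, a)] += 1
            s_next, r = step(s, a)
            # Terminal states have no future value.
            best_next = 0.0 if is_terminal(s_next) else max(
                Q[(s_next, b)] for b in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s_next
    return Q


def chain_step(s, a):
    """Toy deterministic chain 0-1-2-3; entering state 3 yields reward 1."""
    s_next = min(s + 1, 3) if a == "right" else max(s - 1, 0)
    return s_next, (1.0 if s_next == 3 else 0.0)


Q = q_learning(chain_step, ["left", "right"], start=0,
               is_terminal=lambda s: s == 3)
```

In this toy chain the learned value of moving right from state 2 should end up higher than that of moving left, because only the right move reaches the rewarding state directly. The exploration function forces each action to be tried a few times before the agent starts trusting its estimates.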