Athabasca University

Section 1: Reinforcement learning

Commentary

Section Goals

  • To introduce the basic concepts and methods of reinforcement learning.
  • To discuss several algorithms and applications of reinforcement learning.

Learning Objectives

Learning Objective 1

  • Outline the problem description and representation of reinforcement learning.
  • Explain how reinforcement learning makes use of and calculates rewards and utilities.
  • Describe the TD, ADP, and Q-learning algorithms.
  • Discuss possible applications of reinforcement learning.
  • Explain the following concepts or terms:
    • Reinforcement learning
    • Active learning
    • Passive learning
    • Q-learning
    • Direct utility estimation
    • Adaptive dynamic programming (ADP)
    • Temporal difference (TD)
    • Exploration function

Objective Readings

Required readings:

Reading topics:

Reinforcement Learning (see Chapter 21 of AIMA3ed)

Supplemental Readings

Sutton, R. S., and Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.

Price, B., and Boutilier, C. (2003). Accelerating reinforcement learning through implicit imitation. Journal of Artificial Intelligence Research, 19, 569-629.

Objective Questions

  • What are the main differences between the ADP and TD approaches, with respect to algorithms and performance?

Objective Activities

  • Explore the Internet to find three papers about graphical models that are at your level of knowledge, and discuss them in the online course conference.
  • Explore the source code, available from the textbook's website, for the following reinforcement learning algorithms related to this section:
    • Passive-ADP-Agent
    • Passive-TD-Agent
    • Q-Learning-Agent
  • Complete Exercise 21.2 of AIMA3ed.
  • Complete Exercise 21.10 of AIMA3ed.
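
As a warm-up before reading the textbook's agent code, the update rules at the heart of the passive TD agent and the Q-learning agent can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the textbook's implementation: the function names `td_update` and `q_update`, the dictionary-based tables, the fixed learning rate `alpha`, the discount `gamma`, and the convention of passing the observed reward as an argument are all illustrative choices.

```python
from collections import defaultdict


def td_update(U, s, reward, s_next, alpha=0.1, gamma=0.9):
    """Temporal-difference update of a utility table (hypothetical helper).

    Moves U[s] a fraction alpha toward the one-step target
    reward + gamma * U[s_next].
    """
    U[s] += alpha * (reward + gamma * U[s_next] - U[s])
    return U[s]


def q_update(Q, s, a, reward, s_next, actions, alpha=0.1, gamma=0.9):
    """Tabular Q-learning update (hypothetical helper).

    Q is keyed by (state, action) pairs; the target uses the best
    action value available in the successor state.
    """
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


# Usage: tables default to 0.0 for unseen states and state-action pairs.
U = defaultdict(float)
Q = defaultdict(float)
td_update(U, "A", 0.0, "B")
q_update(Q, "A", "right", 1.0, "B", ["left", "right"])
```

Neither update needs a transition model, which is one point of contrast with ADP worth keeping in mind when comparing the approaches. An exploration function and a decaying learning rate are deliberately left out of this sketch.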

Updated November 17, 2015 by FST Course Production Staff