Jefferson Provost's Publications

Reinforcement Learning in High-Diameter, Continuous Environments

Jefferson Provost. Reinforcement Learning in High-Diameter, Continuous Environments. Ph.D. Thesis, The University of Texas at Austin, Austin, Texas, 2007.

Download

[PDF] 2.1MB

Abstract

Many important real-world robotic tasks have high diameter, that is, their solution requires a large number of primitive actions by the robot. For example, they may require navigating to distant locations using primitive motor control commands. In addition, modern robots are endowed with rich, high-dimensional sensory systems, providing measurements of a continuous environment. Reinforcement learning (RL) has shown promise as a method for automatic learning of robot behavior, but current methods work best on low-diameter, low-dimensional tasks. Because of this problem, the success of RL on real-world tasks still depends on human analysis of the robot, environment, and task to provide a useful set of perceptual features and an appropriate decomposition of the task into subtasks.

This thesis presents Self-Organizing Distinctive-state Abstraction (SODA) as a solution to this problem. Using SODA, a robot with little prior knowledge of its sensorimotor system, environment, and task can automatically reduce the effective diameter of its tasks. First it uses a self-organizing feature map to learn higher-level perceptual features while exploring using primitive, local actions. Then, using the learned features as input, it learns a set of high-level actions that carry the robot between perceptually distinctive states in the environment.

Experiments in two robot navigation environments demonstrate that SODA learns useful features and high-level actions, that using these new actions dramatically speeds up learning for high-diameter navigation tasks, and that the method scales to large (building-sized) robot environments. These experiments demonstrate SODA's effectiveness as a generic learning agent for mobile robot navigation, pointing the way toward developmental robots that learn to understand themselves and their environments through experience in the world, reducing the need for human engineering for each new robotic application.
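The first step named in the abstract, learning perceptual features with a self-organizing feature map, can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a plain Kohonen-style map with a one-dimensional chain of units, and the names `train_som` and `best_unit`, the parameters, and the toy data are all hypothetical choices made for illustration.

```python
import math
import random


def best_unit(units, x):
    """Return the index of the unit whose weight vector is closest to x."""
    return min(
        range(len(units)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(units[i], x)),
    )


def train_som(samples, n_units=4, epochs=50, lr=0.5, radius=1.0, seed=0):
    """Train a 1-D chain of units on the samples (illustrative sketch only)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Assumption: inputs are scaled to [0, 1], so weights start in that range.
    units = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs  # shrinks learning rate and neighborhood
        sigma = radius * decay + 1e-3
        for x in samples:
            win = best_unit(units, x)
            for i, w in enumerate(units):
                # Gaussian neighborhood over positions along the chain.
                h = math.exp(-((i - win) ** 2) / (2 * sigma ** 2))
                for d in range(dim):
                    w[d] += lr * decay * h * (x[d] - w[d])
    return units


if __name__ == "__main__":
    # Toy "sensor readings": two loose groups of 2-D points in [0, 1].
    readings = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
    som = train_som(readings)
    # The winning unit index serves as a discrete perceptual feature.
    print([best_unit(som, r) for r in readings])
```

In a sketch like this, the index of the winning unit plays the role of a learned, discrete perceptual feature; per the abstract, the second step would then learn high-level actions using such features as input.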

BibTeX

@PhdThesis{provost-phd07,
  author = 	 {Jefferson Provost},
  title = 	 {Reinforcement Learning in High-Diameter, Continuous Environments},
  school = 	 {The University of Texas at Austin},
  year = 	 2007,
  type =	 {{Ph.D.} Dissertation},
  address =	 {Austin, Texas},
  month =	 {August},
  abstract = {Many important real-world robotic tasks have high
  diameter, that is, their solution requires a large number of
  primitive actions by the robot.  For example, they may require
  navigating to distant locations using primitive motor control
  commands.  In addition, modern robots are endowed with rich,
  high-dimensional sensory systems, providing measurements of a
  continuous environment.  Reinforcement learning (RL) has shown
  promise as a method for automatic learning of robot behavior, but
  current methods work best on low-diameter, low-dimensional tasks.
  Because of this problem, the success of RL on real-world tasks still
  depends on human analysis of the robot, environment, and task to
  provide a useful set of perceptual features and an appropriate
  decomposition of the task into subtasks.
This thesis presents Self-Organizing Distinctive-state Abstraction
(SODA) as a solution to this problem. Using SODA, a robot with little
prior knowledge of its sensorimotor system, environment, and task can
automatically reduce the effective diameter of its tasks.  First it
uses a self-organizing feature map to learn higher-level perceptual
features while exploring using primitive, local actions.  Then, using
the learned features as input, it learns a set of high-level actions
that carry the robot between perceptually distinctive states in the
environment.
Experiments in two robot navigation environments demonstrate that SODA
learns useful features and high-level actions, that using these new
actions dramatically speeds up learning for high-diameter navigation
tasks, and that the method scales to large (building-sized) robot
environments.  These experiments demonstrate SODA's effectiveness as a
generic learning agent for mobile robot navigation, pointing the way
toward developmental robots that learn to understand themselves and
their environments through experience in the world, reducing the need
for human engineering for each new robotic application.}
}

Generated by bib2html.pl (written by Patrick Riley) on Fri Nov 02, 2007 19:46:51