Abstract

We describe how an agent can dynamically and incrementally determine the structure of a value function from background knowledge as a side effect of problem solving. The agent determines the value function as it performs the task, using background knowledge in novel situations to compute an expected value for decision making. That expected value becomes the initial estimate of the value function, and the features tested by the background knowledge form the structure of the value function. This approach is implemented in Soar using its existing mechanisms: preference-based decision making, impasse-driven subgoaling, explanation-based rule learning (chunking), and reinforcement learning. We evaluate the approach on a multiplayer dice game using three different types of background knowledge.
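To make the core idea concrete, the following is a minimal sketch, not Soar's actual implementation: when the agent encounters a state-action pair with no learned value, it evaluates the action with background knowledge, seeds the value-function entry with the resulting expected value, and keys that entry on the features the knowledge actually tested (analogous to chunking a new RL rule whose conditions are those features). All names here (`SeededValueFunction`, `evaluate`, etc.) are hypothetical.

```python
from typing import Callable, Hashable, Tuple, FrozenSet

class SeededValueFunction:
    """Value function whose entries are created lazily from background knowledge."""

    def __init__(self,
                 evaluate: Callable[[dict, str], Tuple[float, FrozenSet[Hashable]]],
                 alpha: float = 0.1):
        # evaluate(state, action) -> (expected_value, features_tested)
        # stands in for the agent's background knowledge.
        self.evaluate = evaluate
        self.alpha = alpha   # learning rate for later RL updates
        self.q = {}          # (features_tested, action) -> learned value

    def value(self, state: dict, action: str) -> float:
        expected, features = self.evaluate(state, action)
        key = (features, action)
        # Novel situation: the expected value computed from background
        # knowledge becomes the initial estimate, and the tested features
        # form the structure (the key) of the value function.
        if key not in self.q:
            self.q[key] = expected
        return self.q[key]

    def update(self, state: dict, action: str, target: float) -> None:
        # Standard temporal-difference-style adjustment of the seeded value.
        _, features = self.evaluate(state, action)
        key = (features, action)
        self.q[key] += self.alpha * (target - self.q[key])
```

In this sketch the dictionary key plays the role of a learned RL rule's conditions: only the features the background knowledge consulted distinguish one value entry from another, so the value function's structure grows incrementally as a side effect of problem solving.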