Sunday, March 4, 2012

Balancing Exploration and Exploitation in Agent Learning




Controlling the ratio of exploration to exploitation when agents learn in dynamic environments remains a challenge in applying agent-learning techniques. Simulations that represent human behavior need methods that control this ratio in a human-like way, so that agent-learning mechanisms are constrained much as human cognition is observed to be. The Cultural Geography (CG) model, under development at TRAC Monterey, is an agent-based social simulation. Because it simulates a wide variety of situations and scenarios, a dynamic ratio between exploration and exploitation makes agents' decisions more sensible. As part of an attempt to improve the model, this thesis investigates several techniques for enhancing the exploration-exploitation balance. The work includes designed experiments covering a range of factors in multiple environments, along with statistical analysis of the results. The main finding is that in small environments, techniques based on subjective utility give better results in short runs, while techniques based on time obtain higher utilities than the other techniques in long runs. In larger, more complex environments, a combined technique performed better in long runs.
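To make the idea of a time-based exploration-exploitation ratio concrete, here is a minimal Python sketch. It is not taken from the thesis or from the CG model: the multi-armed bandit setting, the decay schedule, and every name in it (`time_based_epsilon`, `run_bandit`, `eps_start`, `eps_min`, `decay`) are assumptions chosen purely for illustration.

```python
import random


def time_based_epsilon(step, eps_start=1.0, eps_min=0.05, decay=0.01):
    """Hypothetical schedule: the exploration rate shrinks as the run gets longer."""
    return max(eps_min, eps_start / (1.0 + decay * step))


def run_bandit(true_means, steps, seed=0):
    """Epsilon-greedy agent on a toy bandit, with epsilon driven only by elapsed time."""
    rng = random.Random(seed)
    n_actions = len(true_means)
    counts = [0] * n_actions        # how often each action was chosen
    estimates = [0.0] * n_actions   # running mean utility per action
    total_utility = 0.0

    for step in range(steps):
        eps = time_based_epsilon(step)
        if rng.random() < eps:
            # Explore: pick any action uniformly at random.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action with the highest estimated utility.
            action = max(range(n_actions), key=lambda a: estimates[a])

        # Assumed environment: noisy utility around the action's true mean.
        utility = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        estimates[action] += (utility - estimates[action]) / counts[action]
        total_utility += utility

    return estimates, counts, total_utility


if __name__ == "__main__":
    estimates, counts, total = run_bandit([0.0, 0.5, 2.0], steps=5000)
    print("times each action was chosen:", counts)
    print("estimated utilities:", [round(e, 2) for e in estimates])
```

Early in a run the agent explores almost constantly, and later it mostly exploits what it has learned. A technique based on subjective utility would instead tie the exploration rate to the agent's own utility estimates rather than to the step count; that variant is not shown here.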

