PhD Thesis #9123
|Classifier Systems for Situated Autonomous Learning.
|The ability to learn from experience is a key aspect of intelligence. Incorporating this ability into a computer is a formidable problem. Genetic algorithms coupled to learning classifier systems are powerful tools for tackling this task. While genetic algorithms can be shown to be near-optimal solutions to the search task they perform, no similar proof exists for classifier systems. My research investigated two aspects of classifier systems: classifier selection and credit assignment. Explicit world models, lookahead and incremental planning are incorporated into the classifier system framework in order to make use of more of the information available to the system, and a more sophisticated approach to credit assignment is attempted. The investigation involved the construction of four different classifier systems, and the testing of each of these systems in three separate virtual worlds. Wilson's Animat research was carefully reconstructed and used as the control in an experiment testing the efficacy of the various strategies embodied in three experimental systems. The three experimental classifier systems all contained explicit world models and lookahead. One was an extension of Wilson's Animat; the other two involved an entirely new credit assignment scheme inspired by Watkins's Q-learning technique. Use of this technique enabled the incorporation of an incremental planner, similar to Sutton's Dyna-Q research, into one of the classifier systems, distinguishing it from the other Q-learning-based classifier system. The research shows that the use of explicit world models and lookahead significantly decreases the time required to discover paths to well-rewarded goals. It also shows that incremental planning can be used to further increase learning speed. While the experimental classifier systems were quick at discovery, they did not necessarily exploit these discoveries. Because of this, the performan
|NO ONLINE COPY