Reinforcement Learning Using Local Adaptive Models
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology. ORCID iD: 0000-0002-9267-2191
1995 (English). Licentiate thesis, monograph (Other academic)
Abstract [en]

In this thesis, the theory of reinforcement learning is described and its relation to learning in biological systems is discussed. Some basic issues in reinforcement learning, the credit assignment problem and perceptual aliasing, are considered. The methods of temporal difference are described. Three important design issues are discussed: information representation and system architecture, rules for improving the behaviour and rules for the reward mechanisms. The use of local adaptive models in reinforcement learning is suggested and exemplified by some experiments. This idea is behind all the work presented in this thesis. A method for learning to predict the reward called the prediction matrix memory is presented. This structure is similar to the correlation matrix memory but differs in that it is not only able to generate responses to given stimuli but also to predict the rewards in reinforcement learning. The prediction matrix memory uses the channel representation, which is also described. A dynamic binary tree structure that uses the prediction matrix memories as local adaptive models is presented. The theory of canonical correlation is described and its relation to the generalized eigenproblem is discussed. It is argued that the directions of canonical correlations can be used as linear models in the input and output spaces respectively in order to represent input and output signals that are maximally correlated. It is also argued that this is a better representation in a response generating system than, for example, principal component analysis since the energy of the signals has nothing to do with their importance for the response generation. An iterative method for finding the canonical correlations is presented. Finally, the possibility of using the canonical correlation for response generation in a reinforcement learning system is indicated.
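The abstract's argument that canonical correlation suits response generation better than principal component analysis can be illustrated with a small sketch. This is not the thesis's iterative method: it assumes a two-dimensional input and a scalar output, in which case the generalized eigenproblem Cxy Cyy^{-1} Cyx w = rho^2 Cxx w has the closed-form solution w proportional to Cxx^{-1} Cxy. The synthetic data and all function names are hypothetical and chosen only for illustration.

```python
import math
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (n - 1)


def unit(v):
    """Scale a 2-vector to unit length."""
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)


def pca_direction(x1, x2):
    """Unit eigenvector for the largest eigenvalue of the 2x2 input covariance."""
    a, b, d = cov(x1, x1), cov(x1, x2), cov(x2, x2)
    lam = 0.5 * (a + d) + math.sqrt(0.25 * (a - d) ** 2 + b * b)
    # Both candidates are eigenvectors for lam; keep the one with larger norm.
    candidates = [(lam - d, b), (b, lam - a)]
    return unit(max(candidates, key=lambda v: math.hypot(v[0], v[1])))


def cca_direction(x1, x2, y):
    """Input-space canonical direction and correlation for a scalar output y."""
    a, b, d = cov(x1, x1), cov(x1, x2), cov(x2, x2)
    c1, c2 = cov(x1, y), cov(x2, y)
    det = a * d - b * b
    # For scalar y, w = Cxx^{-1} Cxy solves Cxy Cyy^{-1} Cyx w = rho^2 Cxx w.
    w = ((d * c1 - b * c2) / det, (a * c2 - b * c1) / det)
    rho = math.sqrt((c1 * w[0] + c2 * w[1]) / cov(y, y))
    return unit(w), rho


# Assumed toy data: x1 carries most of the signal energy but is irrelevant
# to the output; x2 has little energy but determines y almost completely.
rng = random.Random(0)
n = 2000
x1 = [rng.gauss(0.0, 10.0) for _ in range(n)]
x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [v + rng.gauss(0.0, 0.1) for v in x2]

pca = pca_direction(x1, x2)
cca, rho = cca_direction(x1, x2, y)
print("PCA direction:", pca)
print("CCA direction:", cca, "canonical correlation:", rho)
```

In this toy setting the principal component aligns with the high-energy component x1, while the canonical direction aligns with x2, the component that actually predicts the output.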

Place, publisher, year, edition, pages
Linköping, Sweden: Linköping University, Department of Electrical Engineering, 1995. 119 p.
Linköping Studies in Science and Technology. Thesis, ISSN 0280-7971 ; 507
National Category
Engineering and Technology
URN: urn:nbn:se:liu:diva-53352
Local ID: LiU-Tek-Lic-1995:39
ISBN: 91-7871-590-3
OAI: diva2:288543
Available from: 2010-01-21; Created: 2010-01-20; Last updated: 2014-10-08

Author: Borga, Magnus