Reinforcement Learning Using Local Adaptive Models
Linköpings universitet, Institutionen för systemteknik, Bildbehandling. Linköpings universitet, Tekniska högskolan. ORCID iD: 0000-0002-9267-2191
1995 (English). Licentiate thesis, monograph (Other academic)
Abstract [en]

In this thesis, the theory of reinforcement learning is described and its relation to learning in biological systems is discussed. Some basic issues in reinforcement learning, the credit assignment problem and perceptual aliasing, are considered. The methods of temporal difference are described. Three important design issues are discussed: information representation and system architecture, rules for improving the behaviour and rules for the reward mechanisms. The use of local adaptive models in reinforcement learning is suggested and exemplified by some experiments. This idea is behind all the work presented in this thesis. A method for learning to predict the reward called the prediction matrix memory is presented. This structure is similar to the correlation matrix memory but differs in that it is not only able to generate responses to given stimuli but also to predict the rewards in reinforcement learning. The prediction matrix memory uses the channel representation, which is also described. A dynamic binary tree structure that uses the prediction matrix memories as local adaptive models is presented. The theory of canonical correlation is described and its relation to the generalized eigenproblem is discussed. It is argued that the directions of canonical correlations can be used as linear models in the input and output spaces respectively in order to represent input and output signals that are maximally correlated. It is also argued that this is a better representation in a response generating system than, for example, principal component analysis since the energy of the signals has nothing to do with their importance for the response generation. An iterative method for finding the canonical correlations is presented. Finally, the possibility of using the canonical correlation for response generation in a reinforcement learning system is indicated.
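The abstract mentions temporal difference methods for learning to predict reward. As a rough illustration only, the following is a minimal sketch of textbook tabular TD(0) on a small random walk. It is not the thesis's prediction matrix memory, and the task, the function name `td0_random_walk`, and all parameter values are assumptions chosen for this example.

```python
import random


def td0_random_walk(n_states=5, episodes=20000, alpha=0.02, gamma=1.0, seed=0):
    """Estimate state values of a symmetric random walk with tabular TD(0).

    Hypothetical toy task (not from the thesis): the walk starts in the
    middle state, moves left or right with equal probability, and receives
    reward 1 only when it leaves the chain on the right-hand side.
    """
    rng = random.Random(seed)
    # Indices 0 and n_states + 1 are terminal states; their value stays 0.
    values = [0.0] * (n_states + 2)
    for _ in range(episodes):
        state = (n_states + 1) // 2
        while 0 < state < n_states + 1:
            next_state = state + rng.choice((-1, 1))
            reward = 1.0 if next_state == n_states + 1 else 0.0
            # TD error: one-step bootstrapped target minus current prediction.
            td_error = reward + gamma * values[next_state] - values[state]
            values[state] += alpha * td_error
            state = next_state
    return values[1:-1]


estimates = td0_random_walk()
```

For this toy walk, the true value of state i (counted from 1) is i / (n_states + 1), so the estimates should rise roughly linearly from left to right. A constant step size keeps them fluctuating slightly around those values.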

Place, publisher, year, edition, pages
Linköping, Sweden: Linköping University, Department of Electrical Engineering, 1995. p. 119
Series
Linköping Studies in Science and Technology. Thesis, ISSN 0280-7971 ; 507
Identifiers
URN: urn:nbn:se:liu:diva-53352
Local ID: LiU-Tek-Lic-1995:39
ISBN: 91-7871-590-3 (print)
OAI: oai:DiVA.org:liu-53352
DiVA, id: diva2:288543
Presentation
(English)
Available from: 2010-01-21 Created: 2010-01-20 Last updated: 2023-01-23

Open Access in DiVA

Reinforcement Learning Using Local Adaptive Models (908 kB), 1035 downloads
File information
File: FULLTEXT02.pdf. File size: 908 kB. Checksum: SHA-512
9bb924079053f68bc4c94d7bea129f3bb147a05167f553b7f295d3f250b51e67b05bb83992f4f4c91901f763cde105815cfefc2d21126699f563fc720942d86a
Type: fulltext. Mimetype: application/pdf

Person

Borga, Magnus

Total: 1039 downloads
The number of downloads is the sum of all downloads of all full texts. It may include, for example, earlier versions that are no longer available.

Total: 785 hits