Reinforcement Learning for Partially Observable Linear Gaussian Systems Using Batch Dynamics of Noisy Observations
Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0002-6665-5881
Michigan State University, MI 48824, USA.
Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0003-3270-171X
2024 (English). In: IEEE Transactions on Automatic Control, ISSN 0018-9286, E-ISSN 1558-2523, Vol. 69, no. 9, p. 6397-6404. Article in journal (Refereed). Published.
Abstract [en]

Reinforcement learning algorithms are commonly used to control dynamical systems with measurable state variables. If the dynamical system is partially observable, reinforcement learning algorithms are modified to compensate for the effect of partial observability. One common approach is to feed the algorithm a finite history of input-output data instead of the state variable. In this article, we study and quantify the effect of this approach in linear Gaussian systems with quadratic costs. We coin the concept of L-Extra-Sampled-dynamics to formalize the idea of using a finite history of input-output data instead of the state, and show that this approach increases the average cost.
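
The history-based approach described in the abstract can be sketched as follows. The system matrices, noise levels, and history length L below are illustrative assumptions for a partially observable linear Gaussian system, not values taken from the article:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative partially observable linear Gaussian system (hypothetical
# parameters): x_{k+1} = A x_k + B u_k + w_k,  y_k = C x_k + v_k.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
C = np.array([[1.0, 0.0]])   # only the first state component is measured
L = 3                        # history length (an assumed choice)

x = np.zeros(2)
ys, us = [], []
for k in range(10):
    u = rng.normal(size=1)                      # exploratory input
    y = C @ x + 0.01 * rng.normal(size=1)       # noisy observation
    ys.append(y)
    us.append(u)
    x = A @ x + B @ u + 0.01 * rng.normal(size=2)

# Instead of the (unmeasurable) state x, the reinforcement learning
# algorithm is fed the last L outputs and last L inputs stacked into
# one vector.
z = np.concatenate(ys[-L:] + us[-L:])
print(z.shape)   # (6,): L outputs of dimension 1 plus L inputs of dimension 1
```

The article's result is that controlling the system through such a stacked history z, rather than the true state, incurs a quantifiable increase in the average quadratic cost.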

Place, publisher, year, edition, pages
IEEE - Institute of Electrical and Electronics Engineers Inc., 2024. Vol. 69, no. 9, p. 6397-6404.
Keywords [en]
Costs; History; Noise; Dynamical systems; Noise measurement; Heuristic algorithms; Data models; Linear quadratic Gaussian; partially observable dynamical systems; reinforcement learning
National Category
Control Engineering
Identifiers
URN: urn:nbn:se:liu:diva-207993
DOI: 10.1109/TAC.2024.3385680
ISI: 001302507600064
OAI: oai:DiVA.org:liu-207993
DiVA, id: diva2:1903271
Note

Funding Agencies: Wallenberg AI, Autonomous Systems and Software Program (WASP); Knut and Alice Wallenberg Foundation; ELLIIT, Excellence Center at Linköping-Lund in Information Technology; Sensor Informatics and Decision-making for the Digital Transformation (SEDDIT); National Science Foundation [ECCS-2227311]; Vinnova Competence Center LINK-SIC; Scalable Kalman Filters project through the Swedish Research Council

Available from: 2024-10-03 Created: 2024-10-03 Last updated: 2024-10-03

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text

Search in DiVA

By author/editor
Adib Yaghmaie, Farnaz; Gustafsson, Fredrik
By organisation
Automatic Control; Faculty of Science & Engineering
In the same journal
IEEE Transactions on Automatic Control
Control Engineering

Search outside of DiVA

Google; Google Scholar
