Safe Reinforcement Learning via a Model-Free Safety Certifier
Sharif Univ Technol, Iran.
Sharif Univ Technol, Iran.
Michigan State Univ, MI 48863 USA.
Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, Faculty of Science and Engineering. ORCID iD: 0000-0002-6665-5881
2024 (English). In: IEEE Transactions on Neural Networks and Learning Systems, ISSN 2162-237X, E-ISSN 2162-2388, Vol. 35, no. 3, pp. 3302-3311. Journal article (Refereed). Published
Abstract [en]

This article presents a data-driven safe reinforcement learning (RL) algorithm for discrete-time nonlinear systems. A data-driven safety certifier is designed to intervene in the actions of the RL agent to ensure both safety and stability of its actions. This is in sharp contrast to existing model-based safety certifiers, which can result in convergence to an undesired equilibrium point or in conservative interventions that jeopardize the performance of the RL agent. To this end, the proposed method directly learns a robust safety certifier while completely bypassing the identification of the system model. The nonlinear system is modeled using linear parameter varying (LPV) systems with polytopic disturbances. To avoid the need to learn an explicit model of the LPV system, data-based $\lambda$-contractivity conditions are first provided for the closed-loop system to enforce robust invariance of a prespecified polyhedral safe set and the system's asymptotic stability. These conditions are then leveraged to directly learn a robust data-based gain-scheduling controller by solving a convex program. A significant advantage of the proposed direct safe learning over model-based certifiers is that it completely resolves conflicts between safety and stability requirements while assuring convergence to the desired equilibrium point. Data-based safety certification conditions are then provided using Minkowski functions. They are then used to seamlessly integrate the learned backup safe gain-scheduling controller with the RL controller. Finally, we provide a simulation example to verify the effectiveness of the proposed approach.
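
As a rough illustration of two notions named in the abstract, the sketch below computes the Minkowski (gauge) function of a polyhedral set S = {x : F x <= 1} and checks $\lambda$-contractivity of S under a linear closed-loop map. This is not the article's data-based method: here the closed-loop matrix is assumed known purely for illustration, whereas the article avoids identifying a model. The names `minkowski`, `is_lambda_contractive`, `F`, `vertices`, and `A_cl`, and all numerical values, are hypothetical.

```python
import numpy as np


def minkowski(F, x):
    """Gauge function of S = {x : F x <= 1}; psi(x) <= 1 iff x is in S."""
    return float(np.max(F @ x))


def is_lambda_contractive(A_cl, F, vertices, lam, tol=1e-12):
    """Check psi(A_cl v) <= lam at every vertex v of S.

    For a bounded polyhedral set containing the origin in its interior,
    the vertex test suffices because psi is convex and positively
    homogeneous. A_cl is an assumed, known closed-loop matrix.
    """
    return all(minkowski(F, A_cl @ v) <= lam + tol for v in vertices)


# Hypothetical example: the unit box |x1| <= 1, |x2| <= 1.
F = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
vertices = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])

# Hypothetical closed-loop dynamics x_next = A_cl @ x.
A_cl = np.array([[0.5, 0.2], [0.0, 0.6]])

inside = minkowski(F, np.array([0.5, -0.25])) <= 1.0
contractive = is_lambda_contractive(A_cl, F, vertices, lam=0.8)
```

With $\lambda < 1$, such a check implies that trajectories starting in S stay in S and shrink toward the origin, which is the sense in which contractivity ties invariance of the safe set to stability.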

Place, publisher, year, edition, pages
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 2024. Vol. 35, no. 3, pp. 3302-3311
Keywords [en]
Data-driven control; gain-scheduling control; reinforcement learning (RL); safe control
National subject category
Computer Sciences
Identifiers
URN: urn:nbn:se:liu:diva-193589
DOI: 10.1109/TNNLS.2023.3264815
ISI: 000973264800001
PubMedID: 37053065
OAI: oai:DiVA.org:liu-193589
DiVA, id: diva2:1755871
Note

Funding agencies: Excellence Center at Linköping-Lund in Information Technology (ELLIIT); ZENITH

Available from: 2023-05-09. Created: 2023-05-09. Last updated: 2024-10-10. Bibliographically reviewed.

Open Access in DiVA

No full text in DiVA


Person

Adib Yaghmaie, Farnaz
