Model-Based Actor-Critic for Multi-Objective Reinforcement Learning with Dynamic Utility Functions
Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0002-4144-4893
Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0002-9595-2471
2023 (English). In: Proceedings of the 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2023, p. 2818-2820. Conference paper, Published paper (Refereed)
Abstract [en]

Many real-world problems require a trade-off between multiple conflicting objectives. Decision-makers’ preferences over solutions to such problems are determined by their utility functions, which convert multi-objective values to scalars. In some settings, utility functions change over time, and the goal is to find methods that can efficiently adapt an agent’s policy to changes in utility. Previous work on learning with dynamic utility functions has focused on model-free methods, which often suffer from poor sample efficiency. In this work, we instead propose a model-based actor-critic, which explores with diverse utility functions through imagined rollouts within a learned world model between interactions with the real environment. An experimental evaluation on Minecart, a well-known benchmark for multi-objective reinforcement learning, shows that learning a model of the environment improves the quality of the agent’s policy compared to model-free algorithms.
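To illustrate the core idea the abstract relies on — a utility function scalarizing a multi-objective value vector, so that a change in utility changes which policy is preferred — here is a minimal sketch. This is a hypothetical illustration, not the paper's method or code: the linear utility, the two-objective values, and the policy names are all invented for the example.

```python
import numpy as np

def linear_utility(values, weights):
    """Scalarize a multi-objective value vector with a linear utility.

    `values` and `weights` are 1-D arrays of equal length; the utility
    is their dot product (one common, simple choice of utility function).
    """
    return float(np.dot(values, weights))

# Two hypothetical candidate policies, each with a 2-objective value
# estimate (e.g., minerals collected vs. fuel spent, as in Minecart).
policy_values = {
    "greedy": np.array([1.0, -0.8]),   # high reward, high fuel cost
    "frugal": np.array([0.6, -0.2]),   # lower reward, low fuel cost
}

def best_policy(weights):
    """Return the policy maximizing utility under the given weights."""
    return max(policy_values, key=lambda p: linear_utility(policy_values[p], weights))

# As the decision-maker's utility shifts, the preferred policy changes:
print(best_policy(np.array([1.0, 0.1])))  # fuel barely matters -> "greedy"
print(best_policy(np.array([0.3, 1.0])))  # fuel heavily penalized -> "frugal"
```

The dynamic-utility setting studied in the paper corresponds to `weights` changing over time, so the agent must adapt its policy accordingly; the model-based approach does this by rolling out diverse utilities inside a learned world model rather than only in the real environment.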

Place, publisher, year, edition, pages
2023. p. 2818-2820
Keywords [en]
Multiple Objectives, Reinforcement Learning, Model-Based Learning
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:liu:diva-194554
ISBN: 978-1-4503-9432-1 (electronic)
OAI: oai:DiVA.org:liu-194554
DiVA, id: diva2:1764315
Conference
International Conference on Autonomous Agents and Multiagent Systems (AAMAS)
Funder
Vinnova, NFFP7/2017-04885
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Available from: 2023-06-08 Created: 2023-06-08 Last updated: 2023-06-08

Open Access in DiVA

No full text in DiVA


Authority records

Källström, Johan; Heintz, Fredrik
