Learning Hierarchical Policies by Iteratively Reducing the Width of Sketch Rules
Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0002-1350-2144
Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0002-2498-8020
Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0001-9851-8219
2023 (English)
In: 20th International Conference on Principles of Knowledge Representation and Reasoning, Rhodes, Greece, September 2-8, 2023, 2023
Conference paper, Published paper (Refereed)
Abstract [en]

Hierarchical policies are a key ingredient of intelligent behavior, expressing the different levels of abstraction involved in the solution of a problem. Learning hierarchical policies, however, remains a challenge, as no general learning principles have been identified for this purpose, despite the broad interest and vast literature in both model-free reinforcement learning and model-based planning. In this work, we introduce a principled method for learning hierarchical policies over classical planning domains, with no supervision, from small instances. The method is based on learning to decompose problems into subproblems so that the subproblems have a lower complexity as measured by their width. Problems and subproblems are captured by means of sketch rules, and the scheme for reducing the width of sketch rules is applied iteratively until the final sketch rules have zero width and encode a general policy. We evaluate the learning method on a number of classical planning domains, analyze the resulting hierarchical policies, and prove their properties. We also show that learning hierarchical policies by learning and refining sketches iteratively is often more efficient than learning flat general policies in one shot.

Place, publisher, year, edition, pages
2023.
Keywords [en]
classical planning, learning hierarchical policies, policy sketches language, planning width
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:liu:diva-196015
OAI: oai:DiVA.org:liu-196015
DiVA, id: diva2:1778289
Conference
20th International Conference on Principles of Knowledge Representation and Reasoning, Rhodes, Greece, September 2-8, 2023
Funder
Swedish National Infrastructure for Computing (SNIC), 2018-05973, 2022-06725
National Supercomputer Centre (NSC), Sweden
Wallenberg AI, Autonomous Systems and Software Program (WASP)
EU, Horizon 2020, 952215
EU, European Research Council, 885107

Available from: 2023-06-30 Created: 2023-06-30 Last updated: 2023-07-03

Open Access in DiVA

No full text in DiVA

Authority records

Drexler, Dominik
Seipp, Jendrik
Geffner, Hector
