Methods for Scalable and Safe Robot Learning
Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science and Engineering. (AIICS) ORCID iD: 0000-0001-7248-1112
2017 (English) Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Robots are increasingly expected to go beyond controlled environments in laboratories and factories, to enter real-world public spaces and homes. However, robot behavior is still usually engineered for narrowly defined scenarios. Manually encoding robot behavior that works within complex real-world environments, such as busy workplaces or cluttered homes, can be a daunting task. In addition, such robots may require a high degree of autonomy to be practical, which imposes stringent requirements on safety and robustness.

The aim of this thesis is to examine methods for automatically learning safe robot behavior, lowering the costs of synthesizing behavior for complex real-world situations. To avoid task-specific assumptions, we approach this from a data-driven machine learning perspective. The strength of machine learning is its generality: given sufficient data, it can learn to approximate any task. However, being embodied agents in the real world, robots pose a number of difficulties for machine learning. These include real-time requirements with limited computational resources, the cost and effort of operating and collecting data with real robots, as well as safety issues for both the robot and human bystanders.

While machine learning is general by nature, overcoming the difficulties with real-world robots outlined above remains a challenge. In this thesis we look for a middle ground on robot learning, leveraging the strengths of both data-driven machine learning and engineering techniques from robotics and control. This includes combining data-driven world models with fast techniques for planning motions under safety constraints, using machine learning to generalize such techniques to problems with high uncertainty, as well as using machine learning to find computationally efficient approximations for use on small embedded systems.

We demonstrate such behavior synthesis techniques with real robots, solving a class of difficult dynamic collision avoidance problems under uncertainty, such as that induced by the presence of humans without prior coordination. We do so initially using online planning offloaded to a desktop CPU, and ultimately as a deep neural network policy embedded on board a 7 quadcopter.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2017, p. 37
Series
Linköping Studies in Science and Technology. Thesis, ISSN 0280-7971 ; 1780
Keywords [en]
Symbicloud, ELLIIT, WASP
National Category
Computer and Information Sciences; Computer Vision and Robotics (Autonomous Systems)
Identifiers
URN: urn:nbn:se:liu:diva-138398; DOI: 10.3384/lic.diva-138398; ISBN: 9789176854907 (print); OAI: oai:DiVA.org:liu-138398; DiVA, id: diva2:1133724
Presentation
2017-09-15, Alan Turing, E-huset, Campus Valla, Linköping, 10:15 (English)
Opponent
Supervisors
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications; Knut and Alice Wallenberg Foundation; Swedish Foundation for Strategic Research (SSF)
Available from: 2017-08-17 Created: 2017-08-16 Last updated: 2023-04-05 Bibliographically approved
List of papers
1. Model-Based Reinforcement Learning in Continuous Environments Using Real-Time Constrained Optimization
2015 (English) In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI) / [ed] Blai Bonet and Sven Koenig, AAAI Press, 2015, p. 2497-2503. Conference paper, Published paper (Refereed)
Abstract [en]

Reinforcement learning for robot control tasks in continuous environments is a challenging problem due to the dimensionality of the state and action spaces, time and resource costs for learning with a real robot as well as constraints imposed for its safe operation. In this paper we propose a model-based reinforcement learning approach for continuous environments with constraints. The approach combines model-based reinforcement learning with recent advances in approximate optimal control. This results in a bounded-rationality agent that makes decisions in real-time by efficiently solving a sequence of constrained optimization problems on learned sparse Gaussian process models. Such a combination has several advantages. No high-dimensional policy needs to be computed or stored while the learning problem often reduces to a set of lower-dimensional models of the dynamics. In addition, hard constraints can easily be included and objectives can also be changed in real-time to allow for multiple or dynamic tasks. The efficacy of the approach is demonstrated on both an extended cart pole domain and a challenging quadcopter navigation task using real data.
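
The decision loop described in this abstract (learn a dynamics model from data, then repeatedly solve a small constrained optimization problem on that model and apply only the first action) can be illustrated with a minimal sketch. This is not the paper's implementation. A least-squares linear model stands in for the sparse Gaussian process models, an exhaustive search over short action sequences stands in for the real-time constrained optimizer, and the 1-D system, the action set, the bound `X_MAX`, and all function names are hypothetical.

```python
import itertools
import random

ACTIONS = (-1.0, 0.0, 1.0)   # hypothetical discrete action set
X_MAX = 2.0                  # hard state constraint: |x| <= X_MAX


def true_step(x, u):
    """Hypothetical 1-D system; the agent only sees samples from it."""
    return 0.9 * x + 0.5 * u


def fit_linear_model(data):
    """Least-squares fit of x' = a*x + b*u via the 2x2 normal equations."""
    sxx = sum(x * x for x, u, y in data)
    sxu = sum(x * u for x, u, y in data)
    suu = sum(u * u for x, u, y in data)
    sxy = sum(x * y for x, u, y in data)
    suy = sum(u * y for x, u, y in data)
    det = sxx * suu - sxu * sxu
    a = (sxy * suu - sxu * suy) / det
    b = (sxx * suy - sxu * sxy) / det
    return a, b


def plan(model, x0, target, horizon=4):
    """Return the first action of the cheapest sequence that keeps the
    model-predicted state inside the hard constraint (None if none does)."""
    a, b = model
    best_cost, best_u = float("inf"), None
    for seq in itertools.product(ACTIONS, repeat=horizon):
        x, cost, feasible = x0, 0.0, True
        for u in seq:
            x = a * x + b * u
            if abs(x) > X_MAX:       # constraint violated: discard sequence
                feasible = False
                break
            cost += (x - target) ** 2
        if feasible and cost < best_cost:
            best_cost, best_u = cost, seq[0]
    return best_u


def run(steps=15, target=1.5, seed=0):
    """Collect random transitions, fit the model, then control by re-planning."""
    rng = random.Random(seed)
    data = []
    for _ in range(30):
        x, u = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append((x, u, true_step(x, u)))
    model = fit_linear_model(data)
    x, trace = 0.0, []
    for _ in range(steps):
        u = plan(model, x, target)
        if u is None:                # no feasible plan found: apply zero action
            u = 0.0
        x = true_step(x, u)
        trace.append(x)
    return model, trace


if __name__ == "__main__":
    model, trace = run()
    print("fitted (a, b):", model)
    print("final state:", trace[-1])
```

Because only the first action of each plan is executed, the objective (here `target`) can be changed between steps without recomputing or storing a policy, which mirrors the advantage the abstract points out.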

Place, publisher, year, edition, pages
AAAI Press, 2015
Keywords
Reinforcement Learning, Gaussian Processes, Optimization, Robotics
National Category
Computer Sciences; Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:liu:diva-113385 (URN); 000485625502075 (); 978-1-57735-698-1 (ISBN)
Conference
Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI), January 25-30, 2015, Austin, Texas, USA.
Funder
Linnaeus research environment CADICS; ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications; Swedish Foundation for Strategic Research (SSF); Vinnova; EU, FP7, Seventh Framework Programme
Available from: 2015-01-16 Created: 2015-01-16 Last updated: 2023-04-05 Bibliographically approved
2. Model-Predictive Control with Stochastic Collision Avoidance using Bayesian Policy Optimization
2016 (English) In: IEEE International Conference on Robotics and Automation (ICRA), 2016, Institute of Electrical and Electronics Engineers (IEEE), 2016, p. 4597-4604. Conference paper, Published paper (Refereed)
Abstract [en]

Robots are increasingly expected to move out of the controlled environment of research labs and into populated streets and workplaces. Collision avoidance in such cluttered and dynamic environments is of increasing importance as robots gain more autonomy. However, efficient avoidance is fundamentally difficult since computing safe trajectories may require considering both dynamics and uncertainty. While heuristics are often used in practice, we take a holistic stochastic trajectory optimization perspective that merges both collision avoidance and control. We examine dynamic obstacles moving without prior coordination, like pedestrians or vehicles. We find that common stochastic simplifications lead to poor approximations when obstacle behavior is difficult to predict. We instead compute efficient approximations by drawing upon techniques from machine learning. We propose to combine policy search with model-predictive control. This allows us to use recent fast constrained model-predictive control solvers, while gaining the stochastic properties of policy-based methods. We exploit recent advances in Bayesian optimization to efficiently solve the resulting probabilistically-constrained policy optimization problems. Finally, we present a real-time implementation of an obstacle avoiding controller for a quadcopter. We demonstrate the results in simulation as well as with real flight experiments.
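
The idea of a probabilistically constrained policy optimization can be illustrated with a minimal sketch: estimate the collision probability of each candidate policy by sampling unpredictable obstacle behavior, then pick the best candidate whose estimated risk stays below a threshold. This is not the paper's method. A plain search over a handful of candidates stands in for Bayesian optimization, the "policy" is reduced to a single robot speed, and the crossing scenario, `RADIUS`, the speed range of the obstacle, and all function names are hypothetical.

```python
import math
import random

RADIUS = 0.5        # hypothetical combined safety radius
CROSSING_X = 5.0    # the obstacle crosses the robot's path at x = 5


def min_distance(robot_speed, obstacle_speed, dt=0.05, t_end=20.0):
    """Smallest robot-obstacle distance over one simulated rollout."""
    d_min = float("inf")
    for k in range(int(round(t_end / dt)) + 1):
        t = k * dt
        robot_x = min(10.0, robot_speed * t)      # robot drives along y = 0, stops at x = 10
        obstacle_y = -3.0 + obstacle_speed * t    # obstacle moves along x = CROSSING_X
        d_min = min(d_min, math.hypot(robot_x - CROSSING_X, obstacle_y))
    return d_min


def collision_probability(robot_speed, n_samples=500, seed=0):
    """Monte Carlo estimate of P(collision) when the obstacle speed is
    unknown in advance (sampled uniformly from [0.5, 1.5])."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        obstacle_speed = rng.uniform(0.5, 1.5)
        if min_distance(robot_speed, obstacle_speed) < RADIUS:
            hits += 1
    return hits / n_samples


def choose_speed(candidates=(0.25, 0.5, 1.0, 1.5, 2.0), delta=0.05):
    """Fastest candidate whose estimated collision probability is <= delta,
    or None if no candidate satisfies the chance constraint."""
    feasible = [s for s in candidates if collision_probability(s) <= delta]
    return max(feasible) if feasible else None


if __name__ == "__main__":
    for speed in (0.25, 0.5, 1.0, 1.5, 2.0):
        print(speed, collision_probability(speed))
    print("chosen speed:", choose_speed())
```

In this toy scenario the faster candidates reach the crossing point while the obstacle may still be there, so the chance constraint rules them out and the robot passes behind the obstacle instead. Each risk estimate requires many rollouts, which is why a sample-efficient optimizer matters once the policy has more than one parameter.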

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2016
Series
Proceedings of IEEE International Conference on Robotics and Automation, ISSN 1050-4729
Keywords
Robot Learning, Collision Avoidance, Robotics, Bayesian Optimization, Model Predictive Control
National Category
Robotics and Automation; Computer Sciences
Identifiers
urn:nbn:se:liu:diva-126769 (URN); 10.1109/ICRA.2016.7487661 (DOI); 000389516203138 ()
Conference
IEEE International Conference on Robotics and Automation (ICRA), 2016, Stockholm, May 16-21
Projects
CADICS, ELLIIT, NFFP6, CUAS, SHERPA
Funder
Linnaeus research environment CADICS; ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications; EU, FP7, Seventh Framework Programme; Swedish Foundation for Strategic Research (SSF)
Available from: 2016-04-04 Created: 2016-04-04 Last updated: 2023-04-05 Bibliographically approved
3. Deep Learning Quadcopter Control via Risk-Aware Active Learning
2017 (English) In: Proceedings of The Thirty-first AAAI Conference on Artificial Intelligence (AAAI) / [ed] Satinder Singh and Shaul Markovitch, AAAI Press, 2017, Vol. 5, p. 3812-3818. Conference paper, Published paper (Refereed)
Abstract [en]

Modern optimization-based approaches to control increasingly allow automatic generation of complex behavior from only a model and an objective. Recent years have seen growing interest in fast solvers to also allow real-time operation on robots, but the computational cost of such trajectory optimization remains prohibitive for many applications. In this paper we examine a novel deep neural network approximation and validate it on a safe navigation problem with a real nano-quadcopter. As the risk of costly failures is a major concern with real robots, we propose a risk-aware resampling technique. Contrary to prior work, this active learning approach is easy to use with existing solvers for trajectory optimization, as well as deep learning. We demonstrate the efficacy of the approach on a difficult collision avoidance problem with non-cooperative moving obstacles. Our findings indicate that the resulting neural network approximations are at least 50 times faster than the trajectory optimizer while still satisfying the safety requirements. We demonstrate the potential of the approach by implementing a synthesized deep neural network policy on the nano-quadcopter microcontroller.

Place, publisher, year, edition, pages
AAAI Press, 2017
Series
Proceedings of the AAAI Conference on Artificial Intelligence, ISSN 2159-5399, E-ISSN 2374-3468 ; 5
National Category
Computer Vision and Robotics (Autonomous Systems); Computer Sciences
Identifiers
urn:nbn:se:liu:diva-132800 (URN); 000485630703119 (); 978-1-57735-784-1 (ISBN)
Conference
Thirty-First AAAI Conference on Artificial Intelligence (AAAI), 2017, San Francisco, February 4–9.
Projects
ELLIIT, CADICS, NFFP6, SYMBICLOUD, CUGS
Funder
Linnaeus research environment CADICS; ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications; EU, FP7, Seventh Framework Programme; CUGS (National Graduate School in Computer Science); Swedish Foundation for Strategic Research (SSF)
Available from: 2016-11-25 Created: 2016-11-25 Last updated: 2023-04-05 Bibliographically approved

Open Access in DiVA

fulltext (4444 kB), 1101 downloads
File information
File name: FULLTEXT03.pdf; File size: 4444 kB; Checksum (SHA-512):
4baaf11ce2255edd918cab4f9bdf747ca34061b3303c9a32b4e5f6fb702e580be92eef63ef7e5ec59d472b8ec64561f131d78ca1a7721808c7458b271da4ade4
Type: fulltext; MIME type: application/pdf
cover (2701 kB), 174 downloads
File information
File name: COVER02.pdf; File size: 2701 kB; Checksum (SHA-512):
60dbc74e3dca50c3ac7e4afb89c4a2f82a72aa485d28687bfa77cff9d0096de829099888d57fd5ce8d29905dbb09f4ba62503f5894e23bcc18dcf820380d5692
Type: cover; MIME type: application/pdf
Author/editor
Andersson, Olov