Reinforcement Learning of Locomotion based on Central Pattern Generators
Linköping University, Department of Computer and Information Science. Linköping University, The Institute of Technology.
2014 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Locomotion learning for robotics is an interesting and challenging area in which the movement capabilities of animals have been deeply investigated and acquired knowledge has been transferred into modelling locomotion on robots. What modellers are required to understand is what structure can represent locomotor systems in different animals and how such animals develop various and dexterous locomotion capabilities. Notwithstanding the depth of research in the area, modelling locomotion requires a deep rethinking.

In this thesis, under the umbrella of embodied cognition, a neural-body-environment interaction is emphasised and regarded as the solution to locomotion learning/development. Central pattern generators (CPGs) are introduced in the first part (Chapter 2) to generally interpret the mechanism of locomotor systems in animals. With a deep investigation into the structure of CPGs and inspiration from human infant development, a layered CPG architecture with baseline motion generation and dynamics adaptation interfaces is proposed. In the second part, reinforcement learning (RL) is elucidated as a good method for dealing with locomotion learning from the perspectives of psychology, neuroscience and robotics (Chapter 4). Several continuous-space RL techniques (e.g. episodic natural actor critic, policy learning by weighting explorations with returns, continuous action space learning automaton) are introduced for practical use (Chapter 3). With the knowledge of CPGs and RL, the architecture and concept of CPG-Actor-Critic is constructed. Finally, experimental work based on published papers is highlighted along the path of my PhD research (Chapter 5). This includes the implementation of CPGs and the learning on the NAO robot for crawling and walking. The implementation is also extended to test the generalizability to different morphologies (the ghostdog robot). The contribution of this thesis is discussed from two angles: the investigation of the CPG architecture and the implementation (Chapter 6).

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2014. 71 p.
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 1602
National Category
Computer Science
Identifiers
URN: urn:nbn:se:liu:diva-105884
ISBN: 978-91-7519-313-7 (print)
OAI: oai:DiVA.org:liu-105884
DiVA: diva2:712601
Public defence
2014-06-04, G110, hus G, Högskolan i Skövde, Skövde, 12:30 (English)
Opponent
Supervisors
Available from: 2014-05-22 Created: 2014-04-11 Last updated: 2014-05-22 Bibliographically approved
List of papers
1. Humanoids that crawl: Comparing gait performance of iCub and NAO using a CPG architecture
2011 (English). In: Proceedings - 2011 IEEE International Conference on Computer Science and Automation Engineering, CSAE 2011 / [ed] Shaozi Li, Ying Dai, IEEE conference proceedings, 2011, 577-582 p. Conference paper, Published paper (Refereed)
Abstract [en]

In this article, a generic CPG architecture is used to model infant crawling gaits and is implemented on the NAO robot platform. The CPG architecture is chosen via a systematic approach to designing CPG networks on the basis of group theory and dynamic systems theory. The performance of the NAO robot is compared to that of the iCub robot, which has a different anatomical structure. Finally, the performance comparison and the NAO's whole-body stability are assessed to show the adaptive property of the CPG architecture and the extent of its ability to transfer to different robot morphologies. © 2011 IEEE.

Place, publisher, year, edition, pages
IEEE conference proceedings, 2011
Series
Proceedings - 2011 IEEE International Conference on Computer Science and Automation Engineering, CSAE 2011, 4
Keyword
CPG, Crawling, iCub, Infant development, NAO, Algebra, Network architecture, Robots, System theory, Computer architecture
National Category
Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-106772 (URN); 10.1109/CSAE.2011.5952916 (DOI); 2-s2.0-80051897170 (Scopus ID); 9781424487257 (ISBN)
Conference
2011 IEEE International Conference on Computer Science and Automation Engineering, CSAE 2011, 10 June 2011 through 12 June 2011, Shanghai
Available from: 2013-02-19 Created: 2014-05-22 Last updated: 2014-05-22 Bibliographically approved
2. Modelling Early Infant Walking: Testing a Generic CPG Architecture on the NAO Humanoid
2011 (English). In: IEEE International Conference on Development and Learning (ICDL), 2011, IEEE conference proceedings, 2011, 1-6 p. Conference paper, Published paper (Refereed)
Abstract [en]

In this article, a simple CPG network is shown to model early infant walking, in particular the onset of independent walking. The difference between early infant walking and early adult walking is addressed with respect to the underlying neurophysiology and evaluated according to gait attributes. Based on this, we successfully model the early infant walking gait on the NAO robot and compare its motion dynamics and performance to those of infants. Our model is able to capture the core properties of early infant walking. We identify differences in the morphologies between the robot and infant and the effect of this on their respective performance. In conclusion, early infant walking can be seen to develop as a function of the CPG network and morphological characteristics.

Place, publisher, year, edition, pages
IEEE conference proceedings, 2011
Series
IEEE International Conference on Development and Learning (ICDL), ISSN 2161-9476 ; Vol. 2
Keyword
Early Walking, CPG, Morphology, Development, NAO
National Category
Computer Science
Identifiers
urn:nbn:se:liu:diva-106778 (URN); 10.1109/DEVLRN.2011.6037318 (DOI); 000297472300007 (); 2-s2.0-80055009067 (Scopus ID); 978-1-61284-989-8 (ISBN)
Conference
The 2011 IEEE International Conference on Development and Learning, ICDL 2011; Frankfurt am Main; 24-27 August 2011, Category number CFP11294-ART; Code 87020
Available from: 2013-02-18 Created: 2014-05-22 Last updated: 2014-05-22
3. Modelling Walking Behaviors Based on CPGs: A Simplified Bio-inspired Architecture
2012 (English). In: From Animals to Animats 12: 12th International Conference on Simulation of Adaptive Behavior, SAB 2012, Odense, Denmark, August 27-30, 2012 / [ed] Tom Ziemke, Christian Balkenius, John Hallam, Berlin, Heidelberg: Springer Berlin/Heidelberg, 2012, 156-166 p. Conference paper, Published paper (Refereed)
Abstract [en]

In this article, we use a recurrent neural network with a four-cell core architecture to model the walking gait and implement it on the simulated and physical NAO robot. Inspired by biological CPG models, we also propose a simplified CPG model which comprises motor neurons, interneurons, sensor neurons and a simplified spinal cord. Within this model, the CPGs do not directly output trajectories to the servo motors. Instead, they only work to maintain the phase relation among ipsilateral and contralateral limbs. The final output depends on the integration of CPG signals and the outputs of interneurons, motor neurons and sensor neurons (sensory feedback).
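
As a rough illustration of how a four-cell CPG can hold a fixed phase relation among limbs, the sketch below couples four phase oscillators so that they settle into prescribed phase offsets. It is not the recurrent network from the paper: the Kuramoto-style coupling, the `TARGET` offsets (diagonal limbs in phase, ipsilateral and contralateral limbs in antiphase), and all names and parameter values are assumptions chosen for illustration only.

```python
import math

# Hypothetical target phase offsets, as fractions of a cycle.
# Cells 0/1 stand for the left/right arm, cells 2/3 for the left/right leg.
TARGET = [0.0, 0.5, 0.5, 0.0]


def step(phases, freq_hz=1.0, coupling=2.0, dt=0.01):
    """Advance all four phases by one explicit Euler step."""
    new_phases = []
    for i, p in enumerate(phases):
        rate = 2.0 * math.pi * freq_hz  # intrinsic oscillation frequency
        for j, q in enumerate(phases):
            if i != j:
                # Each coupling term vanishes when cell j leads cell i
                # by exactly the desired offset.
                desired = 2.0 * math.pi * (TARGET[j] - TARGET[i])
                rate += coupling * math.sin(q - p - desired)
        new_phases.append(p + dt * rate)
    return new_phases


def simulate(phases, steps=5000):
    """Run the network for a number of Euler steps and return the phases."""
    for _ in range(steps):
        phases = step(phases)
    return phases


def phase_lags(phases):
    """Phase of each cell relative to cell 0, as a fraction of a cycle in [0, 1)."""
    return [((p - phases[0]) / (2.0 * math.pi)) % 1.0 for p in phases]
```

Calling `phase_lags(simulate([0.3, 1.1, 2.0, 4.0]))` starts the cells at arbitrary phases and reports the lags they settle into. In this sketch the oscillators only fix the timing between limbs; turning phases into joint commands would be left to later stages, in the spirit of the abstract.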

Place, publisher, year, edition, pages
Berlin, Heidelberg: Springer Berlin/Heidelberg, 2012
Series
Lecture Notes in Computer Science, ISSN 0302-9743 (print), 1611-3349 (online) ; 7426
Keyword
CPGs, the NAO robot, Interneuron, Motorneuron
National Category
Computer Science
Research subject
Technology
Identifiers
urn:nbn:se:liu:diva-106771 (URN); 10.1007/978-3-642-33093-3_16 (DOI); 2-s2.0-84866033575 (Scopus ID); 978-3-642-33092-6 (ISBN); 978-3-642-33093-3 (ISBN)
Conference
12th International Conference on Simulation of Adaptive Behavior, SAB 2012, Odense, Denmark, August 27-30, 2012
Available from: 2012-10-31 Created: 2014-05-22 Last updated: 2017-02-16 Bibliographically approved
4. Humanoids learning to walk: a natural CPG-actor-critic architecture
2013 (English). In: Frontiers in Neurorobotics, ISSN 1662-5218, Vol. 7, no 5. Article in journal (Refereed) Published
Abstract [en]

The identification of learning mechanisms for locomotion has been the subject of much research for some time, but many challenges remain. Dynamic systems theory (DST) offers a novel approach to humanoid learning through environmental interaction. Reinforcement learning (RL) has offered a promising method to adaptively link the dynamic system to the environment it interacts with via a reward-based value system. In this paper, we propose a model that integrates the above perspectives and apply it to the case of a humanoid (NAO) robot learning to walk, an ability that emerges from its value-based interaction with the environment. In the model, a simplified central pattern generator (CPG) architecture inspired by neuroscientific research and DST is integrated with an actor-critic approach to RL (cpg-actor-critic). In the cpg-actor-critic architecture, least-square-temporal-difference based learning converges to the optimal solution quickly by using natural gradient learning and balancing exploration and exploitation. Furthermore, rather than using a traditional (designer-specified) reward, it uses a dynamic value function as a stability indicator that adapts to the environment. The results obtained are analyzed using a novel DST-based embodied cognition approach. Learning to walk, from this perspective, is a process of integrating levels of sensorimotor activity and value.
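
To make the actor-critic and natural-gradient ideas mentioned above more concrete, here is a deliberately tiny sketch. It is not the cpg-actor-critic from the paper: it assumes a single-state toy task with a quadratic reward, a Gaussian policy over one action, and a running-average baseline standing in for the least-squares temporal-difference critic. The function name `train` and every parameter value are hypothetical.

```python
import random


def train(target=2.0, sigma=0.5, actor_lr=0.05, critic_lr=0.1,
          steps=2000, seed=0):
    """Learn the mean of a Gaussian policy on a one-state toy problem."""
    rng = random.Random(seed)
    mu = 0.0        # actor: mean of the Gaussian policy
    baseline = 0.0  # critic: running estimate of the expected reward
    for _ in range(steps):
        action = rng.gauss(mu, sigma)       # explore around the current mean
        reward = -(action - target) ** 2    # toy reward, peaked at `target`
        advantage = reward - baseline       # critic's judgement of the action
        # The vanilla gradient of log N(action; mu, sigma) with respect to mu
        # is (action - mu) / sigma**2. The Fisher information of mu is
        # 1 / sigma**2, so the natural gradient is simply (action - mu).
        mu += actor_lr * advantage * (action - mu)
        baseline += critic_lr * advantage   # move the critic toward the reward
    return mu
```

Under these assumptions, `train()` should return a policy mean close to `target`. Rescaling by the inverse Fisher information removes the dependence of the step size on the exploration noise `sigma`, which is one common motivation for natural-gradient updates.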

Place, publisher, year, edition, pages
Frontiers Media S.A., 2013
Keyword
reinforcement learning, humanoid walking, central pattern generators, actor-critic, dynamical systems theory, embodied cognition, value system
National Category
Computer and Information Science
Research subject
Technology
Identifiers
urn:nbn:se:liu:diva-106770 (URN); 10.3389/fnbot.2013.00005 (DOI); 23675345 (PubMedID)
Available from: 2013-08-08 Created: 2014-05-22 Last updated: 2017-12-05 Bibliographically approved
5. Crawling Posture Learning in Humanoid Robots using a Natural-Actor-Critic CPG Architecture
2013 (English). In: Advances in Artificial Life, ECAL 2013, Proceedings of the Twelfth European Conference on the Synthesis and Simulation of Living Systems / [ed] Pietro Liò, Orazio Miglino, Giuseppe Nicosia, Stefano Nolfi and Mario Pavone, MIT Press, 2013, 1182-1190 p. Conference paper, Published paper (Refereed)
Abstract [en]

In this article, a four-cell CPG network exploiting sensory feedback is proposed in order to emulate infant crawling gaits on the NAO robot. Based on the crawling model, the positive episodic natural-actor-critic architecture is applied to learn a proper crawling posture on a simulated NAO. By transferring the learned results to the physical NAO, the transferability from simulation to the physical world is discussed. Finally, a discussion of locomotion learning based on dynamic systems theory is given in the conclusion.

Place, publisher, year, edition, pages
MIT Press, 2013
National Category
Computer Science
Identifiers
urn:nbn:se:liu:diva-106774 (URN); 978-0-262-31719-2 (ISBN)
Conference
The Twelfth European Conference on the Synthesis and Simulation of Living Systems (ECAL 2013), 2-6 September 2013, Taormina, Italy
Available from: 2014-05-22 Created: 2014-05-22 Last updated: 2014-05-26 Bibliographically approved
6. Humanoids learning to crawl based on Natural CPG-Actor-Critic and Motor Primitives
2013 (English). In: Proceedings of IROS 2013 Workshop on Neuroscience and Robotics, Tokyo, Japan / [ed] Emre Ugur, Erhan Oztop, Jun Morimoto, Shin Ishii, 2013, 7-15 p. Conference paper, Published paper (Other academic)
Abstract [en]

In this article, a new CPG-Actor-Critic architecture based on motor primitives is proposed to perform a crawling learning task on a humanoid (the NAO robot). Starting from an interdisciplinary explanation of the theories, we present two investigations to test the important functions of the layered CPG architecture: sensory feedback integration and whole-body posture control. Based on the analysis of the experimental results, a generic view/architecture for locomotion learning is discussed and introduced in the conclusion.

National Category
Computer Science
Identifiers
urn:nbn:se:liu:diva-106775 (URN)
Conference
2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, November 3-7, 2013, Tokyo Big Sight, Tokyo, Japan
Available from: 2014-05-22 Created: 2014-05-22 Last updated: 2014-05-26 Bibliographically approved
7. DMPs based CPG Actor Critic: A Method for Locomotion Learning
2014 (English). Manuscript (preprint) (Other academic)
Abstract [en]

In this article, a dynamic motor primitives (DMPs) based CPG-Actor-Critic is proposed to enable locomotion learning on a humanoid (the NAO robot) and a puppy robot (the ghostdog robot). In order to model two types of locomotion with one architecture, a novel application of an existing method is used to design a CPG architecture for learning locomotion. The method is to a) have an architectural base (4-cell CPG) and b) have a learning component which is based on an existing method for designing DMPs. Learning locomotion here concerns gait emergence in relation to the robot's body and prior knowledge. The focus of this article is on two types of locomotion: crawling on a humanoid and running on a puppy robot. On the two robots, which have different morphologies, our method and architecture enable the robots to learn by themselves. We also compare the performance of particular instantiations of our DMPs-based CPG-Actor-Critic architecture that use two state-of-the-art reinforcement learning algorithms. Finally, based on the analysis of the experimental results, a generic view/architecture for locomotion learning is discussed and introduced in the conclusion.

National Category
Computer Science
Identifiers
urn:nbn:se:liu:diva-106777 (URN)
Available from: 2014-05-22 Created: 2014-05-22 Last updated: 2014-05-22 Bibliographically approved

Open Access in DiVA

Cover (279 kB), 75 downloads
File information
File name: COVER01.pdf; File size: 279 kB; Checksum (SHA-512):
a240acd60f341d01c05c1edc85aef3a98bf3cbedb56ff6d980163a27a9676105f242bd4b2027ffbe9329fa4a0d7217d7b8a8b828bb6bf9860d78b345c6bc47d5
Type: cover; Mimetype: application/pdf

Authority records BETA

Li, Cai
