Abstract
In recent years, deep reinforcement learning (RL) and imitation learning (IL) have shown remarkable success in many areas of robotics. However, the domain of in-hand dexterous manipulation remains challenging for both RL and IL. Achieving proficiency in these tasks often requires millions of attempts or demonstrations before a stable strategy is learnt. Consequently, improving learning speed and efficiency is paramount if RL and IL are to be used in practice for real-world in-hand dexterous manipulation tasks.
This thesis primarily addresses multi-goal robot in-hand dexterous manipulation tasks and proposes three methods to improve learning efficiency. For RL, (1) Goal Density-based Hindsight Experience Prioritisation (GDP) improves learning efficiency by prioritising some experiences during the replay stage, and (2) Policy-level-based Curriculum Goal Selection (PL-CGS) automatically generates goals during learning so that they form a curriculum. For IL, (3) Goal-based Self-Adaptive Generative Adversarial Imitation Learning (Goal-SGAIL) incorporates a self-adaptive mechanism into the GAIL framework for multi-goal learning scenarios.
Extensive experiments were conducted in simulation with OpenAI Gym, focusing on robot manipulation tasks, to compare the proposed methods against existing RL and IL approaches. In the RL experiments, GDP and PL-CGS learned faster than the vanilla DDPG+HER method on some of the tasks. In the IL experiments with sub-optimal demonstrations, especially highly sub-optimal demonstrations from human teleoperation, Goal-SGAIL overcame the demonstrations’ sub-optimality and outperformed DDPGfD+HER and Goal-GAIL on some challenging in-hand manipulation tasks.
Date of Award | 5 Jun 2024 |
---|---|
Original language | English |
Awarding Institution | |
Supervisor | Luis J. Manso (Supervisor) & George Vogiatzis (Supervisor) |
Keywords
- Reinforcement learning
- HER
- Experience prioritisation
- Curriculum learning
- Learning from demonstration
- GAIL