TY - GEN
T1 - Integrating Multi-Demonstration Knowledge and Bounded Workspaces for Efficient Deep Reinforcement Learning
AU - Aflakian, Ali
AU - Stolkin, Rustam
AU - Rastegarpanah, Alireza
PY - 2024/1/1
N2 - We propose a novel approach for boosting deep reinforcement learning (RL) using human demonstrations and offline workspace bounding. Our approach involves collecting data from human demonstrations on random surfaces with varying friction and stiffness properties. We then compute a 3D convex hull that encompasses all the paths taken by the demonstrators. By defining the task and the desired parameters as reward functions, we enable the RL agent to learn an optimal solution within the bounded space, significantly reducing the agent's search space. We compare the training progress and the behavior of the trained policy of our approach against a baseline approach. The results demonstrate that our approach not only expedites learning but also improves the policy's performance and resilience to local minima. Combining demonstrations with RL also enables the use of imperfect demonstrators, as their behavior can be improved during learning. Our approach has the potential to significantly boost the development of deep RL applications in various domains, including robotics, gaming, and autonomous systems.
KW - Contact-rich path following
KW - Deep reinforcement learning
KW - Human demonstrations
KW - Offline workspace bounding
UR - https://ieeexplore.ieee.org/document/10375212
UR - http://www.scopus.com/inward/record.url?scp=85182948710&partnerID=8YFLogxK
DO - 10.1109/Humanoids57100.2023.10375212
M3 - Conference publication
AN - SCOPUS:85182948710
T3 - IEEE-RAS International Conference on Humanoid Robots
BT - 2023 IEEE-RAS 22nd International Conference on Humanoid Robots (Humanoids)
PB - IEEE
T2 - 22nd IEEE-RAS International Conference on Humanoid Robots, Humanoids 2023
Y2 - 12 December 2023 through 14 December 2023
ER -