TY - JOUR
T1 - Robust Contact-Rich Task Learning With Reinforcement Learning and Curriculum-Based Domain Randomization
AU - Aflakian, Ali
AU - Hathaway, Jamie
AU - Stolkin, Rustam
AU - Rastegarpanah, Alireza
N1 - Copyright © 2024 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License.
PY - 2024/8/5
Y1 - 2024/8/5
N2 - We propose a framework for contact-rich path following with reinforcement learning based on a mixture of visual and tactile feedback to achieve path following in unknown environments. We employ a curriculum-based domain randomisation approach with a time-varying sampling distribution, rendering our approach robust to parametric uncertainties in the robot-environment system. Based on evaluation in simulation for compliant path-following case studies with a random uncertain environment, and comparison with LBMPC and FDM methods, the robustness of the obtained policy over a stiffness range 10⁴–10⁹ N/m and friction range 0.1–1.2 is demonstrated. We extend this concept to unknown surfaces with various surface curvatures to enhance the robustness of the trained policy in terms of changes in surfaces. We demonstrate ∼15× improvement in trajectory accuracy compared to the previous LBMPC method and ∼18× improvement compared to using the FDM approach. We suggest the applications of the proposed method for learning more challenging tasks such as milling, which are difficult to model and dependent on a wide range of process variables.
AB - We propose a framework for contact-rich path following with reinforcement learning based on a mixture of visual and tactile feedback to achieve path following in unknown environments. We employ a curriculum-based domain randomisation approach with a time-varying sampling distribution, rendering our approach robust to parametric uncertainties in the robot-environment system. Based on evaluation in simulation for compliant path-following case studies with a random uncertain environment, and comparison with LBMPC and FDM methods, the robustness of the obtained policy over a stiffness range 10⁴–10⁹ N/m and friction range 0.1–1.2 is demonstrated. We extend this concept to unknown surfaces with various surface curvatures to enhance the robustness of the trained policy in terms of changes in surfaces. We demonstrate ∼15× improvement in trajectory accuracy compared to the previous LBMPC method and ∼18× improvement compared to using the FDM approach. We suggest the applications of the proposed method for learning more challenging tasks such as milling, which are difficult to model and dependent on a wide range of process variables.
UR - https://ieeexplore.ieee.org/document/10606428
U2 - 10.1109/ACCESS.2024.3432644
DO - 10.1109/ACCESS.2024.3432644
M3 - Article
SN - 2169-3536
VL - 12
SP - 103461
EP - 103472
JO - IEEE Access
JF - IEEE Access
ER -