Enhancing Robot Social Navigation with Reinforcement Learning and Advanced Predictive Models: Cosine-Gated-LSTM and Adaptive Predictive Horizons

  • Dirichukwu Goodluck Oguzie

Student thesis: Doctoral Thesis (Doctor of Philosophy)

Abstract

This thesis presents a comprehensive exploration of Social Robot Navigation (SocNav) in human-centric environments, a field of growing importance as robots become integral to sectors such as healthcare, hospitality, and public service. The research focuses on the integration of Reinforcement Learning (RL) with advanced predictive models to improve the navigation and interaction capabilities of robots in social environments.

A significant contribution of this work is the development and integration of our novel predictive world models into RL frameworks. These models improve the agent's ability to predict future states, thereby enhancing decision-making efficiency and adaptability in dynamic social environments. However, the initial implementation of fixed prediction horizons, such as always predicting two steps ahead in the 2StepAhead model, revealed limitations in flexibility and computational efficiency. To address this, we introduced an entropy-driven adaptive prediction horizon mechanism that dynamically adjusts the prediction horizon based on real-time policy entropy, balancing computational cost against the need for long-term future state prediction.
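The abstract does not give the exact mapping from policy entropy to horizon length. The sketch below assumes one simple realization: normalized Shannon entropy of the action distribution is mapped linearly onto a horizon range, so an uncertain (high-entropy) policy looks further ahead and a confident policy predicts only a short distance. The function names and the linear mapping are illustrative assumptions, not the thesis's exact formulation.

```python
import numpy as np

def policy_entropy(action_probs):
    """Shannon entropy of a discrete action distribution."""
    p = np.asarray(action_probs, dtype=float)
    p = p[p > 0]  # drop zero-probability actions to avoid log(0)
    return float(-(p * np.log(p)).sum())

def adaptive_horizon(action_probs, h_min=1, h_max=5):
    """Illustrative entropy-driven horizon: normalize entropy by its
    maximum (uniform distribution) and map it linearly onto
    [h_min, h_max]. High uncertainty -> longer prediction horizon."""
    max_entropy = np.log(len(action_probs))
    ratio = policy_entropy(action_probs) / max_entropy
    return h_min + round(ratio * (h_max - h_min))
```

With this mapping, a uniform (maximally uncertain) policy over four actions yields the longest horizon, while a near-deterministic one collapses to the shortest, which matches the trade-off the abstract describes: more lookahead only when the situation is uncertain.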

An important method in this thesis is the introduction of the Cosine-Gated Long Short-Term Memory (CGLSTM) model. By integrating a cosine similarity-based gating mechanism with vanilla LSTM (Long Short-Term Memory) networks, CGLSTM significantly advances sequence prediction capabilities. The model consistently outperformed vanilla LSTM, GRU (Gated Recurrent Unit), and RAU (Recurrent Attention Unit) models, achieving up to a 30% reduction in Mean Absolute Error (MAE) in environments such as FallingBallEnv and SocNavGym. Furthermore, integrating CGLSTM into DreamerV3, a state-of-the-art model-based reinforcement learning framework that learns a latent world model and plans actions through imagination, resulted in an approximately 5% increase in cumulative reward, demonstrating that stronger predictive sequence models can directly enhance RL performance.
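The abstract does not specify the CGLSTM gating equations. The sketch below shows one plausible reading of "cosine similarity-based gating": a cosine score between the projected current input and the previous hidden state interpolates between retaining the old cell state and writing the new candidate. The class name, the projection matrix, and this specific formulation are assumptions for illustration only.

```python
import numpy as np

def cosine_sim(a, b, eps=1e-8):
    """Cosine similarity of two vectors, guarded against zero norms."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

class CosineGatedLSTMCell:
    """Illustrative LSTM cell in which a cosine-similarity score between
    the projected input and the previous hidden state modulates the
    balance between the forget and input pathways. Not the thesis's
    exact CGLSTM formulation."""

    def __init__(self, input_size, hidden_size, rng=None):
        rng = np.random.default_rng(0) if rng is None else rng
        k = input_size + hidden_size
        scale = 1.0 / np.sqrt(k)
        # one weight matrix and bias per standard LSTM gate:
        # forget (f), input (i), output (o), candidate (c)
        self.W = {g: rng.normal(0, scale, (hidden_size, k)) for g in "fioc"}
        self.b = {g: np.zeros(hidden_size) for g in "fioc"}
        # projection used only to compare the input against the hidden state
        self.Wx = rng.normal(0, scale, (hidden_size, input_size))

    def step(self, x, h, c):
        z = np.concatenate([x, h])
        sig = lambda v: 1.0 / (1.0 + np.exp(-v))
        f = sig(self.W["f"] @ z + self.b["f"])
        i = sig(self.W["i"] @ z + self.b["i"])
        o = sig(self.W["o"] @ z + self.b["o"])
        g = np.tanh(self.W["c"] @ z + self.b["c"])
        # cosine gate rescaled to [0, 1]: high similarity between the
        # projected input and the previous hidden state preserves more
        # of the old cell state, low similarity favors the new candidate
        s = 0.5 * (1.0 + cosine_sim(self.Wx @ x, h))
        c_new = s * f * c + (1.0 - s) * i * g
        h_new = o * np.tanh(c_new)
        return h_new, c_new
```

The design intuition is that cosine similarity gives a cheap, bounded measure of whether the new observation agrees with the recurrent memory, letting the cell decide how aggressively to overwrite its state without learning an extra gate.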

The thesis also addresses the computational challenges posed by predictive models across environments of varying complexity. The entropy-driven adaptive prediction horizon mechanism mitigates these challenges by adjusting the prediction horizon in response to environmental uncertainty, yielding a 15% improvement in success rates in high-entropy scenarios while remaining efficient in low-entropy situations, with only a 2% increase in inference time.

Overall, this thesis significantly advances SocNav and predictive modeling within RL, laying the groundwork for future research aimed at integrating robots more intuitively into our society. The developed models enhance robots' ability to navigate complex social environments both accurately and efficiently, paving the way for seamless integration into various sectors.
Date of Award: Dec 2024
Original language: English
Awarding Institution
  • Aston University
Supervisors: Luis J. Manso (Supervisor) & Aniko Ekárt (Supervisor)

Keywords

  • Social Robot Navigation
  • Reinforcement Learning
  • Predictive World Models
  • Cosine-Gated LSTM
  • Adaptive Horizon
  • DreamerV3
  • Sequence Prediction
  • Computational Efficiency
  • Human-Robot Interaction
