Abstract
Gait analysis is an essential technique in treating patients with lower limb dysfunctions. Traditional methods often rely on expensive and complex equipment, such as wearable body sensors and multi-camera marker-tracking systems. Aiming for a more cost-effective yet accurate alternative, this paper introduces GaitFormer, a novel approach that leverages the Vision Transformer (ViT) for gait analysis using minimal, non-invasive equipment, i.e., a single low-cost RGB camera. First, a dataset comprising 6 walking patterns gathered from 80 volunteers is developed using a multi-camera system with marker tracking. The ViT-based GaitFormer is then proposed to automatically recognize human walking patterns through a single RGB camera. GaitFormer comprises hybrid networks for each step: (i) a cascaded convolutional network for 2D human key point estimation; (ii) a ViT-based dual-stream spatial–temporal network that extends the human key points into 3D; (iii) angle features of specific lower limb key joints for clinical gait analysis, capturing the geometric, kinematic, and physical attributes of human motion; and (iv) a pure self-attention-based classification network that recognizes clinical human walking patterns. The experiments are designed to comprehensively validate each step against various related baseline methods and a multi-camera tracking system, with results demonstrating the promising performance of GaitFormer as an affordable, precise, and integrated solution. To the best of our knowledge, GaitFormer is the first hybrid CNN- and ViT-based end-to-end solution for clinically valuable gait analysis using a low-cost device.
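As an illustration of step (iii), the sketch below computes the angle at a joint from three 3D key points, for example hip, knee, and ankle. It is a minimal example under stated assumptions, not the paper's implementation: the function name `joint_angle_deg`, the coordinate values, and the choice of the vector-angle formula are hypothetical, since the abstract does not specify how the angle features are derived.

```python
import math
from typing import Sequence

Point3D = Sequence[float]


def joint_angle_deg(proximal: Point3D, joint: Point3D, distal: Point3D) -> float:
    """Angle in degrees at `joint` between the segments to `proximal` and `distal`.

    Hypothetical helper: the abstract does not state the exact angle definition.
    """
    # Vectors pointing from the joint to its two neighbouring key points.
    u = [p - j for p, j in zip(proximal, joint)]
    v = [d - j for d, j in zip(distal, joint)]
    norm_u = math.sqrt(sum(c * c for c in u))
    norm_v = math.sqrt(sum(c * c for c in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("key points coincide; angle is undefined")
    cos_theta = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Made-up hip, knee, and ankle coordinates (metres) for a single frame.
hip, knee, ankle = (0.0, 1.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.0)
knee_angle = joint_angle_deg(hip, knee, ankle)  # fully extended leg
```

Applied frame by frame to the estimated 3D key points, a function like this would yield one angle time series per joint, which is the kind of feature sequence a classification network could consume.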
| Original language | English |
|---|---|
| Article number | 111810 |
| Number of pages | 15 |
| Journal | Knowledge-Based Systems |
| Volume | 295 |
| Early online date | 18 Apr 2024 |
| Publication status | Published - 8 Jul 2024 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Gait analysis
- Healthcare
- Human pose estimation
- Single RGB camera
- Vision Transformer
Fingerprint
Dive into the research topics of 'GaitFormer: Leveraging dual-stream spatial–temporal Vision Transformer via a single low-cost RGB camera for clinical gait analysis'. Together they form a unique fingerprint.