Abstract
Retinal vessel segmentation is a critical, non-destructive medical imaging task in computer vision, essential for diagnosing fundus diseases. Although deep learning methods dominate this field, existing U-shaped encoder-decoder networks with skip connections face limitations when handling discrepancies in multi-scale features. Shallow encoder and decoder stages produce high-resolution but low-dimensional feature maps, effectively capturing fine vessel details, whereas deeper stages (such as the bottleneck) generate lower-resolution, high-dimensional feature maps rich in semantic information. Traditional U-shaped architectures often struggle to effectively integrate these distinct types of features. To address these challenges, this paper introduces a redesigned U-shaped network that incorporates modified convolution and transformer layers tailored specifically for segmenting slender and tortuous retinal vessel structures. A Multi-Core Channel-Spatial Attention (MCCSA) block replaces conventional skip connections, enhancing the extraction of high-frequency texture features in shallow stages. For deeper stages, a Pixel-level Vision Transformer (P-ViT) is introduced to model semantic interconnections among pixels, thereby improving semantic feature recognition. Furthermore, a Pixel-level residual dynamic adaptive Convolutional Neural Network (P-CNN) is proposed to better capture the intricate curved topology of blood vessels. The proposed method is evaluated on two publicly available benchmark datasets, demonstrating significant segmentation performance improvements compared to existing U-shaped methods. Our contributions include enhanced multi-scale feature integration, improved semantic feature learning, and refined extraction of vessel topology.
| Original language | English |
|---|---|
| Pages (from-to) | 23211-23226 |
| Number of pages | 16 |
| Journal | IEEE Access |
| Volume | 14 |
| Early online date | 9 Feb 2026 |
| Publication status | Published - 13 Feb 2026 |
Bibliographical note
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Data Access Statement
The code, evaluation metrics, trained network weights, and datasets will be made publicly available at https://github.com/ziyangwang007/CVPixUNet.
Keywords
- Medical image segmentation, vision transformer, convolution, retinal vessels
- Convolution, Image segmentation, Feature extraction, Retinal vessels, Convolutional neural networks, Transformers, Semantics, Decoding, Attention mechanisms, Representation learning
Fingerprint
Dive into the research topics of 'Rethinking Hybrid U-Shape Network with Pixel-Level Feature Learning for Retinal Vessel Segmentation'. Together they form a unique fingerprint.