Adversarial Vision Transformer for Medical Image Semantic Segmentation with Limited Annotations

Ziyang Wang, Chengkuan Zhao, Zixuan Ni

Research output: Unpublished contribution to conference; Unpublished Conference Paper; peer-review

17 Citations (Scopus)

Abstract

Medical image analysis has benefited from deep learning not only through network architecture engineering but also through large numbers of high-quality annotations, which are time-consuming and labour-intensive to produce. Motivated by the recent success of the Vision Transformer (ViT), we explore the power of ViT for medical image semantic segmentation in a Semi-Supervised Learning (SSL) setting, using MixUp-based interpolation consistency training and adversarial training. To train a Segmentation ViT model (sViT) with labelled and unlabelled data simultaneously, we propose an adversarial SSL framework consisting of an sViT and an evaluation model (EM). During adversarial training, the EM is trained to classify whether an inference by the sViT comes from a labelled or an unlabelled sample, and the sViT is initialized and trained against the EM (i.e. so that every inference by the sViT is of high enough quality to be classified as if it came from labelled data). To further boost the performance of the sViT, MixUp-based interpolation consistency training is introduced. The adversarial training is designed separately for the sViT and the EM and proceeds in an iterative manner, while MixUp is applied solely to the sViT. Experimental results (including replacing the sViT with a CNN) demonstrate that the proposed method achieves competitive performance against other SSL methods on a public benchmark data set across a variety of metrics. The code is publicly available on GitHub.
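The abstract names MixUp-based interpolation consistency training without giving its form. As a rough illustration only, the sketch below uses the common formulation in which the prediction on a mixed pair of unlabelled inputs is pushed towards the same mix of the individual predictions. It is not the authors' implementation: the function names, the toy stand-in models, the Beta(0.5, 0.5) mixing distribution, and the mean-squared-error penalty are all assumptions made for this example.

```python
import numpy as np


def mixup(a, b, lam):
    """Convex combination of two arrays with mixing weight lam in [0, 1]."""
    return lam * a + (1.0 - lam) * b


def interpolation_consistency_loss(model, u1, u2, lam):
    """Mean squared gap between the prediction on a mixed input
    and the same mix of the two individual predictions."""
    pred_on_mix = model(mixup(u1, u2, lam))
    mix_of_preds = mixup(model(u1), model(u2), lam)
    return float(np.mean((pred_on_mix - mix_of_preds) ** 2))


# Hypothetical stand-ins for a segmentation network (assumptions, not sViT).
def toy_linear_model(x):
    # Affine maps commute with convex mixing, so the loss is ~0.
    return 2.0 * x + 1.0


def toy_sigmoid_model(x):
    # A nonlinear map generally does not, so the loss is positive.
    return 1.0 / (1.0 + np.exp(-x))


rng = np.random.default_rng(0)
u1 = rng.normal(size=(1, 8, 8))  # two unlabelled "images"
u2 = rng.normal(size=(1, 8, 8))
lam = float(rng.beta(0.5, 0.5))  # assumed mixing distribution

loss_linear = interpolation_consistency_loss(toy_linear_model, u1, u2, lam)
loss_sigmoid = interpolation_consistency_loss(toy_sigmoid_model, u1, u2, lam)
```

In a semi-supervised setup, a term of this kind would typically be added to the supervised segmentation loss on labelled data; how it is weighted and combined with the adversarial objective is not stated in the abstract.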

Original language: English
Number of pages: 13
Publication status: Published - 2022
Event: 33rd British Machine Vision Conference Proceedings, BMVC 2022 - London, United Kingdom
Duration: 21 Nov 2022 - 24 Nov 2022

Conference

Conference: 33rd British Machine Vision Conference Proceedings, BMVC 2022
Country/Territory: United Kingdom
City: London
Period: 21/11/22 - 24/11/22
