Abstract
With the Internet of Everything expected to see explosive growth in sixth-generation (6G) networks, the cybertwin network can convert user information into digital assets and provide extensive services. However, research on protecting and enhancing the privacy of data processed and transmitted in cybertwin-driven 6G is still in its infancy. Federated learning (FL) is a nascent distributed machine learning paradigm that can facilitate privacy protection in cybertwin networks. In a cybertwin network, an imbalanced data distribution across clients can increase the bias of the global model and degrade the performance of the FL model. Prior work on imbalanced data requires clients to exchange additional information about their data with the server, which increases the risk of privacy leakage. To avoid such leakage, we design an estimation algorithm that determines the distribution of the local data collected at the clients without access to the raw data itself. We consider two scenarios in FL: 1) the server receives the individually trained model from each selected device and 2) the server receives only the aggregated model from the selected clients. For each scenario, we formulate a device selection problem to improve training performance and develop an online learning algorithm to solve it, one for individual model uploading and one for aggregated model uploading. The proposed algorithms run on the server, thereby avoiding privacy leakage and extra computation at the clients. We validate the effectiveness of the proposed client selection algorithms through extensive experiments in cybertwin-driven 6G networks.
Original language | English |
---|---|
Pages (from-to) | 6733-6742 |
Number of pages | 10 |
Journal | IEEE Transactions on Industrial Informatics |
Volume | 18 |
Issue number | 10 |
Early online date | 18 Feb 2022 |
Publication status | Published - Oct 2022 |
Keywords
- 6G mobile communication
- Computational modeling
- Cybertwin-driven 6G
- Data models
- Data privacy
- Performance evaluation
- Servers
- Training
- Client selection
- Federated learning
- Imbalanced distribution
- Privacy-preserving