Multi-agent reinforcement learning for cost-aware collaborative task execution in energy-harvesting D2D networks

Binbin Huang; Xiao Liu; Shangguang Wang; Linxuan Pan; Victor Chang

doi:10.1016/j.comnet.2021.108176

Corresponding author: Victor Chang

Research output: Contribution to journal › Article › peer-review

Abstract

In device-to-device (D2D) networks, multiple resource-limited mobile devices cooperate with one another to execute computation tasks. Because battery capacity is limited, the tasks running on a mobile device terminate once its battery is depleted. To achieve sustainable computation, energy-harvesting technology has been introduced into D2D networks. Making multiple energy-harvesting mobile devices work collaboratively to minimize the long-term system cost of task execution under limited computing, network and battery capacity constraints remains a challenging issue. To address this challenge, we design a multi-agent deep deterministic policy gradient (MADDPG) based cost-aware collaborative task-execution (CACTE) scheme for energy-harvesting D2D (EH-D2D) networks. To validate the CACTE scheme's performance, we conducted extensive experiments comparing it with four baseline algorithms: Local, Random, ECLB (Energy Capacity Load Balance) and CCLB (Computing Capacity Load Balance). The experiments varied several system parameters, such as the mobile device's battery capacity, the task workload and the bandwidth. The experimental results show that the CACTE scheme enables multiple mobile devices to cooperate effectively with one another to execute many more tasks and achieve a higher long-term reward, including lower task latency and fewer dropped tasks.

Original language	English
Article number	108176
Journal	Computer Networks
Volume	195
Early online date	29 May 2021
DOIs	https://doi.org/10.1016/j.comnet.2021.108176
Publication status	Published - 4 Aug 2021

Bibliographical note

© 2021, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/

Funding Information:
This work was supported by the National Science Foundation of China (No. 61802095, 61572162, 61572251), the Zhejiang Provincial National Science Foundation of China (No. LQ19F020011, LQ17F020003), the Zhejiang Provincial Key Science and Technology Project Foundation (No. 2018C01012), the Open Foundation of State Key Laboratory of Networking and Switching Technology (Beijing University of Posts and Telecommunications) (No. SKLNST-2019-2-15), and VC Research (VCR 0000111).

Keywords

collaborative task execution
cost-aware
D2D networks
multi-agent deep deterministic policy gradient
partially observable Markov decision process

Access to Document

10.1016/j.comnet.2021.108176

Accepted author manuscript, 1.74 MB
Licence: CC BY-NC-ND 4.0

Cite this

@article{f0e0be2b878f4f9c89ff1bcabbb587e2,

title = "Multi-agent reinforcement learning for cost-aware collaborative task execution in energy-harvesting D2D networks",

abstract = "In device-to-device (D2D) networks, multiple resource-limited mobile devices cooperate with one another to execute computation tasks. As the battery capacity of mobile devices is limited, the computation tasks running on the mobile devices will terminate once the battery is dead. In order to achieve sustainable computation, energy-harvesting technology has been introduced into D2D networks. At present, how to make multiple energy harvesting mobile devices work collaboratively to minimize the long-term system cost for task execution under limited computing, network and battery capacity constraint is a challenging issue. To deal with such a challenge, in this paper, we design a multi-agent deep deterministic policy gradient (MADDPG) based cost-aware collaborative task-execution (CACTE) scheme in energy harvesting D2D (EH-D2D) networks. To validate the CACTE scheme's performance, we conducted extensive experiments to compare the CACTE scheme with four baseline algorithms, including Local, Random, ECLB (Energy Capacity Load Balance) and CCLB (Computing Capacity Load Balance). Experiments were accompanied by various system parameters, such as the mobile device's battery capacity, task workload, the bandwidth and so on. The experimental results show that the CACTE scheme can make multiple mobile devices cooperate effectively with one another to execute many more tasks and achieve a higher long-term reward, including lower task latency and fewer dropped tasks.",

keywords = "collaborative task execution, cost-aware, D2D networks, multi-agent deep deterministic policy gradient, partially observable Markov decision process",

author = "Binbin Huang and Xiao Liu and Shangguang Wang and Linxuan Pan and Victor Chang",

note = "{\textcopyright} 2021, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/ Funding Information: This work was supported by the National Science Foundation of China (No. 61802095, 61572162, 61572251), the Zhejiang Provincial National Science Foundation of China (No. LQ19F020011, LQ17F020003), the Zhejiang Provincial Key Science and Technology Project Foundation (NO. 2018C01012), and the Open Foundation of State Key Laboratory of Networking and Switching Technology (Beijing University of Posts and Telecommunications) (No. SKLNST-2019-2-15) and VC Research (VCR 0000111). ",

year = "2021",

month = aug,

day = "4",

doi = "10.1016/j.comnet.2021.108176",

language = "English",

journal = "Computer Networks",

issn = "1389-1286",

volume = "195",

}

TY - JOUR

T1 - Multi-agent reinforcement learning for cost-aware collaborative task execution in energy-harvesting D2D networks

AU - Huang, Binbin

AU - Liu, Xiao

AU - Wang, Shangguang

AU - Pan, Linxuan

AU - Chang, Victor

N1 - © 2021, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/ Funding Information: This work was supported by the National Science Foundation of China (No. 61802095, 61572162, 61572251), the Zhejiang Provincial National Science Foundation of China (No. LQ19F020011, LQ17F020003), the Zhejiang Provincial Key Science and Technology Project Foundation (NO. 2018C01012), and the Open Foundation of State Key Laboratory of Networking and Switching Technology (Beijing University of Posts and Telecommunications) (No. SKLNST-2019-2-15) and VC Research (VCR 0000111).

PY - 2021/8/4

Y1 - 2021/8/4

N2 - In device-to-device (D2D) networks, multiple resource-limited mobile devices cooperate with one another to execute computation tasks. As the battery capacity of mobile devices is limited, the computation tasks running on the mobile devices will terminate once the battery is dead. In order to achieve sustainable computation, energy-harvesting technology has been introduced into D2D networks. At present, how to make multiple energy harvesting mobile devices work collaboratively to minimize the long-term system cost for task execution under limited computing, network and battery capacity constraint is a challenging issue. To deal with such a challenge, in this paper, we design a multi-agent deep deterministic policy gradient (MADDPG) based cost-aware collaborative task-execution (CACTE) scheme in energy harvesting D2D (EH-D2D) networks. To validate the CACTE scheme's performance, we conducted extensive experiments to compare the CACTE scheme with four baseline algorithms, including Local, Random, ECLB (Energy Capacity Load Balance) and CCLB (Computing Capacity Load Balance). Experiments were accompanied by various system parameters, such as the mobile device's battery capacity, task workload, the bandwidth and so on. The experimental results show that the CACTE scheme can make multiple mobile devices cooperate effectively with one another to execute many more tasks and achieve a higher long-term reward, including lower task latency and fewer dropped tasks.

AB - In device-to-device (D2D) networks, multiple resource-limited mobile devices cooperate with one another to execute computation tasks. As the battery capacity of mobile devices is limited, the computation tasks running on the mobile devices will terminate once the battery is dead. In order to achieve sustainable computation, energy-harvesting technology has been introduced into D2D networks. At present, how to make multiple energy harvesting mobile devices work collaboratively to minimize the long-term system cost for task execution under limited computing, network and battery capacity constraint is a challenging issue. To deal with such a challenge, in this paper, we design a multi-agent deep deterministic policy gradient (MADDPG) based cost-aware collaborative task-execution (CACTE) scheme in energy harvesting D2D (EH-D2D) networks. To validate the CACTE scheme's performance, we conducted extensive experiments to compare the CACTE scheme with four baseline algorithms, including Local, Random, ECLB (Energy Capacity Load Balance) and CCLB (Computing Capacity Load Balance). Experiments were accompanied by various system parameters, such as the mobile device's battery capacity, task workload, the bandwidth and so on. The experimental results show that the CACTE scheme can make multiple mobile devices cooperate effectively with one another to execute many more tasks and achieve a higher long-term reward, including lower task latency and fewer dropped tasks.

KW - collaborative task execution

KW - cost-aware

KW - D2D networks

KW - multi-agent deep deterministic policy gradient

KW - partially observable Markov decision process

UR - http://www.scopus.com/inward/record.url?scp=85107055519&partnerID=8YFLogxK

UR - https://www.sciencedirect.com/science/article/pii/S1389128621002334?via%3Dihub

U2 - 10.1016/j.comnet.2021.108176

DO - 10.1016/j.comnet.2021.108176

M3 - Article

AN - SCOPUS:85107055519

SN - 1389-1286

VL - 195

JO - Computer Networks

JF - Computer Networks

M1 - 108176

ER -
