A novel adaptive weight selection algorithm for multi-objective multi-agent reinforcement learning

Kristof van Moffaert; Tim Brys; Arjun Chandra; Lukas Esterle; Peter R. Lewis; Ann Nowé

doi:10.1109/IJCNN.2014.6889637

Research output: Chapter in Book/Published conference output › Conference publication

Abstract

To solve multi-objective problems, multiple reward signals are often scalarized into a single value and further processed using established single-objective problem-solving techniques. While the field of multi-objective optimization has made many advances in applying scalarization techniques to obtain good solution trade-offs, the utility of applying these techniques in the multi-objective multi-agent learning domain has not yet been thoroughly investigated. Agents learn the value of their decisions by linearly scalarizing their reward signals at the local level, and acceptable system-wide behaviour emerges as a result. However, the non-linear relationship between the weighting parameters of the scalarization function and the learned policy makes the discovery of system-wide trade-offs time-consuming. Our first contribution is a thorough analysis of well-known scalarization schemes within the multi-objective multi-agent reinforcement learning setup. The analysed approaches intelligently explore the weight space in order to find a wider range of system trade-offs. In our second contribution, we propose a novel adaptive weight algorithm which interacts with the underlying local multi-objective solvers and allows for a better coverage of the Pareto front. Our third contribution is the experimental validation of our approach by learning bi-objective policies in self-organising smart camera networks. We note that our algorithm (i) explores the objective space faster on many problem instances, (ii) obtains solutions that exhibit a larger hypervolume, and (iii) acquires a greater spread in the objective space.
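
As a minimal sketch of two notions the abstract relies on, the snippet below shows linear scalarization of a reward vector and the hypervolume of a set of bi-objective solutions. The function names, the sample values, and the reference point are illustrative assumptions and do not come from the paper. Both objectives are assumed to be maximised, and the paper's adaptive weight algorithm itself is not reproduced here.

```python
from typing import List, Sequence, Tuple


def linear_scalarize(rewards: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse a reward vector into one value via a weighted sum (hypothetical helper)."""
    if len(rewards) != len(weights):
        raise ValueError("rewards and weights must have the same length")
    return sum(w * r for w, r in zip(weights, rewards))


def hypervolume_2d(points: List[Tuple[float, float]],
                   ref: Tuple[float, float]) -> float:
    """Area dominated by `points` and bounded by the reference point `ref`.

    Assumes both objectives are maximised; points that do not strictly
    exceed `ref` in both objectives contribute nothing.
    """
    candidates = [(x, y) for x, y in points if x > ref[0] and y > ref[1]]
    # Sweep from the largest first objective to the smallest, adding the
    # horizontal strip each point contributes above the best y seen so far.
    candidates.sort(key=lambda p: p[0], reverse=True)
    area = 0.0
    best_y = ref[1]
    for x, y in candidates:
        if y > best_y:
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


if __name__ == "__main__":
    # Illustrative values only: a two-objective reward and a weight vector.
    print(linear_scalarize([4.0, 2.0], [0.25, 0.75]))
    # Three mutually non-dominated points measured against the origin.
    print(hypervolume_2d([(3.0, 1.0), (2.0, 2.0), (1.0, 3.0)], (0.0, 0.0)))
```

A larger hypervolume means the solution set dominates more of the objective space, which is the sense in which the abstract compares solution sets.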

Original language	English
Title of host publication	Proceedings of the International Joint Conference on Neural Networks
Publisher	IEEE
Pages	2306-2314
Number of pages	9
ISBN (Print)	978-1-4799-6627-1
DOIs	https://doi.org/10.1109/IJCNN.2014.6889637
Publication status	Published - 2014
Event	2014 International Joint Conference on Neural Networks, Beijing, China. Duration: 6 Jul 2014 → 11 Jul 2014

Conference

Conference	2014 International Joint Conference on Neural Networks
Abbreviated title	IJCNN 2014
Country/Territory	China
City	Beijing
Period	6 Jul 2014 → 11 Jul 2014

Bibliographical note

© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Access to Document

10.1109/IJCNN.2014.6889637

Weight selection algorithm for multi-objective multi-agent reinforcement learning
Accepted author manuscript, 4.16 MB

http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6889637

Cite this

@inproceedings{10ce4457c4274c7b80cddb04a6a421dc,

title = "A novel adaptive weight selection algorithm for multi-objective multi-agent reinforcement learning",

abstract = "To solve multi-objective problems, multiple reward signals are often scalarized into a single value and further processed using established single-objective problem solving techniques. While the field of multi-objective optimization has made many advances in applying scalarization techniques to obtain good solution trade-offs, the utility of applying these techniques in the multi-objective multi-agent learning domain has not yet been thoroughly investigated. Agents learn the value of their decisions by linearly scalarizing their reward signals at the local level, while acceptable system wide behaviour results. However, the non-linear relationship between weighting parameters of the scalarization function and the learned policy makes the discovery of system wide trade-offs time consuming. Our first contribution is a thorough analysis of well known scalarization schemes within the multi-objective multi-agent reinforcement learning setup. The analysed approaches intelligently explore the weight-space in order to find a wider range of system trade-offs. In our second contribution, we propose a novel adaptive weight algorithm which interacts with the underlying local multi-objective solvers and allows for a better coverage of the Pareto front. Our third contribution is the experimental validation of our approach by learning bi-objective policies in self-organising smart camera networks. We note that our algorithm (i) explores the objective space faster on many problem instances, (ii) obtained solutions that exhibit a larger hypervolume, while (iii) acquiring a greater spread in the objective space.",

author = "{van Moffaert}, Kristof and Tim Brys and Arjun Chandra and Lukas Esterle and Lewis, {Peter R.} and Ann Now{\'e}",

note = "{\textcopyright} 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.; 2014 International Joint Conference on Neural Networks, IJCNN 2014 ; Conference date: 06-07-2014 Through 11-07-2014",

year = "2014",

doi = "10.1109/IJCNN.2014.6889637",

language = "English",

isbn = "978-1-4799-6627-1",

pages = "2306--2314",

booktitle = "Proceedings of the International Joint Conference on Neural Networks",

publisher = "IEEE",

address = "United States",

}

van Moffaert, K, Brys, T, Chandra, A, Esterle, L, Lewis, PR & Nowé, A 2014, A novel adaptive weight selection algorithm for multi-objective multi-agent reinforcement learning. in Proceedings of the International Joint Conference on Neural Networks. IEEE, pp. 2306-2314, 2014 International Joint Conference on Neural Networks, Beijing, China, 6/07/14. https://doi.org/10.1109/IJCNN.2014.6889637

TY - GEN

T1 - A novel adaptive weight selection algorithm for multi-objective multi-agent reinforcement learning

AU - van Moffaert, Kristof

AU - Brys, Tim

AU - Chandra, Arjun

AU - Esterle, Lukas

AU - Lewis, Peter R.

AU - Nowé, Ann

N1 - © 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

PY - 2014

Y1 - 2014

AB - To solve multi-objective problems, multiple reward signals are often scalarized into a single value and further processed using established single-objective problem solving techniques. While the field of multi-objective optimization has made many advances in applying scalarization techniques to obtain good solution trade-offs, the utility of applying these techniques in the multi-objective multi-agent learning domain has not yet been thoroughly investigated. Agents learn the value of their decisions by linearly scalarizing their reward signals at the local level, while acceptable system wide behaviour results. However, the non-linear relationship between weighting parameters of the scalarization function and the learned policy makes the discovery of system wide trade-offs time consuming. Our first contribution is a thorough analysis of well known scalarization schemes within the multi-objective multi-agent reinforcement learning setup. The analysed approaches intelligently explore the weight-space in order to find a wider range of system trade-offs. In our second contribution, we propose a novel adaptive weight algorithm which interacts with the underlying local multi-objective solvers and allows for a better coverage of the Pareto front. Our third contribution is the experimental validation of our approach by learning bi-objective policies in self-organising smart camera networks. We note that our algorithm (i) explores the objective space faster on many problem instances, (ii) obtained solutions that exhibit a larger hypervolume, while (iii) acquiring a greater spread in the objective space.

UR - http://www.scopus.com/inward/record.url?scp=84908474542&partnerID=8YFLogxK

U2 - 10.1109/IJCNN.2014.6889637

DO - 10.1109/IJCNN.2014.6889637

M3 - Conference publication

AN - SCOPUS:84908474542

SN - 978-1-4799-6627-1

SP - 2306

EP - 2314

BT - Proceedings of the International Joint Conference on Neural Networks

PB - IEEE

T2 - 2014 International Joint Conference on Neural Networks

Y2 - 6 July 2014 through 11 July 2014

ER -
