Globally optimal learning rates in multilayer neural networks

David Saad; Magnus Rattray

doi:10.1080/13642819808205044

Research output: Contribution to journal › Article › peer-review

Abstract

A method for calculating the globally optimal learning rate in on-line gradient-descent training of multilayer neural networks is presented. The method is based on a variational approach which maximizes the decrease in generalization error over a given time frame. We demonstrate the method by computing optimal learning rates in typical learning scenarios. The method can also be employed when different learning rates are allowed for different parameter vectors as well as to determine the relevance of related training algorithms based on modifications to the basic gradient descent rule.

Original language	English
Pages (from-to)	1523-1530
Number of pages	8
Journal	Philosophical Magazine Part B
Volume	77
Issue number	5
DOIs	https://doi.org/10.1080/13642819808205044
Publication status	Published - May 1998
Event	MINERVA workshop on Mesoscopics, Fractals and Neural Networks 25 - 27 March, 1997, Eilat, Israel - Duration: 1 May 1998 → 1 May 1998

Bibliographical note

Proceedings of the MINERVA workshop on Mesoscopics, Fractals and Neural Networks, 25-27 March 1997, Eilat (IL). This is an electronic version of an article published in Saad, David and Rattray, Magnus (1998). Globally optimal learning rates in multilayer neural networks. Philosophical Magazine Part B, 77 (5), pp. 1523-1530. Philosophical Magazine Part B is available online at: http://www.informaworld.com/openurl?genre=article&issn=1364-2812&volume=77&issue=5&spage=1523

Keywords

optimal learning rate
gradient-descent
multilayer neural networks
variational approach
generalization error
gradient descent rule

Cite this

@article{2a8a07d3051c467db8c69837649d86ed,
  title = "Globally optimal learning rates in multilayer neural networks",
  abstract = "A method for calculating the globally optimal learning rate in on-line gradient-descent training of multilayer neural networks is presented. The method is based on a variational approach which maximizes the decrease in generalization error over a given time frame. We demonstrate the method by computing optimal learning rates in typical learning scenarios. The method can also be employed when different learning rates are allowed for different parameter vectors as well as to determine the relevance of related training algorithms based on modifications to the basic gradient descent rule.",
  keywords = "optimal learning rate, gradient-descent, multilayer neural networks, variational approach, generalization error, gradient descent rule",
  author = "David Saad and Magnus Rattray",
  note = "Proceedings of the MINERVA workshop on Mesoscopics, Fractals and Neural Networks, 25-27 March 1997, Eilat (IL). This is an electronic version of an article published in Saad, David and Rattray, Magnus (1998). Globally optimal learning rates in multilayer neural networks. Philosophical Magazine Part B, 77 (5), pp. 1523-1530. Philosophical Magazine Part B is available online at: http://www.informaworld.com/openurl?genre=article&issn=1364-2812&volume=77&issue=5&spage=1523; MINERVA workshop on Mesoscopics, Fractals and Neural Networks 25 - 27 March, 1997, Eilat, Israel ; Conference date: 01-05-1998 Through 01-05-1998",
  year = "1998",
  month = may,
  doi = "10.1080/13642819808205044",
  language = "English",
  volume = "77",
  pages = "1523--1530",
  journal = "Philosophical Magazine Part B",
  issn = "1364-2812",
  publisher = "Taylor & Francis",
  number = "5",
}

TY  - JOUR
T1  - Globally optimal learning rates in multilayer neural networks
AU  - Saad, David
AU  - Rattray, Magnus
N1  - Proceedings of the MINERVA workshop on Mesoscopics, Fractals and Neural Networks, 25-27 March 1997, Eilat (IL). This is an electronic version of an article published in Saad, David and Rattray, Magnus (1998). Globally optimal learning rates in multilayer neural networks. Philosophical Magazine Part B, 77 (5), pp. 1523-1530. Philosophical Magazine Part B is available online at: http://www.informaworld.com/openurl?genre=article&issn=1364-2812&volume=77&issue=5&spage=1523
PY  - 1998/5
Y1  - 1998/5
N2  - A method for calculating the globally optimal learning rate in on-line gradient-descent training of multilayer neural networks is presented. The method is based on a variational approach which maximizes the decrease in generalization error over a given time frame. We demonstrate the method by computing optimal learning rates in typical learning scenarios. The method can also be employed when different learning rates are allowed for different parameter vectors as well as to determine the relevance of related training algorithms based on modifications to the basic gradient descent rule.
AB  - A method for calculating the globally optimal learning rate in on-line gradient-descent training of multilayer neural networks is presented. The method is based on a variational approach which maximizes the decrease in generalization error over a given time frame. We demonstrate the method by computing optimal learning rates in typical learning scenarios. The method can also be employed when different learning rates are allowed for different parameter vectors as well as to determine the relevance of related training algorithms based on modifications to the basic gradient descent rule.
KW  - optimal learning rate
KW  - gradient-descent
KW  - multilayer neural networks
KW  - variational approach
KW  - generalization error
KW  - gradient descent rule
UR  - http://www.scopus.com/inward/record.url?scp=0032069362&partnerID=8YFLogxK
UR  - http://www.informaworld.com/openurl?genre=article&issn=1364-2812&volume=77&issue=5&spage=1523
U2  - 10.1080/13642819808205044
DO  - 10.1080/13642819808205044
M3  - Article
SN  - 1364-2812
VL  - 77
SP  - 1523
EP  - 1530
JO  - Philosophical Magazine Part B
JF  - Philosophical Magazine Part B
IS  - 5
T2  - MINERVA workshop on Mesoscopics, Fractals and Neural Networks 25 - 27 March, 1997, Eilat, Israel
Y2  - 1 May 1998 through 1 May 1998
ER  -
