Globally optimal learning rates in multilayer neural networks

David Saad, Magnus Rattray

    Research output: Contribution to journal › Article › peer-review

    Abstract

    A method for calculating the globally optimal learning rate in on-line gradient-descent training of multilayer neural networks is presented. The method is based on a variational approach which maximizes the decrease in generalization error over a given time frame. We demonstrate the method by computing optimal learning rates in typical learning scenarios. The method can also be employed when different learning rates are allowed for different parameter vectors, as well as to determine the relevance of related training algorithms based on modifications to the basic gradient-descent rule.
    Original language: English
    Pages (from-to): 1523-1530
    Number of pages: 8
    Journal: Philosophical Magazine Part B
    Volume: 77
    Issue number: 5
    Publication status: Published - May 1998
    Event: MINERVA workshop on Mesoscopics, Fractals and Neural Networks, 25-27 March 1997, Eilat, Israel
    Duration: 1 May 1998 - 1 May 1998

    Bibliographical note

    Proceedings of the MINERVA workshop on Mesoscopics, Fractals and Neural Networks, 25-27 March 1997, Eilat (IL). This is an electronic version of an article published in Saad, David and Rattray, Magnus (1998). Globally optimal learning rates in multilayer neural networks. Philosophical Magazine Part B, 77 (5), pp. 1523-1530. Philosophical Magazine Part B is available online at: http://www.informaworld.com/openurl?genre=article&issn=1364-2812&volume=77&issue=5&spage=1523

    Keywords

    • optimal learning rate
    • gradient-descent
    • multilayer neural networks
    • variational approach
    • generalization error
    • gradient descent rule
