An adaptive back-propagation algorithm is studied and compared with gradient descent (standard back-propagation) for on-line learning in two-layer neural networks with an arbitrary number of hidden units. Within a statistical mechanics framework, both numerical studies and a rigorous analysis show that adaptive back-propagation trains faster than gradient descent: it breaks the symmetry between hidden units more efficiently and converges more quickly to optimal generalization.
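The abstract describes on-line learning in a two-layer network of the kind usually analysed in this statistical mechanics framework (a soft committee machine with fixed hidden-to-output weights). The sketch below shows one on-line weight update in NumPy, assuming tanh hidden units; the single parameter `beta` rescaling the backpropagated hidden-unit error is only an illustrative stand-in for an adaptive back-propagation parameter, not necessarily the exact rule analysed in the paper. With `beta = 1` the update reduces to standard gradient descent.

```python
import numpy as np

def online_update(J, x, y_teacher, eta, beta=1.0):
    """One on-line weight update for a soft-committee-machine student.

    Student output: sigma = sum_i tanh(J_i . x), hidden-to-output weights fixed at 1.
    beta = 1 gives plain on-line gradient descent (standard back-propagation);
    beta != 1 rescales the slope of the backpropagated hidden-unit error,
    an illustrative stand-in for an adaptive back-propagation parameter.
    """
    h = J @ x                                    # hidden-unit fields x_i = J_i . x
    sigma = np.tanh(h).sum()                     # student output
    err = y_teacher - sigma                      # output error
    delta = err * (1.0 - np.tanh(beta * h) ** 2) # rescaled hidden-unit error signal
    return J + (eta / x.size) * np.outer(delta, x)

# Usage sketch: a 2-hidden-unit student learning a 2-hidden-unit teacher
# (teacher weights B and all hyperparameters here are hypothetical).
rng = np.random.default_rng(0)
N, K = 100, 2
B = rng.normal(size=(K, N))            # teacher weight vectors
J = 1e-3 * rng.normal(size=(K, N))     # small random student initialisation
for _ in range(10_000):
    x = rng.normal(size=N)             # Gaussian input pattern
    y = np.tanh(B @ x).sum()           # teacher output
    J = online_update(J, x, y, eta=0.1, beta=1.2)  # beta > 1: adaptive variant
```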
| Title of host publication | Proceedings of the neural information processing systems |
| Editors | David S. Touretzky, Michael C. Mozer, Michael E. Hasselmo |
| Place of publication | Boston |
| Publication status | Published - 1996 |
| Conference | Neural Information Processing Systems 95 |
| Period | 1 Jan 1996 → 1 Jan 1996 |
Bibliographical note: Copyright of the Massachusetts Institute of Technology Press (MIT Press).
- adaptive back-propagation
- gradient descent
- neural networks
West, A. H. L., & Saad, D. (1996). Adaptive back-propagation in on-line learning of multilayer networks. In D. S. Touretzky, M. C. Mozer, & M. E. Hasselmo (Eds.), Proceedings of the neural information processing systems (Vol. 8). MIT Press.