An adaptive back-propagation algorithm is studied and compared with gradient descent (standard back-propagation) for on-line learning in two-layer neural networks with an arbitrary number of hidden units. Within a statistical mechanics framework, both numerical studies and a rigorous analysis show that adaptive back-propagation trains faster than gradient descent: it breaks the symmetry between hidden units more efficiently and converges more quickly to optimal generalization.
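The abstract describes on-line learning in a two-layer network of the kind usually analysed in this statistical mechanics framework (a soft committee machine with fixed hidden-to-output weights). The sketch below shows one on-line weight update in NumPy, assuming tanh hidden units; the single parameter `beta` rescaling the backpropagated hidden-unit error is only an illustrative stand-in for an adaptive back-propagation parameter, not necessarily the exact rule analysed in the paper. With `beta = 1` the update reduces to standard gradient descent.

```python
import numpy as np

def online_update(J, x, y_teacher, eta, beta=1.0):
    """One on-line weight update for a soft-committee-machine student.

    Student output: sigma = sum_i tanh(J_i . x), hidden-to-output weights fixed at 1.
    beta = 1 gives plain on-line gradient descent (standard back-propagation);
    beta != 1 rescales the slope of the backpropagated hidden-unit error,
    an illustrative stand-in for an adaptive back-propagation parameter.
    """
    h = J @ x                                    # hidden-unit fields x_i = J_i . x
    sigma = np.tanh(h).sum()                     # student output
    err = y_teacher - sigma                      # output error
    delta = err * (1.0 - np.tanh(beta * h) ** 2) # rescaled hidden-unit error signal
    return J + (eta / x.size) * np.outer(delta, x)

# Usage sketch: a 2-hidden-unit student learning a 2-hidden-unit teacher
# (teacher weights B and all hyperparameters here are hypothetical).
rng = np.random.default_rng(0)
N, K = 100, 2
B = rng.normal(size=(K, N))            # teacher weight vectors
J = 1e-3 * rng.normal(size=(K, N))     # small random student initialisation
for _ in range(10_000):
    x = rng.normal(size=N)             # Gaussian input pattern
    y = np.tanh(B @ x).sum()           # teacher output
    J = online_update(J, x, y, eta=0.1, beta=1.2)  # beta > 1: adaptive variant
```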
| Title of host publication | Proceedings of the neural information processing systems |
| Editors | David S. Touretzky, Michael C. Mozer, Michael E. Hasselmo |
| Place of publication | Boston |
| Publication status | Published - 1996 |
| Conference | Neural Information Processing Systems 95 |
| Period | 1 Jan 1996 → 1 Jan 1996 |
Bibliographical note: Copyright of the Massachusetts Institute of Technology Press (MIT Press).
- adaptive back-propagation
- gradient descent
- neural networks
West, A. H. L., & Saad, D. (1996). Adaptive back-propagation in on-line learning of multilayer networks. In D. S. Touretzky, M. C. Mozer, & M. E. Hasselmo (Eds.), Proceedings of the neural information processing systems (Vol. 8). MIT Press.