We study the effect of regularization in an on-line gradient-descent learning scenario for a general two-layer student network with an arbitrary number of hidden units. Training examples are randomly drawn input vectors labelled by a two-layer teacher network, also with an arbitrary number of hidden units; the labels may be corrupted by Gaussian output noise. We examine the effect of weight decay regularization on the dynamical evolution of the order parameters and the generalization error in various phases of the learning process, in both noiseless and noisy scenarios.
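The scenario described above can be illustrated with a minimal simulation sketch. The abstract does not fix the details, so everything specific here is an assumption made for illustration: a tanh hidden activation, hidden-to-output weights fixed to one, a squared-error loss, learning rate and weight decay both scaled by 1/N, and the names `train_online`, `order_parameters`, `generalization_error`, `eta`, `gamma` and `noise_std`. It is not the analytical order-parameter treatment of the paper, only a numerical toy of the same setting.

```python
import numpy as np


def train_online(n=100, k=3, m=2, eta=0.2, gamma=0.01,
                 noise_std=0.1, steps=5000, seed=0):
    """On-line gradient descent with weight decay (illustrative sketch).

    n: input dimension, k: student hidden units, m: teacher hidden units.
    eta, gamma: assumed learning rate and weight decay, both scaled by 1/n.
    """
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((m, n)) / np.sqrt(n)  # teacher weights, norm ~ 1
    J = 1e-3 * rng.standard_normal((k, n))        # student weights, small init
    for _ in range(steps):
        x = rng.standard_normal(n)                # fresh random input each step
        # Teacher label, optionally corrupted by Gaussian output noise.
        label = np.tanh(B @ x).sum() + noise_std * rng.standard_normal()
        h = J @ x                                 # student hidden activations
        err = label - np.tanh(h).sum()
        delta = err * (1.0 - np.tanh(h) ** 2)     # backpropagated error per unit
        # Gradient step on the squared error plus a weight decay shrinkage term.
        J += (eta / n) * np.outer(delta, x) - (gamma / n) * J
    return J, B


def order_parameters(J, B):
    """Overlaps: Q = J J^T (student-student), R = J B^T (student-teacher)."""
    return J @ J.T, J @ B.T


def generalization_error(J, B, n_test=5000, seed=1):
    """Monte Carlo estimate of half the mean squared student-teacher gap."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_test, J.shape[1]))
    student = np.tanh(X @ J.T).sum(axis=1)
    teacher = np.tanh(X @ B.T).sum(axis=1)
    return 0.5 * np.mean((student - teacher) ** 2)
```

Calling `train_online` with different values of `gamma` and `noise_std`, then passing the result to `order_parameters` and `generalization_error`, gives a rough numerical picture of how weight decay changes the overlaps and the error in the noiseless and noisy cases.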
|Number of pages||7|
|Journal||Physical Review E|
|Publication status||Published - Feb 1998|
Bibliographical note: Copyright of the American Physical Society
- on-line gradient-descent learning scenario
- weight decay