Abstract
We complement recent advances in thermodynamic limit analyses of mean on-line gradient descent learning dynamics in multi-layer networks by calculating fluctuations possessed by finite dimensional systems. Fluctuations from the mean dynamics are largest at the onset of specialisation as student hidden unit weight vectors begin to imitate specific teacher vectors, increasing with the degree of symmetry of the initial conditions. In light of this, we include a term to stimulate asymmetry in the learning process, which typically also leads to a significant decrease in training time.
Original language | English |
---|---|
Pages (from-to) | 151-156 |
Number of pages | 6 |
Journal | Europhysics Letters |
Volume | 34 |
Issue number | 2 |
Publication status | Published - Apr 1996 |
Bibliographical note
Copyright of EDP Sciences
Keywords
- probability theory, stochastic processes, and statistics