The theory of on-line learning: a statistical physics approach

    Research output: Chapter in Book/Published conference output › Chapter


    In this paper we review recent theoretical approaches for analysing the dynamics of on-line learning in multilayer neural networks using methods adopted from statistical physics. The analysis is based on monitoring a set of macroscopic variables from which the generalisation error can be calculated. A closed set of dynamical equations for the macroscopic variables is derived analytically and solved numerically. The theoretical framework is then employed for defining optimal learning parameters and for analysing the incorporation of second order information into the learning process using natural gradient descent and matrix-momentum based methods. We will also briefly explain an extension of the original framework for analysing the case where training examples are sampled with repetition.
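    The order-parameter idea behind this framework can be illustrated in miniature. The sketch below is a minimal, hedged illustration (not the paper's multilayer soft-committee-machine analysis): it simulates on-line Hebbian learning of a single-layer perceptron teacher and monitors one macroscopic variable, the student–teacher overlap, from which the generalisation error follows in closed form for Gaussian inputs. All names (`gen_error`, `J`, `B`) and parameter choices are illustrative assumptions, not taken from the paper.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    N = 500                      # input dimension
    eta = 1.0                    # learning rate
    steps = 20 * N               # number of on-line examples (scaled time t = steps / N)

    B = rng.standard_normal(N)   # teacher weight vector (the fixed rule to be learned)
    B /= np.linalg.norm(B)
    J = rng.standard_normal(N) * 0.01   # student weights, near-zero initial overlap

    def gen_error(J, B):
        # For sign-output perceptrons on Gaussian inputs, the generalisation
        # error depends on the weights only through the macroscopic overlap
        # rho = J.B / (|J||B|):  eps_g = arccos(rho) / pi.
        rho = J @ B / (np.linalg.norm(J) * np.linalg.norm(B))
        return np.arccos(np.clip(rho, -1.0, 1.0)) / np.pi

    for step in range(steps):
        x = rng.standard_normal(N)       # one fresh example per step (on-line learning)
        y = np.sign(B @ x)               # teacher label
        J += (eta / N) * y * x           # Hebbian on-line update

    print(f"final generalisation error: {gen_error(J, B):.3f}")
    ```

    The point of the sketch is the reduction it relies on: the N-dimensional weight dynamics enter the generalisation error only through a handful of scalar overlaps. This is the same macroscopic reduction that, in the multilayer setting reviewed here, yields a closed set of dynamical equations for the order parameters in the large-N limit.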
    Original language: English
    Title of host publication: Exploratory Data Analysis in Empirical Research: Proceedings of the 25th Annual Conference of the Gesellschaft für Klassifikation e.V., University of Munich, March 14-16, 2001
    Editors: Manfred Schwaiger, Otto Opitz
    Number of pages: 10
    ISBN (Print): 9783540441830
    Publication status: Published - 2003
    Event: Studies in Classification, Data Analysis and Knowledge Organization
    Duration: 1 Jan 2003 – 1 Jan 2003

    Publication series

    Name: Studies in Classification, Data Analysis, and Knowledge Organization



    Bibliographical note

    The original publication is available at


    Keywords

    • on-line learning
    • neural networks
    • statistical physics
    • natural gradient descent
    • matrix-momentum
    • repetition


