The role of biases in on-line learning of two-layer networks

Ansgar H.L. West, David Saad

Research output: Contribution to journal › Article

Abstract

The influence of biases on the learning dynamics of a two-layer neural network, a normalized soft-committee machine, is studied for on-line gradient descent learning. Within a statistical mechanics framework, numerical studies show that the inclusion of adjustable biases dramatically alters the learning dynamics found previously. The symmetric phase, which has often been predominant in the original model, all but disappears for a non-degenerate bias task. The extended model furthermore exhibits a much richer dynamical behavior, e.g., attractive suboptimal symmetric phases even for realizable cases and noiseless data.
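The setting is a soft-committee machine whose output is a normalized sum of hidden-unit activations, each with an adjustable bias, trained on-line (one fresh example per step) by gradient descent on the squared error against a teacher with non-degenerate biases. The sketch below is a minimal illustration of that setup, not the authors' code: the erf activation, the 1/K output normalization, the 1/N weight-update scaling, and the learning rate are assumptions chosen for illustration only.

```python
import numpy as np
from scipy.special import erf

rng = np.random.default_rng(0)

N, K, M = 100, 3, 3        # input dimension, student and teacher hidden units
eta = 0.1                  # learning rate (illustrative choice)

# Teacher: fixed weights and non-degenerate (distinct) biases defining the task.
B = rng.standard_normal((M, N)) / np.sqrt(N)
theta_t = np.linspace(-1.0, 1.0, M)

# Student: adjustable weights and biases, trained on-line (one example per step).
W = rng.standard_normal((K, N)) * 0.01
theta_s = np.zeros(K)

def output(weights, biases, x, norm):
    """Soft-committee output: normalized sum of erf activations of the fields."""
    return np.sum(erf((weights @ x + biases) / np.sqrt(2))) / norm

for step in range(10000):
    x = rng.standard_normal(N)                 # new random example each step
    y = output(B, theta_t, x, M)               # teacher label (noiseless)
    fields = W @ x + theta_s                   # student pre-activation fields
    s = output(W, theta_s, x, K)               # student prediction
    delta = (y - s) / K                        # error times output normalization
    gprime = np.sqrt(2.0 / np.pi) * np.exp(-fields**2 / 2.0)  # d/dh erf(h/sqrt(2))
    W += (eta / N) * np.outer(delta * gprime, x)   # weight update, 1/N scaling
    theta_s += eta * delta * gprime                # bias update, O(1) scaling

print("final student biases:", theta_s)
```

Running this for longer, or sweeping the learning rate, gives a rough empirical counterpart to the phase behavior discussed in the paper; the statistical mechanics analysis itself works with order parameters (overlaps between student and teacher vectors) rather than with individual weights.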
Original language: English
Pages (from-to): 3265-3291
Number of pages: 27
Journal: Physical Review E
Volume: 57
Issue number: 3
DOIs: 10.1103/PhysRevE.57.3265
Publication status: Published - Mar 1998

Bibliographical note

Copyright of the American Physical Society

Keywords

  • learning dynamics
  • two-layer neural network
  • soft-committee machine
  • on-line gradient descent learning

Cite this

West, Ansgar H.L.; Saad, David. The role of biases in on-line learning of two-layer networks. In: Physical Review E. 1998; Vol. 57, No. 3, pp. 3265-3291. doi: 10.1103/PhysRevE.57.3265. Available at: http://prola.aps.org/pdf/PRE/v57/i3/p3265_1