The evolution of learning systems: to Bayes or not to be

Nestor Caticha*, Juan Pablo Neirotti

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Bayesian algorithms set a limit on the performance that learning algorithms can achieve. Natural selection should guide the evolution of information-processing systems towards those limits. What can we learn from this evolution, and what properties do the intermediate stages have? Although this question is too general to admit a definite answer, progress can be made by restricting the class of information-processing systems under study. We present analytical and numerical results for the evolution of on-line algorithms for learning from examples with neural network classifiers, which may or may not include a hidden layer. The analytical results are obtained by solving a variational problem to determine the learning algorithm that leads to maximum generalization ability. Simulations using evolutionary programming, applied to programs that implement learning algorithms, confirm and extend these results. The principal result is not merely that evolution proceeds towards a Bayesian limit; that limit is essentially reached. In addition, we find that evolution is driven by the discovery of useful structures, i.e. combinations of variables and operators. The temporal order in which such combinations are discovered is the same across different runs. Most notably, combinations that signal the surprise brought by an example always arise before combinations that serve to gauge the performance of the learning algorithm; these latter structures can be used to implement annealing schedules. The temporal ordering can also be understood analytically, by carrying out the functional optimization in restricted functional spaces. We also present data suggesting that the appearance of these traits follows the same temporal ordering in biological systems.
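
The on-line learning scenario summarized in the abstract can be made concrete with a small sketch. The Python snippet below (illustrative only, not the authors' code) trains a student perceptron on examples labelled by a fixed teacher, scaling each Hebbian weight update by a modulation function of the example's stability, i.e. of how surprising the example is to the current student. The two modulation functions shown are standard textbook choices, stand-ins for the variationally optimal function discussed in the paper; all names and parameters here are our own illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
N = 200        # input dimension (illustrative choice)
steps = 2000   # number of on-line examples, each seen exactly once

# A fixed teacher defines the rule to be learned; the student starts at random.
teacher = rng.normal(size=N)
teacher /= np.linalg.norm(teacher)
student = rng.normal(size=N)

def hebbian(stability):
    # flat modulation: every example counts equally
    return 1.0

def error_driven(stability):
    # learn only from surprising (misclassified) examples
    return 1.0 if stability < 0 else 0.0

def generalization_error(w):
    # for isotropic inputs, eg = arccos(overlap with teacher) / pi
    overlap = w @ teacher / np.linalg.norm(w)
    return np.arccos(np.clip(overlap, -1.0, 1.0)) / np.pi

modulation = error_driven
for t in range(steps):
    x = rng.normal(size=N) / np.sqrt(N)   # fresh example (on-line setting)
    label = np.sign(teacher @ x)          # teacher's classification
    stability = label * (student @ x)     # > 0 iff the student already agrees
    student += modulation(stability) * label * x   # modulated Hebbian update

print("generalization error after", steps, "examples:",
      round(generalization_error(student), 3))

In the paper it is the modulation function itself that evolves: evolutionary programming searches over programs built from such variables and operators, and, per the abstract, quantities that gauge current performance (the kind that could drive annealing schedules) emerge only after the surprise-signalling ones.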

Original language: English
Title of host publication: Bayesian inference and maximum entropy methods in science and engineering
Editors: Ali Mohammad-Djafari
Publisher: AIP
Pages: 203-210
Number of pages: 8
ISBN (Print): 978-0-7354-0371-6
DOIs: https://doi.org/10.1063/1.2423276
Publication status: Published - 29 Dec 2006
Event: Bayesian inference and maximum entropy methods in science and engineering - Paris, France
Duration: 8 Jul 2006 - 13 Jul 2006

Publication series

Name: AIP conference proceedings
Publisher: AIP
Volume: 872
ISSN (Print): 0094-243X
ISSN (Electronic): 1551-7616

Conference

Conference: Bayesian inference and maximum entropy methods in science and engineering
Country: France
City: Paris
Period: 8/07/06 - 13/07/06

Fingerprint

learning · schedules · classifiers · programming · operators · optimization · annealing · simulation

Cite this

Caticha, N., & Neirotti, J. P. (2006). The evolution of learning systems: to Bayes or not to be. In A. Mohammad-Djafari (Ed.), Bayesian inference and maximum entropy methods in science and engineering (pp. 203-210). (AIP conference proceedings; Vol. 872). AIP. https://doi.org/10.1063/1.2423276