Large Deviation Analysis of Function Sensitivity in Random Deep Neural Networks

Research output: Contribution to journal › Article

Abstract

Mean field theory has been successfully used to analyze deep neural networks (DNNs) in the infinite size limit. Given the finite size of realistic DNNs, we utilize large deviation theory and path integral analysis to study the deviation of functions represented by DNNs from their typical mean field solutions. The parameter perturbations investigated include weight sparsification (dilution) and binarization, which are commonly used in model simplification, for both ReLU and sign activation functions. We find that random networks with ReLU activation are more robust to parameter perturbations than their counterparts with sign activation, which is arguably reflected in the simplicity of the functions they generate.
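
As a rough illustration of the experiment the abstract describes, the following is a minimal numerical sketch, not the authors' path-integral or large-deviation calculation: a random fully connected network is evaluated on the same inputs before and after its weights are sparsified or binarized, and the relative change of the output layer is compared for ReLU and sign activations. The width N, depth L, the 1/sqrt(N) pre-activation scaling, the helper names (forward, dilute, binarize) and the deviation measure are all illustrative assumptions. In large-deviation terms, the probability of an atypically large deviation from the mean field behaviour is expected to decay exponentially in the layer width, and the corresponding rate functions are what the paper actually computes.

import numpy as np

rng = np.random.default_rng(0)

def forward(x, weights, act):
    """Propagate input x through the layers whose weight matrices are in `weights`."""
    h = x
    for W in weights:
        pre = W @ h / np.sqrt(len(h))  # 1/sqrt(N) scaling keeps pre-activations O(1)
        h = np.maximum(pre, 0.0) if act == "relu" else np.sign(pre)
    return h

def dilute(weights, p, rng):
    """Sparsify: zero out a random fraction p of each weight matrix (no rescaling here)."""
    out = []
    for W in weights:
        mask = rng.random(W.shape) >= p
        out.append(W * mask)
    return out

def binarize(weights):
    """Replace each weight by its sign (roughly preserving the unit-variance scale)."""
    return [np.sign(W) for W in weights]

N, L, n_inputs = 500, 10, 200
weights = [rng.standard_normal((N, N)) for _ in range(L)]

for act in ("relu", "sign"):
    for name, perturbed in (("dilution 10%", dilute(weights, 0.1, rng)),
                            ("binarization", binarize(weights))):
        devs = []
        for _ in range(n_inputs):
            x = rng.standard_normal(N)
            y0 = forward(x, weights, act)
            y1 = forward(x, perturbed, act)
            # normalized output deviation; smaller means more robust to the perturbation
            devs.append(np.linalg.norm(y0 - y1) / (np.linalg.norm(y0) + 1e-12))
        print(f"{act:5s} | {name:13s} | mean relative output change = {np.mean(devs):.3f}")

Under such a sketch one would typically observe a smaller mean relative output change for the ReLU network than for the sign network at the same perturbation strength, in line with the robustness comparison stated in the abstract.
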
Original language: English
Journal: Journal of Physics A: Mathematical and Theoretical
Publication status: Accepted/In press - 9 Jan 2020

Fingerprint

Random networks
Large deviations
Large deviation theory
Mean field theory
Neural networks
Activation functions
Parameter perturbation
Path integral
Dilution
Binarization
Model simplification
Function sensitivity
Simplicity

Bibliographical note

© 2019 The Authors

Cite this

@article{bd435a2339ec479b8edc510eae3a90d4,
title = "Large Deviation Analysis of Function Sensitivity in Random Deep Neural Networks",
abstract = "Mean field theory has been successfully used to analyze deep neural networks (DNN) in the infinite size limit. Given the finite size of realistic DNN, we utilize the large deviation theory and path integral analysis to study the deviation of functions represented by DNN from their typical mean field solutions. The parameter perturbations investigated include weight sparsification (dilution) and binarization, which are commonly used in model simplification, for both ReLU and sign activation functions. We find that random networks with ReLU activation are more robust to parameter perturbations with respect to their counterparts with sign activation, which arguably is reflected in the simplicity of the functions they generate.",
author = "Bo Li and David Saad",
note = "{\circledC} 2019 The Authors",
year = "2020",
month = "1",
day = "9",
language = "English",
journal = "Journal of Physics A: Mathematical and Theoretical",
issn = "1751-8113",
publisher = "IOP Publishing Ltd.",

}
