Monthly
288 pp. per issue
6 x 9, illustrated
ISSN
0899-7667
E-ISSN
1530-888X
2014 Impact factor:
2.21

Neural Computation

January 2008, Vol. 20, No. 1, Pages 252-270
(doi: 10.1162/neco.2008.20.1.252)
© 2007 Massachusetts Institute of Technology
Minimization of Error Functionals over Perceptron Networks
Article PDF (146.47 KB)
Abstract

Supervised learning of perceptron networks is investigated as an optimization problem. It is shown that both the theoretical and the empirical error functionals achieve minima over sets of functions computable by networks with a given number n of perceptrons. Upper bounds on rates of convergence of these minima with n increasing are derived. The bounds depend on a certain regularity of training data expressed in terms of variational norms of functions interpolating the data (in the case of the empirical error) and the regression function (in the case of the expected error). Dependence of this type of regularity on dimensionality and on magnitudes of partial derivatives is investigated. Conditions on the data, which guarantee that a good approximation of global minima of error functionals can be achieved using networks with a limited complexity, are derived. The conditions are in terms of oscillatory behavior of the data measured by the product of a function of the number of variables d, which is decreasing exponentially fast, and the maximum of the magnitudes of the squares of the L1-norms of the iterated partial derivatives of the order d of the regression function or some function, which interpolates the sample of the data. The results are illustrated by examples of data with small and high regularity constructed using Boolean functions and the gaussian function.