Neural Computation

April 1, 1996, Vol. 8, No. 3, Pages 595-609
(doi: 10.1162/neco.1996.8.3.595)
© 1996 Massachusetts Institute of Technology
Minimum Description Length, Regularization, and Multimodal Data

Relationships between clustering, description length, and regularization are pointed out, motivating the introduction of a cost function with a description length interpretation and the unusual and useful property of having its minimum approximated by the densest mode of a distribution. A simple inverse kinematics example is used to demonstrate that this property can be used to select and learn one branch of a multivalued mapping. This property is also used to develop a method for setting regularization parameters according to the scale on which structure is exhibited in the training data. The regularization technique is demonstrated on two real data sets, a classification problem and a regression problem.
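The mode-seeking property described in the abstract can be illustrated with a small sketch. The abstract does not give the article's cost function, so the code below uses a stand-in: a negated sum of Gaussian kernels, which is an assumption chosen only because its minimum falls near the densest cluster of the data. The names `kernel_cost`, `squared_cost`, `argmin_on_grid`, the `scale` parameter, and the synthetic bimodal data are all hypothetical. The sketch contrasts this stand-in with ordinary squared error, whose minimum is the mean and therefore lands between the two modes.

```python
import math

def squared_cost(m, xs):
    # Ordinary sum of squared errors; minimized at the mean of xs.
    return sum((x - m) ** 2 for x in xs)

def kernel_cost(m, xs, scale):
    # Stand-in mode-seeking cost (an assumption, not the article's formula):
    # the negated sum of Gaussian kernels centred on the data points.
    return -sum(math.exp(-((x - m) ** 2) / (2 * scale ** 2)) for x in xs)

def argmin_on_grid(cost, grid):
    # Brute-force minimizer over a 1-D grid of candidate values.
    return min(grid, key=cost)

# Synthetic bimodal data: a dense cluster around 0 and a sparser one around 5.
dense = [-0.5 + i / 59 for i in range(60)]       # 60 points spread over width 1
sparse = [4.0 + 2.0 * j / 39 for j in range(40)]  # 40 points spread over width 2
data = dense + sparse

grid = [-2.0 + 0.01 * k for k in range(1001)]     # candidates from -2 to 8

mean_like = argmin_on_grid(lambda m: squared_cost(m, data), grid)
mode_like = argmin_on_grid(lambda m: kernel_cost(m, data, scale=0.5), grid)

print("squared-error minimizer:", round(mean_like, 2))
print("kernel-cost minimizer:  ", round(mode_like, 2))
```

In this sketch the squared-error minimizer sits between the clusters, where there is no data at all, while the kernel-cost minimizer sits inside the dense cluster. This is the kind of behaviour that would allow one branch of a multivalued mapping to be selected. The `scale` value plays the role of the scale on which structure is exhibited: a much larger value blurs the two clusters together.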