Discrete Symmetric Priors for Sparse Coding

Abstract

A standard model to explain the receptive fields of simple cells in the primary visual cortex is Sparse Coding (SC) [1]. However, the update equations used to train this model are not derivable in closed form. As a consequence, most state-of-the-art sparse coding versions use the MAP estimate for inference and training. Furthermore, it is not known if continuous hidden variables represent the best choice, e.g., for sparse coding as model for V1 processing. By using binary hidden variables, for instance, Binary Sparse Coding (BSC) [2], or [3], alternative priors with discrete hidden variables have been investigated in the past. The binary hidden space allows for analytically derivable update rules in closed-form and thus does not require a MAP estimation. However, in contrast, e.g., to Laplace priors, the Bernoulli distribution is not symmetric and its mean is not zero. To study the implications of discrete hidden variables independent of differences in prior symmetries, we, in this work, investigate a generative model with symmetrical and discrete prior distribution. Furthermore, a generative model with such a prior directly connects to recent sparse coding versions with hard-sparseness constraint (compare, e.g., [4]). As model for a discrete and symmetric prior, we use a multinomial distribution for hidden variables that can take on the values -1,0 and 1. In numerical experiments, we train the model using Expectation Truncation (ET) [5], a variational EM method which uses a preselection of hidden variables to increase learning efficiency. To show the effectiveness of the algorithm, we adjusted the linear bars test described in the BSC paper to fit our model. In the linear bars test the model was able to learn both the basis functions and the data noise. The linear bars test also provides considerable evidence that training the parameters using ET reduces the number of local optima. In experiments on more realistic data, we applied the algorithm to 50,000 large scale image patches (26x26 pixels) taken from the van Hateren image data base [6] and pre-processed with pseudo-whitening, using massive parallel computing. In this experiment, we obtained Gabor-like basis functions with similar properties as reported for receptive fields of V1 simple cells. We analyze the obtained Gabors and discuss differences and similarities to different sparse coding versions in the literature.

Publication
In Bernstein Conference 2011.
comments powered by Disqus