Factoring Variations in Natural Images with Deep Gaussian Mixture Models

Authors: Aaron van den Oord, Benjamin Schrauwen

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our density estimation experiments we show that deeper GMM architectures generalize better than more shallow ones, with results in the same ballpark as the state of the art.
Researcher Affiliation | Academia | Aaron van den Oord, Benjamin Schrauwen, Electronics and Information Systems department (ELIS), Ghent University {aaron.vandenoord, benjamin.schrauwen}@ugent.be
Pseudocode | No | The paper describes the Expectation-Maximization (EM) algorithm in detail with equations and prose, but it does not present a clearly labeled "Pseudocode" or "Algorithm" block.
Open Source Code | No | The paper does not provide any statement about the availability of source code or a link to a code repository.
Open Datasets | Yes | For our experiments we used the Berkeley Segmentation Dataset (BSDS300) [13], which is a commonly used benchmark for density modeling of image patches and the tiny images dataset [14].
Dataset Splits | Yes | The only hyperparameters were the number of components for each layer, which were optimized on a validation set.
Hardware Specification | No | The paper discusses parallelization and distributed computation but does not specify the hardware used for the experiments, such as CPU models, GPU models, or memory.
Software Dependencies | No | The paper mentions using "LBFGS-B" for optimization, but it does not give version numbers for this or any other software dependency.
Experiment Setup | Yes | In all the experiments described in this section, we used the following setup for training Deep GMMs. We used the hard-EM variant, with the aforementioned heuristic in the E-step. For each M-step we used LBFGS-B for 1000 iterations by using equations (13) and (14) for the objective and gradient. The total number of iterations we used for EM was fixed to 100, although fewer iterations were usually sufficient. The only hyperparameters were the number of components for each layer, which were optimized on a validation set.
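
For context on the model these rows describe: a deep GMM stacks layers of linear transformations on top of a standard normal, and each path through the layers (one transformation per layer) composes into an ordinary Gaussian, so the whole model is a GMM with one component per path. Below is a minimal sketch of that density in Python; `deep_gmm_logpdf` is a hypothetical helper, it assumes uniform mixing weights for brevity (the paper learns them), and it enumerates every path, which is only feasible for small configurations.

```python
import numpy as np
from itertools import product
from scipy.special import logsumexp
from scipy.stats import multivariate_normal

def deep_gmm_logpdf(X, layers):
    """Log-density of a deep GMM under uniform mixing weights.

    `layers` is a list (top layer first) of lists of (A, b) affine
    transformations; a path picks one (A, b) per layer and applies them
    in order to z ~ N(0, I), yielding one Gaussian component per path.
    """
    d = X.shape[1]
    path_logps = []
    for path in product(*layers):       # enumerate all N1*...*NL paths
        mean, cov = np.zeros(d), np.eye(d)
        for A, b in path:               # y ~ N(m, S)  =>  Ay+b ~ N(Am+b, ASA^T)
            mean = A @ mean + b
            cov = A @ cov @ A.T
        cov += 1e-9 * np.eye(d)         # jitter guards against singular paths
        path_logps.append(multivariate_normal.logpdf(X, mean=mean, cov=cov))
    logp = np.stack(path_logps, axis=1) # shape (n_points, n_paths)
    return logsumexp(logp, axis=1) - np.log(logp.shape[1])
```

Training avoids this full enumeration: the heuristic E-step the authors mention searches over paths rather than scoring all of them.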
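And a sketch of the training recipe quoted under Experiment Setup: hard E-step assignments followed by an L-BFGS-B M-step, with the paper's budgets of 100 EM steps and 1000 optimizer iterations. This is not the authors' code: it covers only a one-layer GMM with a generic mean-plus-Cholesky parameterization, and it lets SciPy estimate gradients numerically where the paper supplies the analytic objective and gradient of its equations (13) and (14); the heuristic E-step over paths for the deep case is likewise omitted.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import multivariate_normal

def hard_em_gmm(X, n_components, n_em_iters=100, seed=0):
    """Hard-EM for a one-layer GMM with an L-BFGS-B M-step."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    tril = np.tril_indices(d)
    means = X[rng.choice(n, n_components, replace=False)].copy()
    chols = [np.eye(d) for _ in range(n_components)]   # cov_k = L_k L_k^T

    def nll(params, Xk):
        # Unpack mean and lower-triangular Cholesky factor from the flat vector.
        mu, L = params[:d], np.zeros((d, d))
        L[tril] = params[d:]
        cov = L @ L.T + 1e-6 * np.eye(d)  # jitter keeps cov positive definite
        return -multivariate_normal.logpdf(Xk, mean=mu, cov=cov).sum()

    for _ in range(n_em_iters):           # the paper fixes this budget to 100
        # Hard E-step: each point goes to its single most likely component.
        logp = np.stack([multivariate_normal.logpdf(
            X, mean=means[k], cov=chols[k] @ chols[k].T + 1e-6 * np.eye(d))
            for k in range(n_components)], axis=1)
        assign = logp.argmax(axis=1)
        # M-step: refit each component with L-BFGS-B (paper: 1000 iterations).
        for k in range(n_components):
            Xk = X[assign == k]
            if len(Xk) <= d:
                continue                   # skip components with too few points
            x0 = np.concatenate([means[k], chols[k][tril]])
            res = minimize(nll, x0, args=(Xk,), method="L-BFGS-B",
                           options={"maxiter": 1000})
            means[k], chols[k][tril] = res.x[:d], res.x[d:]
    return means, chols
```

A flat GMM admits a closed-form M-step; the optimization-based M-step matters in the deep case, where each layer's transformations are shared across many paths and no closed-form update exists.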