Continuous Mixtures of Tractable Probabilistic Models

Authors: Alvaro H.C. Correia, Gennaro Gala, Erik Quaeghebeur, Cassio de Campos, Robert Peharz

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments, we show that this simple scheme proves remarkably effective, as PCs learnt this way set new state of the art for tractable models on many standard density estimation benchmarks.
Researcher Affiliation | Academia | 1 Eindhoven University of Technology; 2 Graz University of Technology
Pseudocode | No | The paper describes methods and processes but does not include any clearly labeled pseudocode blocks or algorithm sections.
Open Source Code | Yes | Further experimental details can be found in Appendix A, and our source code is available at github.com/alcorreia/cm-tpm.
Open Datasets | Yes | We evaluated our method on common benchmarks for generative models, namely 20 standard density estimation datasets (Lowd and Davis 2010; Van Haaren and Davis 2012; Bekker et al. 2015) as well as 4 image datasets (Binary MNIST (Larochelle and Murray 2011), MNIST (LeCun et al. 1998), Fashion MNIST (Xiao, Rasul, and Vollgraf 2017) and Street View House Numbers (SVHN) (Netzer et al. 2011)).
Dataset Splits | Yes | We ran cm(SF) and cm(SCLT) and applied LO to both final models for up to 50 epochs, using early stopping on the validation set to avoid overfitting.
Hardware Specification | No | All models were developed in Python 3 with PyTorch (Paszke et al. 2019) and trained with standard commercial GPUs.
Software Dependencies | No | All models were developed in Python 3 with PyTorch (Paszke et al. 2019) and trained with standard commercial GPUs.
Experiment Setup | Yes | In this set of experiments, we fixed the mixing distribution p(z) to a 4-dimensional standard Gaussian and used N = 2^10 integration points during training. For the decoder we used 6-layer MLPs with LeakyReLU activations. ... We followed the same experimental protocol as in the previous experiments, except that we employed a larger latent dimensionality of 16 and increased the number of integration points during training to 2^14. We did not use convolutions and stuck to 6-layer MLPs. ... For both MNIST and SVHN data, we used the same architecture and trained cm(SF) models with 16 latent dimensions and K = 1 (see Efficient Learning). ... applied LO to both final models for up to 50 epochs, using early stopping on the validation set to avoid overfitting.
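The Experiment Setup row above pins down the core configuration: a 4-dimensional standard Gaussian mixing distribution p(z), N = 2^10 integration points during training, and a 6-layer LeakyReLU MLP decoder. Below is a minimal PyTorch sketch of that configuration, assuming fully factorized Bernoulli leaves (in the spirit of cm(SF)) and plain Monte Carlo samples standing in for whatever integration rule the authors actually use; the class, method, and argument names are illustrative and not taken from the released code.

```python
# Hedged sketch of a continuous mixture over fully factorized Bernoulli models,
# approximated with a finite set of integration points. Names and the sampling
# scheme are assumptions for illustration, not the authors' implementation.
import math
import torch
import torch.nn as nn

class ContinuousMixtureSketch(nn.Module):
    def __init__(self, n_features, latent_dim=4, hidden=256, n_layers=6):
        super().__init__()
        layers, d = [], latent_dim
        for _ in range(n_layers - 1):
            layers += [nn.Linear(d, hidden), nn.LeakyReLU()]
            d = hidden
        # final layer outputs logits of a fully factorized Bernoulli leaf
        layers += [nn.Linear(d, n_features)]
        self.decoder = nn.Sequential(*layers)
        self.latent_dim = latent_dim

    def log_likelihood(self, x, n_points=2 ** 10):
        # Draw integration points z_i ~ p(z) = N(0, I).
        z = torch.randn(n_points, self.latent_dim, device=x.device)
        logits = self.decoder(z)                                  # (N, D)
        # log p(x | z_i) for each sample and each integration point
        log_px_z = -nn.functional.binary_cross_entropy_with_logits(
            logits.unsqueeze(0).expand(x.size(0), -1, -1),        # (B, N, D)
            x.unsqueeze(1).expand(-1, n_points, -1),              # (B, N, D)
            reduction="none",
        ).sum(-1)                                                 # (B, N)
        # log p(x) ≈ log (1/N) Σ_i p(x | z_i)
        return torch.logsumexp(log_px_z, dim=1) - math.log(n_points)
```

With binarized MNIST-style inputs, the training objective would then be the mean negative log-likelihood, e.g. `-ContinuousMixtureSketch(784).log_likelihood(x).mean()` for a float batch `x` of shape (B, 784).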
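The "up to 50 epochs with early stopping on the validation set" protocol quoted in the Dataset Splits and Experiment Setup rows can be realized with a standard early-stopping loop. The sketch below reuses the log_likelihood method from the class above; the patience value, the Adam optimizer, and the data-loader names are assumptions rather than details reported in the paper.

```python
# Hedged sketch of fine-tuning for up to 50 epochs with early stopping on the
# validation set. Hyperparameters and helper names are illustrative assumptions.
import copy
import torch

def finetune_with_early_stopping(model, train_loader, valid_loader,
                                 max_epochs=50, patience=5, lr=1e-3):
    optim = torch.optim.Adam(model.parameters(), lr=lr)
    best_nll, best_state, bad_epochs = float("inf"), None, 0
    for epoch in range(max_epochs):
        model.train()
        for x in train_loader:                       # loaders yield float tensors
            optim.zero_grad()
            loss = -model.log_likelihood(x).mean()   # negative log-likelihood
            loss.backward()
            optim.step()
        # validation NLL drives early stopping
        model.eval()
        with torch.no_grad():
            val_nll = sum(-model.log_likelihood(x).sum().item() for x in valid_loader)
        if val_nll < best_nll:
            best_nll, best_state, bad_epochs = val_nll, copy.deepcopy(model.state_dict()), 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
    if best_state is not None:
        model.load_state_dict(best_state)            # restore best validation model
    return model, best_nll
```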