Augment and Reduce: Stochastic Inference for Large Categorical Distributions

Authors: Francisco Ruiz, Michalis Titsias, Adji Bousso Dieng, David Blei

ICML 2018

Reproducibility Variable | Result | LLM Response (supporting evidence)
Research Type | Experimental | "On several large-scale classification problems, we show that A&R provides a tighter bound on the marginal likelihood and has better predictive performance than existing approaches. We study A&R on linear classification tasks with up to 10^4 classes. On simulated and real data, we find that it provides accurate estimates of the categorical probabilities and gives better performance than existing approaches." (Section 4, Experiments)
Researcher Affiliation | Academia | "1 University of Cambridge. 2 Columbia University. 3 Athens University of Economics and Business. Correspondence to: Francisco J. R. Ruiz <f.ruiz@eng.cam.ac.uk, f.ruiz@columbia.edu>."
Pseudocode | Yes | "Algorithm 1 Softmax A&R for classification." "Algorithm 2 General A&R for classification."
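Algorithms 1 and 2 themselves are not reproduced in this report. As a rough, self-contained illustration of the idea they build on — estimating the softmax normalizer from a random subset of classes instead of summing over all K — here is a sampled-softmax-style sketch. This is a standard estimator, not the paper's exact A&R bound, and all names below are ours:

```python
import math
import random

def log_softmax_exact(logits, y):
    # Exact log p(y | x): requires the full sum over all K classes.
    return logits[y] - math.log(sum(math.exp(f) for f in logits))

def log_softmax_subsampled(logits, y, num_neg, rng):
    # "Reduce"-style estimate: draw a subset S of negative classes and
    # scale their contribution by (K - 1) / |S|, so that the normalizer
    # estimate z_hat is unbiased.  (Sampled-softmax sketch, not the
    # paper's exact A&R bound.)
    K = len(logits)
    S = rng.sample([k for k in range(K) if k != y], num_neg)
    z_hat = math.exp(logits[y]) + (K - 1) / num_neg * sum(math.exp(logits[k]) for k in S)
    return logits[y] - math.log(z_hat)

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # K = 1000 classes
exact = log_softmax_exact(logits, y=3)
approx = sum(log_softmax_subsampled(logits, 3, 50, rng) for _ in range(200)) / 200
```

Note that because E[log z_hat] ≤ log z by Jensen's inequality, plugging an unbiased normalizer estimate into the log does not yield a lower bound on log p(y); the augmentation step of A&R is what makes class subsampling compatible with a proper lower bound on the marginal likelihood.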
Open Source Code | Yes | "Code for A&R is available at https://github.com/franrruiz/augment-reduce."
Open Datasets | Yes | "We consider MNIST and Bibtex (Katakis et al., 2008; Prabhu & Varma, 2014), where we can compare against the exact softmax. We also analyze Omniglot (Lake et al., 2015), EURLex-4K (Mencia & Furnkranz, 2008; Bhatia et al., 2015), and AmazonCat-13K (McAuley & Leskovec, 2013). MNIST is available at http://yann.lecun.com/exdb/mnist. Omniglot can be found at https://github.com/brendenlake/omniglot. Bibtex, EURLex-4K, and AmazonCat-13K are available at http://manikvarma.org/downloads/XC/XMLRepository.html."
Dataset Splits | No | Table 1 lists Ntrain and Ntest for all datasets, implying train/test splits, but the paper does not report separate validation-set sizes or percentages for any dataset.
Hardware Specification | No | The paper states "We run each approach on one CPU core" for the synthetic dataset, but does not specify the CPU model, GPU models, memory, or any other hardware details used in the experiments.
Software Dependencies | No | The paper mentions using algorithms such as RMSProp and Adagrad for step-size control, but does not list software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | "We initialize the weights and biases randomly, drawing from a Gaussian distribution with zero mean and standard deviation 0.1 (0.001 for the biases). We set the step size using the default parameters, i.e., ρ(t) = ρ0 · t^(−1/2+10^−16) / (1 + √(s(t))), where s(t) = 0.1 (g(t))^2 + 0.9 s(t−1). We set ρ0 = 0.02 and we additionally decrease ρ0 by a factor of 0.9 every 2000 iterations. We set the step size α(t) in Algorithm 1 as α(t) = (1 + t)^(−0.9), the default values suggested by Hoffman et al. (2013). For the step size α(t) in Algorithm 2, we set α(t) = 0.01 (1 + t)^(−0.9). We set the minibatch sizes |B| and |S| beforehand. The specific values for each dataset are also in Table 1."
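For concreteness, the quoted step-size schedules can be sketched in Python. This is a minimal sketch of the formulas as stated: the function names and the convention of passing the running state s(t) explicitly are ours, and t is assumed to start at 1 for ρ(t):

```python
import math

def rho_schedule(t, g, s_prev, rho0=0.02, eps=1e-16):
    """RMSProp-style schedule from the quoted setup:
    s(t) = 0.1 * g(t)^2 + 0.9 * s(t-1)
    rho(t) = rho0 * t^(-1/2 + eps) / (1 + sqrt(s(t)))
    Returns (rho, s) so the caller can carry s(t) forward."""
    s = 0.1 * g * g + 0.9 * s_prev
    rho = rho0 * t ** (-0.5 + eps) / (1.0 + math.sqrt(s))
    return rho, s

def rho0_with_decay(t, rho0=0.02, factor=0.9, every=2000):
    # rho0 is additionally decreased by a factor of 0.9 every 2000 iterations.
    return rho0 * factor ** (t // every)

def alpha_schedule(t, scale=1.0):
    # alpha(t) = scale * (1 + t)^(-0.9): scale = 1.0 for Algorithm 1,
    # scale = 0.01 for Algorithm 2 (values from the quoted setup).
    return scale * (1.0 + t) ** (-0.9)
```

The 10^−16 offset in the exponent is numerically negligible (t^(−1/2+10^−16) evaluates to t^(−1/2) in double precision); it is the standard device for satisfying the Robbins-Monro conditions on paper.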