Streaming Inference for Infinite Feature Models
Authors: Rylan Schaeffer, Yilun Du, Gabrielle K Liu, Ila Fiete
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate on both synthetic and real (tabular & non-tabular) data that R-IBP matches or exceeds the performance of five streaming and non-streaming baseline inference algorithms in less time. |
| Researcher Affiliation | Academia | ¹Computer Science, Stanford University; ²Brain and Cognitive Sciences, MIT; ³Electrical Engineering and Computer Science, MIT; ⁴McGovern Institute for Brain Research, MIT. |
| Pseudocode | No | The paper includes detailed mathematical derivations and closed-form solutions in its appendices, such as 'Variational Parameter Updates', but does not present any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions acquiring a VAE from publicly available code at 'https://github.com/jmtomczak/vae_vampprior' for one of its experiments. However, it does not state that code for the R-IBP method itself is open source, nor does it provide a link to such code. |
| Open Datasets | Yes | 'We next tested how well R-IBP performs on real data, following the example set by (Paisley & Carin, 2009)': the paper takes the odd digits from MNIST (LeCun et al., 1998), uses 'two datasets from the UCI Machine Learning Repository (Dua & Graff, 2017): gene expression of cancer patients (801 samples, 20k features), and diabetic patient profiles (100k samples, 55 features) (Strack et al., 2014)', and uses 'Omniglot handwritten character images (Lake et al., 2015)'. A loading sketch for the odd-digit MNIST subset follows the table. |
| Dataset Splits | No | The paper mentions metrics such as negative log posterior predictive probability and sweeping hyperparameters for each algorithm. However, it does not explicitly state the training, validation, or test splits (e.g., percentages, sample counts, or citations to predefined splits) needed to reproduce the experiments. |
| Hardware Specification | No | The paper describes various experiments conducted on synthetic, MNIST, UCI tabular, and Omniglot datasets. However, it does not provide any specific hardware details such as CPU models, GPU models, or memory specifications used for running these experiments. |
| Software Dependencies | No | The paper mentions using 'Pyro (Bingham et al., 2019)' for Hamiltonian Monte Carlo-Gibbs Sampling and 'scikit-learn (Pedregosa et al., 2011)' for Finite Factor Analysis. It also refers to a VAE from (Tomczak & Welling, 2018)'s publicly available code. However, specific version numbers for these software dependencies are not provided. |
| Experiment Setup | No | The paper states: 'Because the hyperparameters α, β, σA, σo are unknown, we swept these for each algorithm.' While this indicates hyperparameter tuning was performed, the paper does not report the specific hyperparameter values, training configurations, or system-level settings needed for reproducibility. A hypothetical sweep sketch follows the table. |
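
For reference, here is a minimal sketch of how the odd-digit MNIST subset quoted in the Open Datasets row might be constructed. The paper does not provide loading code; the use of torchvision, the flattening to 784-dimensional vectors, and the [0, 1] rescaling are assumptions, not details from the paper.

```python
# Build the odd-digit MNIST subset described in the quoted setup.
# torchvision and the preprocessing choices here are assumptions.
import numpy as np
from torchvision.datasets import MNIST

mnist = MNIST(root="./data", train=True, download=True)
images = mnist.data.numpy()     # uint8 array, shape (60000, 28, 28)
labels = mnist.targets.numpy()  # int array, shape (60000,)

odd_mask = labels % 2 == 1      # keep digits 1, 3, 5, 7, 9
odd_images = images[odd_mask].reshape(-1, 28 * 28) / 255.0

print(odd_images.shape)         # (n_odd, 784)
```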
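
Since the paper reports sweeping α, β, σA, σo without giving values, the following is a hypothetical sketch of such a sweep. The grid values, the `run_inference` placeholder, and the convention that lower scores are better (e.g., negative log posterior predictive probability) are all assumptions, not the paper's setup.

```python
# Hypothetical grid sweep over the hyperparameters named in the paper.
# None of the values below come from the paper.
import random
from itertools import product

def run_inference(data, alpha, beta, sigma_A, sigma_o):
    # Placeholder: a real run would fit one of the inference algorithms
    # and return its evaluation metric (lower is better). Here we return
    # a dummy score so the sketch is runnable.
    return random.random()

data = None  # stand-in for one of the datasets above

grid = {
    "alpha":   [0.5, 1.0, 2.0, 5.0],
    "beta":    [0.5, 1.0, 2.0],
    "sigma_A": [0.1, 0.5, 1.0],
    "sigma_o": [0.1, 0.5, 1.0],
}

best_score, best_cfg = float("inf"), None
for values in product(*grid.values()):
    cfg = dict(zip(grid.keys(), values))
    score = run_inference(data, **cfg)
    if score < best_score:
        best_score, best_cfg = score, cfg

print(best_cfg, best_score)
```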