Variational Inference for Infinitely Deep Neural Networks

Authors: Achille Nazaret, David Blei

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We study the UDN on real and synthetic data. We find that (i) on synthetic data, the UDN achieves higher accuracy than finite neural networks of similar architecture; (ii) on real data, the UDN outperforms finite neural networks and other models of infinite neural networks; (iii) for both types of data, the inference adapts the UDN posterior to the data complexity by exploring distinct sets of truncations.
Researcher Affiliation | Academia | Department of Computer Science, Columbia University, New York, USA; Department of Statistics, Columbia University, New York, USA.
Pseudocode | Yes | Algorithm 1: Dynamic variational inference for the UDN (a hedged training-loop sketch in this spirit appears after the table).
Open Source Code | Yes | The code is available on GitHub.
Open Datasets | Yes | We study the performance of the UDN on the CIFAR-10 dataset (Krizhevsky et al., 2009). We run additional experiments on tabular data, performing regression with the UDN on nine datasets from the UCI repository (Dua & Graff, 2017): Boston Housing (boston), Concrete Strength (concrete), Energy Efficiency (energy), Kin8nm (kin8nm), Naval Propulsion (naval), Power Plant (power), Protein Structure (protein), Wine Quality (wine), and Yacht Hydrodynamics (yacht).
Dataset Splits | Yes | For each ω, we independently sample a train, a validation, and a test dataset, each of 1024 samples.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or cloud-computing instance types used for running the experiments.
Software Dependencies | Yes | In libraries like TensorFlow 1.0 (Abadi et al., 2015), the computational graph is defined and compiled in advance. In contrast, a library like PyTorch (Paszke et al., 2019) uses a dynamic graph (see the sketch after the table).
Experiment Setup | Yes | For each ω, we generate a dataset D(ω) on which we train the models for 4000 epochs. First configuration: prior on the neural-network weights θ ∼ N(0, 1); prior on the truncation ℓ ∼ 1 + Poisson(0.5); optimizer Adam (Kingma & Ba, 2015); learning rate 0.005; learning rate for λ set to 1/10th of the learning rate of the neural-network weights; initialization of the variational truncated Poisson family λ = 1.0; 4000 epochs. Second configuration: optimizer SGD with momentum 0.9 and weight decay 1e-4; 500 epochs; learning-rate schedule [0.01]*5 + [0.1]*195 + [0.01]*100 + [0.001]*100; the same learning rate for λ as for the weights; initialization of the variational truncated Poisson family λ0 = 5.0. (A hedged optimizer and schedule sketch appears after the table.)
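
The pseudocode and software-dependency rows point at the same mechanism: Algorithm 1 grows the network during training, which is convenient precisely because PyTorch builds its computation graph dynamically. The snippet below is a minimal sketch of that idea under stated assumptions, not the authors' released implementation; the names (GrowingMLP, grow_to, lam, active_depth), the heuristic for how many layers to keep active, and the softmax-normalized Poisson weighting of per-depth losses are illustrative choices.

```python
# Sketch only: a network that can grow between optimization steps,
# in the spirit of Algorithm 1 (dynamic variational inference for the UDN).
# GrowingMLP, grow_to, lam, and active_depth are assumed names, not the paper's API.
import torch
import torch.nn as nn


class GrowingMLP(nn.Module):
    """An MLP whose depth can be extended at any point during training."""

    def __init__(self, dim_in, dim_hidden, dim_out):
        super().__init__()
        self.input = nn.Linear(dim_in, dim_hidden)
        self.hidden = nn.ModuleList()    # hidden layers, grows over time
        self.outputs = nn.ModuleList()   # one output head per truncation depth
        self.dim_hidden, self.dim_out = dim_hidden, dim_out
        self.grow_to(1)                  # start with a single hidden layer

    def grow_to(self, depth):
        # Append layers (and matching output heads) until `depth` layers exist.
        while len(self.hidden) < depth:
            self.hidden.append(nn.Linear(self.dim_hidden, self.dim_hidden))
            self.outputs.append(nn.Linear(self.dim_hidden, self.dim_out))

    def forward(self, x, depth):
        # Only the first `depth` layers enter the computation graph.
        h = torch.relu(self.input(x))
        logits = []
        for layer, head in zip(self.hidden[:depth], self.outputs[:depth]):
            h = torch.relu(layer(h))
            logits.append(head(h))       # prediction from each truncation
        return logits


# Toy usage with a variational Poisson parameter for the truncation depth.
model = GrowingMLP(dim_in=2, dim_hidden=16, dim_out=2)
lam = torch.tensor(1.0, requires_grad=True)       # truncated-Poisson parameter
x, y = torch.randn(32, 2), torch.randint(0, 2, (32,))

# Heuristic (an assumption): keep a few layers beyond the current value of λ.
active_depth = max(1, int(lam.item()) + 2)
model.grow_to(active_depth)                        # new layers added on the fly
losses = [nn.functional.cross_entropy(l, y) for l in model(x, depth=active_depth)]

# Weight the per-depth losses by a normalized Poisson q(ℓ; λ) over the active
# depths, as a stand-in for the ELBO's expectation over truncations.
depths = torch.arange(1, active_depth + 1, dtype=torch.float32)
log_q = depths * torch.log(lam) - lam - torch.lgamma(depths + 1)
q = torch.softmax(log_q, dim=0)
loss = (q * torch.stack(losses)).sum()
loss.backward()                                    # gradients reach both the weights and λ
```

Because new nn.Linear modules can be appended between steps, no graph needs to be recompiled when the variational posterior over truncations shifts toward deeper networks, which is the property the software-dependency row highlights.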
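
The experiment-setup row lists two hyperparameter configurations. The sketch below shows one way they might be wired up in PyTorch; the parameter grouping, the reuse of `model` from the previous sketch, the application of weight decay to λ, and the reuse of the last schedule value beyond the 400 epochs it covers are assumptions, not the authors' code.

```python
# Hedged sketch of the reported optimizer settings; `model` is the GrowingMLP
# from the previous sketch and the exact parameter grouping is an assumption.
import torch

# Configuration 1 (as reported): Adam, lr 0.005 for the weights and 1/10th of
# that for λ, with λ initialized to 1.0.
lam = torch.tensor(1.0, requires_grad=True)
opt_config1 = torch.optim.Adam(
    [
        {"params": model.parameters(), "lr": 0.005},
        {"params": [lam], "lr": 0.005 / 10},
    ]
)

# Configuration 2 (as reported): SGD with momentum 0.9 and weight decay 1e-4,
# 500 epochs, λ initialized to 5.0 and trained with the same learning rate as
# the weights. Whether weight decay also applies to λ is an assumption here.
lam0 = torch.tensor(5.0, requires_grad=True)
opt_config2 = torch.optim.SGD(
    list(model.parameters()) + [lam0], lr=1.0, momentum=0.9, weight_decay=1e-4
)

# The reported schedule [0.01]*5 + [0.1]*195 + [0.01]*100 + [0.001]*100 covers
# 400 epochs; reusing its last value for the remaining epochs is an assumption.
schedule = [0.01] * 5 + [0.1] * 195 + [0.01] * 100 + [0.001] * 100
scheduler = torch.optim.lr_scheduler.LambdaLR(
    opt_config2, lr_lambda=lambda epoch: schedule[min(epoch, len(schedule) - 1)]
)
```

In this setup one would call opt_config2.step() every batch and scheduler.step() once per epoch, so the absolute learning rate tracks the listed schedule (the base lr of 1.0 is only a multiplier for LambdaLR).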