Gaussian Gated Linear Networks

Authors: David Budden, Adam Marblestone, Eren Sezener, Tor Lattimore, Gregory Wayne, Joel Veness

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conclude by providing a comprehensive set of experimental results demonstrating the impressive performance of the G-GLN algorithm across a diverse set of regression benchmarks and practical applications including contextual bandits and image denoising." "Table 1: Test RMSE and standard errors for G-GLN versus three previously published methods on a standard suite of UCI regression benchmarks..."
Researcher Affiliation | Industry | "David Budden, Adam H. Marblestone, Eren Sezener, Tor Lattimore, Greg Wayne, Joel Veness (DeepMind), aixi@google.com." All authors are employees of DeepMind.
Pseudocode | Yes | "Algorithm 1 G-GLN: inference with optional update" (a hedged sketch of the neuron-level computation appears after this table).
Open Source Code | Yes | "Open source GLN implementations (including G-GLN) are available at: www.github.com/deepmind/deepmind-research."
Open Datasets | Yes | "We applied G-GLNs to univariate regression... We adopt the same datasets and training setup described in [32], and compare G-GLN performance to the previously published results for 3 MLP-based probabilistic methods... Our results are presented in Table 1." Table 1 lists UCI regression benchmarks such as Boston Housing and Concrete Compression Strength; Section 6.2 mentions the "SARCOS dataset for a 7 degree-of-freedom robotic arm [35]"; Section 6.4 mentions "MNIST train images". These are all standard, publicly available datasets.
Dataset Splits | Yes | "We adopt the same datasets and training setup described in [32], and compare G-GLN performance to the previously published results for 3 MLP-based probabilistic methods: variational inference (VI) [30], probabilistic backpropagation (PBP) [31] and the interpretation of dropout (DO) as Bayesian approximation as described in [32]." By adopting the same training setup as a cited paper for standard benchmarks, the paper implies using that paper's specified splits (a hedged reconstruction of that split protocol appears after this table).
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments.
Software Dependencies | No | "All models implemented using JAX [46] and the DeepMind JAX Ecosystem [47, 48, 49, 50]." While software frameworks are mentioned, specific version numbers for JAX or the components of its ecosystem (e.g., Chex, Haiku, Optax, RLax) used in the experiments are not provided.
Experiment Setup | Yes | "Weights for all neurons in layer i are initialized to 1/K_{i-1}, where K_{i-1} is the number of neurons in the previous layer." (Section 6, Training Setup). "Models are trained for 40 epochs and results summarized for 20 random seeds (5 for Protein)." (Section 6.1). "G-GLNs are trained for 2000 epochs using the same test procedure as [36]." (Section 6.2). A sketch of the quoted initialization appears after this table.
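
To make the pseudocode row concrete, here is a minimal sketch of the core G-GLN computation: each neuron selects a weight vector via halfspace gating on side information and combines its input Gaussians as a weighted product of Gaussian densities. The function names, tensor shapes, and the context encoding in `halfspace_context` below are illustrative assumptions, not the authors' released API; Algorithm 1 additionally performs a local gradient update of the selected weights on the Gaussian log-loss, which is omitted here.

```python
# Illustrative sketch of one G-GLN neuron (names and shapes are assumptions).
import jax.numpy as jnp

def halfspace_context(side_info, hyperplanes, biases):
    """Encode side information as a context id via signed halfspace tests."""
    bits = (side_info @ hyperplanes.T > biases).astype(jnp.int32)  # shape (H,)
    return jnp.sum(bits * (2 ** jnp.arange(bits.shape[-1])))       # id in [0, 2^H)

def ggln_neuron(mu_in, var_in, side_info, weights, hyperplanes, biases):
    """Weighted product of input Gaussians, gated by side information.

    mu_in, var_in: means/variances of the K input Gaussians, shape (K,).
    weights: one weight vector per context, shape (2^H, K).
    """
    c = halfspace_context(side_info, hyperplanes, biases)
    w = weights[c]                            # context-selected exponents
    precision = jnp.sum(w / var_in)           # product of Gaussians: precisions add
    var_out = 1.0 / precision
    mu_out = var_out * jnp.sum(w * mu_in / var_in)
    return mu_out, var_out
```

The closed form holds because a product of Gaussian densities raised to nonnegative exponents is itself proportional to a Gaussian whose precision is the weighted sum of the input precisions.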
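Since the dataset splits are only implied via the protocol of [32], the following is a hedged reconstruction of that style of evaluation: repeated random train/test splits of each UCI dataset, one per seed. The 90/10 ratio and the helper name `random_split` are assumptions drawn from the cited protocol, not details stated in this paper.

```python
# Hedged reconstruction of the repeated random-split protocol (ratio assumed).
import numpy as np

def random_split(X, y, seed, test_fraction=0.1):
    """One random train/test split; repeat over seeds (e.g., 20, or 5 for Protein)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]
```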
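The quoted weight initialization is simple enough to state in code. The sketch below assumes a per-layer weight tensor laid out as (neurons, contexts, inputs); the layout and function name are illustrative, but the constant value 1/K_{i-1} comes directly from the quoted setup.

```python
# All weights in layer i start at 1 / K_{i-1} (the previous layer's width).
import jax.numpy as jnp

def init_layer_weights(k_curr, num_contexts, k_prev):
    """Constant initializer; the (neurons, contexts, inputs) layout is assumed."""
    return jnp.full((k_curr, num_contexts, k_prev), 1.0 / k_prev)
```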