Learning Universal Predictors

Authors: Jordi Grau-Moya, Tim Genewein, Marcus Hutter, Laurent Orseau, Gregoire Deletang, Elliot Catt, Anian Ruoss, Li Kevin Wenliang, Christopher Mattern, Matthew Aitchison, Joel Veness

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct comprehensive experiments with neural architectures (e.g. LSTMs, Transformers) and algorithmic data generators of varying complexity and universality. Our results suggest that UTM data is a valuable resource for meta-learning, and that it can be used to train neural networks capable of learning universal prediction strategies.
Researcher Affiliation | Industry | Google DeepMind, London, UK.
Pseudocode | Yes | Algorithm 1: returns the number of repetitions in a sequence for a given delay between symbols (a runnable usage sketch follows the table).
    def repeating_count(output, delay):
        count = 0  # number of equal elements
        for i in range(delay + 1, len(output)):
            if output[i] == output[i - delay]:
                count += 1
        return count
Open Source Code | Yes | 4) We open-sourced all our generators at https://github.com/google-deepmind/neural_networks_solomonoff_induction.
Open Datasets | No | The paper describes generating data from Universal Turing Machines (UTMs), Variable-order Markov Sources (VOMS), and Chomsky Hierarchy (CH) tasks. While these generators are described, there are no links, DOIs, or citations to publicly available datasets for direct download in the way standard datasets are typically provided; the data is generated by the authors' own systems.
Dataset Splits | No | The paper mentions 'batch size 128, sequence length 256' and evaluation 'on 6k sequences of length 256, which we refer as in-distribution... and of length 1024, referred as out-of-distribution'. It does not explicitly state train/validation/test splits as percentages or counts, nor how validation was handled during training beyond monitoring the loss.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models (e.g., NVIDIA A100), CPU models, or cloud computing instance types used for the experiments. It only mentions 'memory-based meta-learning' and 'neural architectures (e.g. LSTMs, Transformers)', which are modeling concepts rather than hardware.
Software Dependencies | No | The paper mentions using the 'ADAM optimizer (Kingma & Ba, 2014)', 'LSTMs (Hochreiter & Schmidhuber, 1997), and Transformers (Vaswani et al., 2017)', but does not specify versions for any libraries, frameworks (such as PyTorch or TensorFlow), or Python itself.
Experiment Setup | Yes | We train for 500K iterations with batch size 128, sequence length 256, and learning rate 10^-4. (A configuration sketch follows the table.)
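
For reference, here is a minimal usage sketch of the repeating_count pseudocode quoted in the Pseudocode row. The function body is reproduced verbatim from the quote above; the example sequence and the expected count are our own illustration, not taken from the paper.

    # Verbatim copy of the pseudocode quoted in the Pseudocode row.
    def repeating_count(output, delay):
        count = 0  # number of equal elements
        for i in range(delay + 1, len(output)):
            if output[i] == output[i - delay]:
                count += 1
        return count

    # Illustrative input: a period-2 sequence checked with delay=2.
    # Comparisons happen at i = 3, 4, 5 and all of them match, so the count is 3.
    assert repeating_count([1, 2, 1, 2, 1, 2], delay=2) == 3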
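
The Experiment Setup, Software Dependencies, and Dataset Splits rows together report the headline training hyperparameters (500K iterations, batch size 128, sequence length 256, learning rate 10^-4, Adam optimizer) and the evaluation protocol (6k sequences of length 256 in-distribution, length 1024 out-of-distribution). The sketch below merely collects these reported values into a configuration object; the class and field names are our own illustration and do not come from the authors' code.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class TrainConfig:
        # Values as reported in the paper; names are illustrative, not the authors'.
        num_iterations: int = 500_000         # "500K iterations"
        batch_size: int = 128
        sequence_length: int = 256
        learning_rate: float = 1e-4           # used with the Adam optimizer (Kingma & Ba, 2014)
        eval_num_sequences: int = 6_000       # in-distribution: 6k sequences of length 256
        eval_ood_sequence_length: int = 1024  # out-of-distribution evaluation length

    config = TrainConfig()
    print(config)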