Hypermodels for Exploration
Authors: Vikranth Dwaracherla, Xiuyuan Lu, Morteza Ibrahimi, Ian Osband, Zheng Wen, Benjamin Van Roy
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study the use of hypermodels to represent epistemic uncertainty and guide exploration. We show that alternative hypermodels can enjoy dramatic efficiency gains, enabling behavior that would otherwise require hundreds or thousands of ensemble elements, and even succeed in situations where ensemble methods fail to learn regardless of size. Our simulation results show that a diagonal linear hypermodel requires about 50 to 100 times less computation than an ensemble hypermodel to achieve our target level of performance (see the hypermodel sketch below the table). In our simulations, we found that training without data perturbation gives lower regret for both agents. Figure 4 plots regret realized by TS and variance-IDS using the aforementioned hypermodel, trained with perturbed SGD. |
| Researcher Affiliation | Industry | DeepMind |
| Pseudocode | No | The paper describes algorithms mathematically and in text but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper describes generating its own data for bandit problems (e.g., 'We generate data using a neural network...'). It does not provide access information (link, citation, or repository) for a publicly available or open dataset. |
| Dataset Splits | No | The paper does not explicitly provide details about train/validation/test splits, percentages, or sample counts needed to reproduce the experiments. It mentions a 'time horizon to 10,000 periods' for bandits but not formal dataset splits. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory details) used to run its experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for software components or libraries used in the experiments. |
| Experiment Setup | Yes | Hypermodel parameters are updated according to $\nu \leftarrow \nu - \alpha \nabla_\nu \mathcal{L}(\nu, D, Z) / \lvert D \rvert$, where $\alpha$, $\sigma_w^2$, and $\sigma_p^2$ are algorithm hyperparameters. In our experiments, we take the step size $\alpha$ to be constant over iterations. We fix the data batch size to 1024 for both agents. (A hedged sketch of this update appears below the table.) |
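
The diagonal linear hypermodel referenced in the Research Type row maps a random index to base-model parameters. The sketch below is a minimal illustration, assuming the linear form $\theta = a + Bz$ with $z \sim N(0, I)$ and $B$ restricted to a diagonal matrix; the names and dimensions (`param_dim`, `sample_params`) are illustrative, not taken from the paper.

```python
import numpy as np

# Minimal sketch of a diagonal linear hypermodel: theta = a + diag(b) z,
# with index z ~ N(0, I). In the diagonal variant the index has the same
# dimension as the base-model parameter vector. Sizes are illustrative.

rng = np.random.default_rng(0)
param_dim = 50                      # base-model parameter count (assumption)

a = np.zeros(param_dim)             # learned offset
b = 0.1 * np.ones(param_dim)        # learned diagonal scale

def sample_params(n):
    """Map n random indices to n plausible base-model parameter vectors."""
    z = rng.standard_normal((n, param_dim))
    return a + b * z                # elementwise: diag(b) @ z per sample

# For Thompson sampling, one index drawn per period induces one sampled
# model, which the agent then acts greedily with respect to.
theta = sample_params(1)[0]
```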
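The Experiment Setup row gives the update $\nu \leftarrow \nu - \alpha \nabla_\nu \mathcal{L}(\nu, D, Z) / \lvert D \rvert$. The following sketch shows one such perturbed-SGD step for the diagonal hypermodel above with a linear base model $f_\theta(x) = \theta \cdot x$. The perturbed squared-error loss with a prior regularizer is an illustrative choice consistent with the hyperparameters $\alpha$, $\sigma_w^2$, $\sigma_p^2$, and the batch size of 1024; the paper's exact loss form may differ.

```python
import numpy as np

# Hedged sketch of one step of
#     nu <- nu - alpha * grad_nu L(nu, D, Z) / |D|
# for hypermodel parameters nu = (a, b) and base model f_theta(x) = theta . x.
# Assumed loss per batch: sum_i (y_i + sigma_w * w_i - theta . x_i)^2 / (2 sigma_w^2)
# plus a quadratic prior regularizer, all divided by the batch size.

rng = np.random.default_rng(1)
param_dim = 50
alpha, sigma_w, sigma_p = 1e-3, 0.1, 1.0   # constant step size and noise/prior scales
batch_size = 1024                          # data batch size fixed in the paper

a = np.zeros(param_dim)                    # hypermodel offset
b = np.ones(param_dim)                     # hypermodel diagonal scale
a0, b0 = a.copy(), b.copy()                # prior values for regularization

def sgd_step(a, b, X, y, w):
    """One perturbed-SGD step on a batch (X, y) with per-example perturbations w."""
    z = rng.standard_normal(param_dim)     # fresh index sample each step (assumption)
    theta = a + b * z                      # sampled base-model parameters
    resid = (y + sigma_w * w) - X @ theta  # perturbed residuals
    # Gradient of the loss w.r.t. theta, then chain rule through theta = a + b*z
    g_theta = -X.T @ resid / (sigma_w**2 * len(y))
    g_a = g_theta + (a - a0) / (sigma_p**2 * len(y))
    g_b = g_theta * z + (b - b0) / (sigma_p**2 * len(y))
    return a - alpha * g_a, b - alpha * g_b

# Toy usage on synthetic data
X = rng.standard_normal((batch_size, param_dim))
y = X @ rng.standard_normal(param_dim)
w = rng.standard_normal(batch_size)        # per-example noise perturbation
a, b = sgd_step(a, b, X, y, w)
```

Without the perturbations (`w = 0`), this reduces to plain SGD on the hypermodel loss, which corresponds to the "training without data perturbation" setting the Research Type row reports as giving lower regret for both agents.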