reproducibilityindex.ai

Tractable Optimality in Episodic Latent MABs

Authors: Jeongyeol Kwon, Yonathan Efroni, Constantine Caramanis, Shie Mannor

NeurIPS 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In our experiments, this significantly outperforms the worst-case guarantees, as well as existing practical methods. Figure 1: Per time-step rewards for increasing lengths of episodes with history-dependent policies returned after the exploration phase.
Researcher Affiliation	Collaboration	Jeongyeol Kwon (University of Wisconsin-Madison, jeongyeol.kwon@wisc.edu); Yonathan Efroni (Meta, New York, jonathan.efroni@gmail.com); Constantine Caramanis (The University of Texas at Austin, constantine@utexas.edu); Shie Mannor (Technion, NVIDIA; shie@ee.technion.ac.il, smannor@nvidia.com)
Pseudocode	Yes	Algorithm 1
Open Source Code	Yes	Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]
Open Datasets	No	The paper describes a theoretical framework and experiments which appear to be conducted in a simulated environment based on Gaussian reward distributions, but it does not specify a named, publicly available dataset with concrete access information (e.g., link, DOI, or citation) used for training.
Dataset Splits	No	The paper states training details are in the supplementary materials, but the main text does not specify dataset splits (e.g., percentages or sample counts for training, validation, or testing).
Hardware Specification	No	Did you include the amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [N/A]
Software Dependencies	No	The paper does not provide specific software names with version numbers for reproducibility in the main text.
Experiment Setup	No	Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] : in Supplementary Materials. The main text does not contain specific hyperparameters or training configurations.