Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient

Authors: Botao Hao, Yaqi Duan, Tor Lattimore, Csaba Szepesvári, Mengdi Wang

ICML 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We conducted experiments with a mountain car example. |
| Researcher Affiliation | Collaboration | DeepMind; Princeton University; University of Alberta. |
| Pseudocode | Yes | The pseudocode is given as Algorithm 1. In the last step, m = N samples are used to produce the final output, so that the error introduced by the Monte-Carlo averaging is negligible compared to the rest. (A fold-splitting sketch follows the table.) |
| Open Source Code | No | The paper provides no explicit statement or link indicating that source code for the methodology is openly available. |
| Open Datasets | No | We conducted experiments with a mountain car example. We use 800 radial basis functions for linear value function approximation. The number of episodes collected by behavior policies ranges from 2 to 100. |
| Dataset Splits | No | The paper states that the dataset D is split into T nonoverlapping folds D_1, ..., D_T for the algorithm, but it does not specify standard training/validation/test splits with explicit percentages or sample counts. |
| Hardware Specification | No | The paper does not report the specific hardware (e.g., exact GPU/CPU models, memory amounts) used to run its experiments. |
| Software Dependencies | No | The paper does not name ancillary software with version numbers (e.g., libraries or solvers) needed to replicate the experiments. |
| Experiment Setup | Yes | For each algorithm, performance is reported for the best regularization parameter λ in the range {0.02, 0.05, 0.1, 0.2, 0.5}. (A featurization and grid-search sketch follows the table.) |