Option Discovery in the Absence of Rewards with Manifold Analysis

Authors: Amitay Bar, Ronen Talmon, Ron Meir

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In addition, we showcase its performance in several domains, demonstrating clear improvements compared to competing methods. ... We empirically demonstrate that the learning performance obtained by our options outperforms competing options on three small-scale domains."
Researcher Affiliation | Academia | "Viterbi Faculty of Electrical Engineering, Technion - Israel Institute of Technology."
Pseudocode | Yes | "Algorithm 1 Diffusion Options"
Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | "We focus on three domains: a Ring domain, which is the 2D manifold of the placement of a 2-joint robotic arm (Verma, 2008), a Maze domain (Wu et al., 2019), and a 4Rooms domain (Sutton et al., 1999)."
Dataset Splits | No | The paper describes the experimental setup for Q-learning (e.g., episodes, steps, alpha, gamma) but does not specify dataset splits (e.g., train/validation/test percentages or counts) for the environments used.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions implementing Q-learning but does not list any specific software libraries, frameworks, or solvers with version numbers.
Experiment Setup | Yes | "We implement Q learning (Watkins & Dayan, 1992) with α = 0.1 and γ = 0.9 for 400 episodes, containing 100 steps each. ... The main hyperparameter of the algorithm is t. In our implementation, we set t = 4."