Efficient Reinforcement Learning with Hierarchies of Machines by Leveraging Internal Transitions
Authors: Aijun Bai, Stuart Russell
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on the benchmark Taxi domain [Dietterich, 1999] and a much more complex RoboCup Keepaway domain [Stone et al., 2005]. |
| Researcher Affiliation | Academia | Aijun Bai, UC Berkeley, aijunbai@berkeley.edu; Stuart Russell, UC Berkeley, russell@cs.berkeley.edu |
| Pseudocode | Yes | Algorithm 1 gives the pseudo-code for running a HAM, where the Execute function executes an action in the environment and returns the next environment state, and the Choose function picks the next machine state given the updated stack z, the current environment state s... and Algorithm 3 gives the pseudo-code of the HAMQ-INT algorithm. (A sketch of this execution loop follows the table.) |
| Open Source Code | No | The paper does not provide any explicit statements about the release of source code or links to a code repository for the described methodology. |
| Open Datasets | Yes | We conduct experiments on the benchmark Taxi domain [Dietterich, 1999] and a much more complex RoboCup Keepaway domain [Stone et al., 2005]. |
| Dataset Splits | No | The paper mentions general learning parameters like learning rate and exploration policy, but does not specify dataset splits (e.g., train/validation/test percentages or sample counts) or cross-validation details. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as exact GPU/CPU models, memory amounts, or cloud instance types used for running experiments. |
| Software Dependencies | No | The paper mentions using the "SARSA learning rule with a linear function approximator" and refers to "ALisp", but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For all learning algorithms, the learning rate is set to be 0.125; an ϵ-greedy policy which selects a random action with probability 0.01 is used to balance between exploration and exploitation. (These reported hyperparameters are wired together in the second sketch below.) |
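
The Pseudocode row summarizes Algorithm 1, which runs a HAM by manipulating a run-time stack of machine states and calling Execute for primitive actions and Choose at choice points. Below is a minimal sketch of that execution loop. The `MachineState`/`Machine` types, the `env.step` stand-in for Execute, and the random `choose` are hypothetical reconstructions from the description above, not the authors' code; HAMQ-INT learns the choice with Q-learning rather than picking randomly.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import random

@dataclass
class MachineState:
    kind: str                                  # "action" | "call" | "choice" | "stop"
    action: Optional[object] = None            # primitive action (kind == "action")
    callee: Optional["Machine"] = None         # sub-machine to run (kind == "call")
    choices: List["MachineState"] = field(default_factory=list)  # kind == "choice"
    next_state: Optional["MachineState"] = None

@dataclass
class Machine:
    start: MachineState

def choose(z, s, choices):
    # Stand-in for the paper's Choose function: a random pick here, whereas
    # HAMQ learns a Q-function over (stack z, environment state s) choice points.
    return random.choice(choices)

def run_ham(env, root: Machine, s):
    """Run a HAM from its root machine; `env.step(a)` is assumed to play the
    role of the paper's Execute function and return the next environment state."""
    z = [root.start]                  # run-time stack of machine states
    while z:
        m = z[-1]
        if m.kind == "action":        # primitive action: act in the environment
            s = env.step(m.action)    # Execute(action) -> next environment state
            z[-1] = m.next_state
        elif m.kind == "call":        # advance the caller, push the callee's start
            z[-1] = m.next_state
            z.append(m.callee.start)
        elif m.kind == "choice":      # choice point resolved by Choose
            z[-1] = choose(z, s, m.choices)
        else:                         # "stop": return control to the caller
            z.pop()
    return s
```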
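The Experiment Setup and Software Dependencies rows quote a SARSA learning rule with a linear function approximator, a learning rate of 0.125, and an ϵ-greedy policy with ϵ = 0.01. The sketch below wires those reported hyperparameters together; the feature vectors, action indexing, and the discount factor are assumptions, since the paper's implementation is not released.

```python
import numpy as np

ALPHA = 0.125     # learning rate reported in the paper
EPSILON = 0.01    # exploration probability reported in the paper
GAMMA = 1.0       # assumed discount; not stated in the rows above

def q_value(w, phi):
    """Linear function approximation: Q(s, a) = w . phi(s, a)."""
    return float(np.dot(w, phi))

def epsilon_greedy(w, features_per_action, rng):
    """Pick a random action with probability EPSILON, else the greedy one.
    `features_per_action[a]` is the (hypothetical) feature vector phi(s, a)."""
    if rng.random() < EPSILON:
        return int(rng.integers(len(features_per_action)))
    qs = [q_value(w, phi) for phi in features_per_action]
    return int(np.argmax(qs))

def sarsa_update(w, phi, r, phi_next, done):
    """One on-policy SARSA step on the linear weights."""
    target = r if done else r + GAMMA * q_value(w, phi_next)
    td_error = target - q_value(w, phi)
    return w + ALPHA * td_error * phi
```

As a usage note, `rng = np.random.default_rng(0)` gives a reproducible generator for `epsilon_greedy`, and `sarsa_update` is applied once per transition with the feature vectors of the action actually taken, which is what makes the rule on-policy.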