Path Consistency Learning in Tsallis Entropy Regularized MDPs

Authors: Yinlam Chow, Ofir Nachum, Mohammad Ghavamzadeh

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically compare sparse PCL with its soft counterpart, and show its advantage, especially in problems with a large number of actions.
Researcher Affiliation | Industry | Google Brain; DeepMind.
Pseudocode | Yes | A pseudo-code of our sparse PCL algorithm can be found in Algorithm 1 in the Appendix A.
Open Source Code | No | The paper contains no explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We demonstrate the effectiveness of the sparse PCL algorithm by comparing its performance with that of the soft PCL algorithm on a number of RL environments available in the OpenAI Gym environment (Brockman et al., 2016).
Dataset Splits | No | The paper does not provide explicit training/validation/test splits (percentages or counts); it mentions training curves and Monte Carlo trials but no detailed data partitioning.
Hardware Specification | No | The paper gives no details about the hardware used to run the experiments, such as CPU or GPU models.
Software Dependencies | No | The paper mentions the OpenAI Gym environment and a recurrent neural network, but does not list specific software components with version numbers (e.g., Python, TensorFlow/PyTorch versions).
Experiment Setup | Yes | For each task and each PCL algorithm, we perform a hyper-parameter search to find the optimal regularization weight... The functions V, µ, λ, and π in the consistency equations are parameterized with a recurrent neural network with multiple heads... We discretize each continuous action with either one of the following grids: {−1, 0, 1} and {−1, −0.5, 0, 0.5, 1}.
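
Since the paper releases no code, the following is only a rough sketch of the action-discretization step quoted in the Experiment Setup row: each continuous action dimension is restricted to one of the listed grids, and the discrete action set is the Cartesian product over dimensions. It uses the standard Gym ActionWrapper interface; the environment id and the wrapper design are illustrative assumptions, not the authors' implementation.

```python
import itertools

import gym
import numpy as np


class DiscretizedActionWrapper(gym.ActionWrapper):
    """Expose a continuous-control Gym task through a finite action set."""

    def __init__(self, env, grid=(-1.0, 0.0, 1.0)):
        super().__init__(env)
        dim = env.action_space.shape[0]
        # One discrete action per combination of grid values across dimensions.
        self._actions = np.array(
            list(itertools.product(grid, repeat=dim)), dtype=np.float32)
        self.action_space = gym.spaces.Discrete(len(self._actions))

    def action(self, index):
        # Map a discrete action index back to its continuous action vector.
        return self._actions[index]


# Coarse and fine grids from the quoted setup; the task name is an assumption.
coarse = DiscretizedActionWrapper(gym.make("HalfCheetah-v2"),
                                  grid=(-1.0, 0.0, 1.0))
fine = DiscretizedActionWrapper(gym.make("HalfCheetah-v2"),
                                grid=(-1.0, -0.5, 0.0, 0.5, 1.0))
```

With the five-point grid, a task with d action dimensions yields 5^d discrete actions, which is the large-action-space regime the paper emphasizes when comparing sparse PCL with soft PCL.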