Maximum Causal Tsallis Entropy Imitation Learning
Authors: Kyungjae Lee, Sungjoon Choi, Songhwai Oh
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate the effectiveness of the proposed method, we conduct two simulation studies. In the first simulation study, we verify that MCTEIL with a sparse MDN can successfully learn multimodal behaviors from expert's demonstrations. The second simulation study is conducted using four continuous control problems in MuJoCo [10]. MCTEIL outperforms existing methods in terms of the average cumulative return. |
| Researcher Affiliation | Collaboration | Kyungjae Lee1, Sungjoon Choi2, and Songhwai Oh1 Dep. of Electrical and Computer Engineering and ASRI, Seoul National University1 Kakao Brain2 |
| Pseudocode | Yes | Algorithm 1 Maximum Causal Tsallis Entropy Imitation Learning |
| Open Source Code | No | The paper does not provide any specific links or explicit statements about the availability of the source code for the described methodology. |
| Open Datasets | Yes | The second simulation study is conducted using four continuous control problems in MuJoCo [10]. [10] E. Todorov, T. Erez, and Y. Tassa, MuJoCo: A physics engine for model-based control, in Proceedings of the International Conference on Intelligent Robots and Systems, October 2012, pp. 5026–5033. |
| Dataset Splits | No | The paper mentions generating demonstrations (e.g., '300 demonstrations from the expert’s policy', '50 demonstrations from the expert policy') and using varying numbers of demonstrations for training. However, it does not specify explicit training/validation/test splits with percentages, sample counts, or references to predefined splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'MuJoCo' as a physics engine, but it does not specify any software versions for libraries, frameworks, or programming languages (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | For tested methods, 500 episodes are sampled at each iteration. We first train the optimal policy using [3] and generate 300 demonstrations from the expert's policy. We run algorithms with varying numbers of demonstrations, 4, 11, 18, and 25, and all experiments have been repeated three times with different random seeds. For methods using an MDN, we select the best number of mixtures via a brute-force search. |
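The paper's maximum causal Tsallis entropy objective is known to induce a sparsemax-style policy distribution, in which low-scoring actions receive exactly zero probability (the property behind the "sparse MDN" mentioned above). As a point of orientation only, here is a minimal NumPy sketch of the standard sparsemax projection (Martins & Astudillo, 2016), not the authors' own implementation:

```python
import numpy as np

def sparsemax(z):
    """Project a score vector z onto the probability simplex,
    producing a sparse distribution (many exact zeros)."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]          # scores in descending order
    cssv = np.cumsum(z_sorted)           # cumulative sums of sorted scores
    k = np.arange(1, z.size + 1)
    # support: indices where the sorted score still clears the threshold
    support = 1 + k * z_sorted > cssv
    k_star = k[support][-1]              # size of the support set
    tau = (cssv[support][-1] - 1.0) / k_star
    return np.maximum(z - tau, 0.0)      # thresholded scores; sums to 1
```

For example, `sparsemax([2.0, 1.0, 0.1])` yields `[1., 0., 0.]`: the two low-scoring actions are assigned exactly zero probability, unlike softmax, which would keep all three strictly positive.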