reproducibilityindex.ai

Maximum Entropy Semi-Supervised Inverse Reinforcement Learning

Authors: Julien Audiffren, Michal Valko, Alessandro Lazaric, Mohammad Ghavamzadeh

IJCAI 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical results in highway driving and grid-world problems indicate that MESSI is able to take advantage of the unsupervised trajectories and improve the performance of MaxEnt-IRL.
Researcher Affiliation	Collaboration	CMLA UMR 8536, ENS Cachan; SequeL team, INRIA Lille; SequeL team, INRIA Lille; Adobe Research & INRIA Lille
Pseudocode	Yes	Algorithm 1: MESSI (MaxEnt SSIRL)
Open Source Code	No	The paper does not provide a direct link to the source code for the methodology described, nor does it state that the code is released or available in supplementary materials.
Open Datasets	No	The paper mentions using 'expert trajectories' and 'unsupervised trajectories' generated from distributions like Pμ1, Pμ2, Pμ3, but it does not specify concrete access information (e.g., a link, DOI, or formal citation for a publicly available dataset) for the data used in the experiments. It describes the characteristics of the generated data rather than providing access to a pre-existing dataset.
Dataset Splits	No	The paper refers to 'l expert trajectories' and 'u unsupervised trajectories' and discusses using these in the learning process, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or counts) typically used for reproducibility.
Hardware Specification	No	The paper does not provide any specific details about the hardware used to run the experiments (e.g., GPU/CPU models, memory).
Software Dependencies	No	The paper does not specify any software dependencies with version numbers.
Experiment Setup	Yes	Parameters. For each of the experiments, the default parameters are θmax = 500, λ0 = 0.05, the number of iterations of gradient descent is set to T = 100, one expert trajectory is provided (l = 1), and the number of unsupervised trajectories is set to u = 20 with ν = 0.5.
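
For anyone attempting a reimplementation, the default parameters quoted above could be collected in a single configuration object. The following is only a minimal sketch: the class name `MessiDefaults` and all field names are hypothetical, chosen here to mirror the paper's symbols, and the comments state only what the quoted text says about each value.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class MessiDefaults:
    """Default experiment parameters as quoted in the Experiment Setup row.

    Field names are hypothetical; only the values come from the quoted text.
    """
    theta_max: float = 500.0   # θmax
    lambda_0: float = 0.05     # λ0
    n_iterations: int = 100    # T, number of gradient descent iterations
    n_expert: int = 1          # l, number of expert trajectories
    n_unsupervised: int = 20   # u, number of unsupervised trajectories
    nu: float = 0.5            # ν


defaults = MessiDefaults()
print(asdict(defaults))
```

Freezing the dataclass keeps the defaults immutable; a variant for a single experiment can be derived with `dataclasses.replace(defaults, n_unsupervised=50)` without altering the baseline.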