Offline Imitation Learning with Variational Counterfactual Reasoning

Authors: Zexu Sun, Bowei He, Jinxin Liu, Xu Chen, Chen Ma, Shuai Zhang

NeurIPS 2023

Reproducibility assessment: each variable is listed with its result and the supporting LLM response.
Research Type: Experimental
    LLM Response: "Moreover, we conduct extensive experiments to demonstrate that our approach significantly outperforms various baselines on both DEEPMIND CONTROL SUITE benchmark for in-distribution performance and CAUSALWORLD benchmark for out-of-distribution generalization." (see also Section 5, Experiments)
Researcher Affiliation: Collaboration
    LLM Response: Gaoling School of Artificial Intelligence, Renmin University of China; Department of Computer Science, City University of Hong Kong; School of Engineering, Westlake University; DiDi Chuxing.
Pseudocode: Yes
    LLM Response: Algorithm 1, "Training procedure of OILCA."
Open Source Code: No
    LLM Response: The paper does not contain any explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets: Yes
    LLM Response: "We conduct extensive experiments to demonstrate that our approach significantly outperforms various baselines on both DEEPMIND CONTROL SUITE benchmark for in-distribution performance and CAUSALWORLD benchmark for out-of-distribution generalization."
Dataset Splits: No
    LLM Response: The paper defers data-collection details to Appendix E.1 (not included in the provided text) and states "We collect the offline data by using three different do-interventions on environment features (stage_color, stage_friction, floor_friction) to generate offline datasets", but it does not provide explicit training/validation/test splits with percentages, sample counts, or references to predefined splits. A minimal data-collection sketch appears after this table.
Hardware Specification: No
    LLM Response: The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or machine specifications) used to run its experiments.
Software Dependencies: No
    LLM Response: The paper does not list software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or other libraries and their versions) needed to replicate the experiments.
Experiment Setup: No
    LLM Response: The paper states that "the hyper-parameters of our method and baselines are all detail-tuned for better performance" and mentions hyperparameters η and α in Algorithm 1, but it provides no concrete values for these hyperparameters or for other training configurations (e.g., learning rate, batch size, number of epochs, optimizer settings). A hypothetical configuration template appears after this table.
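
For readers unfamiliar with the do-intervention data collection quoted under Dataset Splits, the following is a minimal sketch of how offline trajectories might be gathered in CausalWorld. It assumes the public causal_world package (generate_task, CausalWorld, do_intervention); the task id, episode count, behavior policy, and intervention values are illustrative assumptions, not details from the paper.

    # Minimal sketch: collect offline trajectories under a do-intervention
    # in CausalWorld. Import paths follow the public causal_world package;
    # the task id, intervention values, and random behavior policy are
    # illustrative assumptions, not details taken from the paper.
    import numpy as np
    from causal_world.envs.causalworld import CausalWorld
    from causal_world.task_generators.task import generate_task

    task = generate_task(task_generator_id="pushing")  # assumed task id
    env = CausalWorld(task=task, enable_visualization=False)

    trajectories = []
    for _ in range(10):  # illustrative episode count
        obs = env.reset()
        # Apply one do-intervention on an environment feature before the
        # rollout; the paper reports three such interventions (stage_color,
        # stage_friction, floor_friction) to build its offline datasets.
        env.do_intervention({"stage_color": np.random.uniform(0, 1, size=3)})
        episode, done = [], False
        while not done:
            action = env.action_space.sample()  # stand-in behavior policy
            next_obs, reward, done, info = env.step(action)
            episode.append((obs, action, reward, next_obs, done))
            obs = next_obs
        trajectories.append(episode)
    env.close()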
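
To make the missing Experiment Setup disclosure concrete, here is a hypothetical configuration template showing the kind of information a replication would need. Every field name and value below is a placeholder; the paper reports none of these numbers, and η and α are only named in Algorithm 1, never valued.

    # Hypothetical configuration template: the fields below are the kind of
    # detail the assessment finds missing. Every value is a placeholder,
    # NOT a number reported in the paper.
    from dataclasses import dataclass

    @dataclass
    class OILCAConfig:
        eta: float = 0.0             # η from Algorithm 1; value not reported
        alpha: float = 0.0           # α from Algorithm 1; value not reported
        learning_rate: float = 3e-4  # placeholder
        batch_size: int = 256        # placeholder
        num_epochs: int = 100        # placeholder
        optimizer: str = "adam"      # placeholder
        seed: int = 0                # placeholder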