Causal Action Influence Aware Counterfactual Data Augmentation

Authors: Núria Armengol Urpí, Marco Bagatella, Marin Vlastelica, Georg Martius

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate CAIAC in two goal-conditioned settings: offline RL and offline self-supervised skill learning.
Researcher Affiliation | Academia | (1) Department of Computer Science, ETH Zurich, Zurich, Switzerland; (2) Max Planck Institute for Intelligent Systems, Tübingen, Germany; (3) Department of Computer Science, University of Tübingen, Tübingen, Germany.
Pseudocode | Yes | Algorithm 1: CAIAC (a hedged sketch of the augmentation step follows the table).
Open Source Code | Yes | In order to ensure reproducibility of our results, we make our codebase publicly available at https://sites.google.com/view/caiac
Open Datasets | Yes | We make use of the data provided in the D4RL benchmark (Fu et al., 2020); a loading example follows the table.
Dataset Splits | Yes | All models were trained for 100k gradient steps, and tested to reach low MSE error for the predictions in the validation set (train-validation split of 0.9-0.1).
Hardware Specification | Yes | The algorithms were benchmarked on a 12-core Intel i7 CPU.
Software Dependencies | No | The paper mentions software components and frameworks like "LMP", "TD3", "TD3+BC", and "Adam optimizer", but it does not specify version numbers for these or for programming languages/libraries like Python, PyTorch, or TensorFlow.
Experiment Setup | Yes | For a fine-grained description of all hyperparameters, we refer to our codebase at https://sites.google.com/view/caiac. Specific details such as "We train each method for 1.2M gradient steps" (Appendix A.1.2) and "αBC = 2.5" (Appendix A.1.3) are also provided; the TD3+BC objective controlled by αBC is sketched after the table.
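
To give a concrete sense of what Algorithm 1 (CAIAC) does, here is a minimal Python sketch of causal-action-influence-aware counterfactual augmentation, based on our reading of the method rather than the authors' code. The per-entity dynamics model, the entity factorization (`entity_slices`), the influence threshold `theta`, and the helper names (`cai_score`, `counterfactual_augment`) are all illustrative assumptions; the moment-matched Gaussian approximation of the action marginal is a common simplification and not necessarily what the paper uses.

```python
import numpy as np

def cai_score(model, s, j, action_sampler, n_actions=32):
    """Rough estimate of the causal action influence on entity j: the expected KL
    between p(s'_j | s, a) and the action-marginalized p(s'_j | s). The marginal is
    approximated by moment matching over sampled actions (an assumed simplification)."""
    actions = [action_sampler() for _ in range(n_actions)]
    # Assumed model interface: returns (mean, std) of the Gaussian over entity j's next state.
    means, stds = zip(*[model(s, a, j) for a in actions])
    means, stds = np.stack(means), np.stack(stds)
    # Moment-matched Gaussian approximation of the marginal p(s'_j | s)
    mix_mean = means.mean(axis=0)
    mix_var = (stds ** 2 + means ** 2).mean(axis=0) - mix_mean ** 2
    # Elementwise Gaussian-to-Gaussian KL of each conditional to the marginal
    kl = 0.5 * (np.log(mix_var / stds ** 2)
                + (stds ** 2 + (means - mix_mean) ** 2) / mix_var - 1.0)
    return kl.sum(axis=-1).mean()

def counterfactual_augment(model, batch, entity_slices, theta, rng, action_sampler):
    """Swap the state of entities the agent does not influence (score < theta) with the
    corresponding entities from other transitions in the batch/dataset."""
    augmented = []
    for (s, a, s_next) in batch:
        s_cf, s_next_cf = s.copy(), s_next.copy()
        for j, sl in enumerate(entity_slices):
            # In practice the scores would be batched/cached; the loop keeps the sketch simple.
            if cai_score(model, s, j, action_sampler) < theta:
                donor_s, _, donor_s_next = batch[rng.integers(len(batch))]
                s_cf[sl], s_next_cf[sl] = donor_s[sl], donor_s_next[sl]
        augmented.append((s_cf, a, s_next_cf))
    return augmented
```

In words: entities whose predicted next state barely depends on the sampled action (low influence score) are treated as outside the agent's control, so their states can be swapped in from other transitions to create counterfactual training data without distorting the agent-relevant dynamics.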
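Since the experiments draw on D4RL, a minimal loading example is shown below. The environment name 'kitchen-mixed-v0' is only a plausible example from D4RL's Franka Kitchen family; the exact datasets used in the paper may differ.

```python
import gym
import d4rl  # registers the offline-RL environments on import

# Example only: 'kitchen-mixed-v0' is one of the D4RL Franka Kitchen datasets.
env = gym.make('kitchen-mixed-v0')
dataset = env.get_dataset()                    # raw dict: observations, actions, rewards, terminals
qlearning_data = d4rl.qlearning_dataset(env)   # (s, a, r, s', done) arrays convenient for offline RL
print(qlearning_data['observations'].shape)
```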
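Finally, the quoted αBC = 2.5 is the behavior-cloning weight of TD3+BC (Fujimoto & Gu, 2021). The snippet below sketches the actor objective it controls, assuming PyTorch-style actor/critic callables; it reflects the standard TD3+BC formulation, not code taken from the paper's codebase.

```python
import torch
import torch.nn.functional as F

def td3_bc_actor_loss(actor, critic, states, actions, alpha_bc=2.5):
    """TD3+BC actor objective: maximize lambda * Q(s, pi(s)) minus a behavior-cloning
    penalty, with lambda = alpha / mean|Q| normalizing the Q-term (Fujimoto & Gu, 2021)."""
    pi = actor(states)
    q = critic(states, pi)
    lam = alpha_bc / q.abs().mean().detach()
    return -(lam * q).mean() + F.mse_loss(pi, actions)
```

Larger αBC shifts the objective toward maximizing the learned Q-function, while smaller values pull the policy toward imitating the dataset actions.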