Causal Action Influence Aware Counterfactual Data Augmentation
Authors: Núria Armengol Urpí, Marco Bagatella, Marin Vlastelica, Georg Martius
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate CAIAC in two goal-conditioned settings: offline RL and offline self-supervised skill learning. |
| Researcher Affiliation | Academia | 1Department of Computer Science, ETH Zurich, Zurich, Switzerland 2Max Planck Institute for Intelligent Systems, Tübingen, Germany 3Department of Computer Science, University of Tübingen, Tübingen, Germany. |
| Pseudocode | Yes | Algorithm 1: CAIAC |
| Open Source Code | Yes | In order to ensure reproducibility of our results, we make our codebase publicly available at https://sites.google.com/view/caiac |
| Open Datasets | Yes | We make use of the data provided in the D4RL benchmark (Fu et al., 2020) |
| Dataset Splits | Yes | All models were trained for 100k gradient steps, and tested to reach low MSE error for the predictions in the validation set (train-validation split of 0.9-0.1). (A split sketch follows the table.) |
| Hardware Specification | Yes | The algorithms were benchmarked on a 12-core Intel i7 CPU. |
| Software Dependencies | No | The paper mentions software components such as "LMP", "TD3", "TD3+BC", and the "Adam optimizer", but it does not specify version numbers for these or for underlying languages and libraries such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | For a fine-grained description of all hyperparameters, we refer to our codebase at https://sites.google.com/view/caiac. Also, specific details like "We train each method for 1.2M gradient steps" (Appendix A.1.2) and "αBC = 2.5" (Appendix A.1.3) are provided. (A TD3+BC objective sketch follows the table.) |
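The Dataset Splits row quotes a 0.9-0.1 train-validation split over data drawn from the D4RL benchmark. A minimal sketch of how such a split could be produced, assuming a D4RL task id and random seed that are illustrative choices of ours, not taken from the paper:

```python
# Hedged sketch: loading a D4RL dataset and making the 0.9-0.1
# train-validation split quoted in the table above.
# The environment id and seed are illustrative assumptions.
import gym
import d4rl  # registers D4RL environments with gym
import numpy as np

env = gym.make("kitchen-mixed-v0")      # assumed task id, for illustration
data = d4rl.qlearning_dataset(env)      # dict of (s, a, r, s', done) arrays

n = data["observations"].shape[0]
rng = np.random.default_rng(seed=0)     # arbitrary seed
idx = rng.permutation(n)
split = int(0.9 * n)                    # 0.9-0.1 train-validation split
train_idx, val_idx = idx[:split], idx[split:]

train = {k: v[train_idx] for k, v in data.items()}
val = {k: v[val_idx] for k, v in data.items()}
print(f"train: {len(train_idx)} transitions, val: {len(val_idx)} transitions")
```

Shuffling at the transition level matches the row's description of validating one-step predictions by MSE; a trajectory-level split would be a different design choice and the paper's exact procedure may differ.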
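The Experiment Setup row also cites αBC = 2.5, the behavior-cloning weight from TD3+BC (one of the components named in the Software Dependencies row). A minimal sketch of the standard TD3+BC actor objective (Fujimoto & Gu, 2021) with that coefficient; the function and tensor names are ours, and the paper's implementation may differ:

```python
# Hedged sketch of the TD3+BC actor loss (Fujimoto & Gu, 2021)
# with alpha_bc = 2.5, the value quoted from Appendix A.1.3.
# Names (actor, critic, obs, actions) are illustrative.
import torch
import torch.nn.functional as F

ALPHA_BC = 2.5  # behavior-cloning weight cited in the table above

def td3_bc_actor_loss(actor, critic, obs, actions):
    """-lambda * Q(s, pi(s)) + MSE(pi(s), a), with lambda = alpha / mean|Q|."""
    pi = actor(obs)
    q = critic(obs, pi)
    # Normalize the Q term so the BC term stays on a comparable scale.
    lam = ALPHA_BC / q.abs().mean().detach()
    return -lam * q.mean() + F.mse_loss(pi, actions)
```

Here TD3+BC is only the downstream offline RL learner to which the quoted hyperparameter belongs; CAIAC's contribution is the counterfactual data augmentation applied to the dataset before training.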