Weakly supervised causal representation learning
Authors: Johann Brehmer, Pim de Haan, Phillip Lippe, Taco S. Cohen
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On simple image data, including a novel dataset of simulated robotic manipulation, we demonstrate that such models can reliably identify the causal structure and disentangle causal variables. and Finally, we demonstrate ILCMs on synthetic datasets, including the new Causal Circuit dataset of a robot arm interacting with a causally connected system of light switches. We show that these models can robustly learn the true causal variables and the causal structure from pixels. |
| Researcher Affiliation | Collaboration | Johann Brehmer, Qualcomm AI Research (jbrehmer@qti.qualcomm.com); Pim de Haan, Qualcomm AI Research and QUVA Lab, University of Amsterdam (pim@qti.qualcomm.com); Phillip Lippe, QUVA Lab, University of Amsterdam (p.lippe@uva.nl); Taco Cohen, Qualcomm AI Research (tacos@qti.qualcomm.com) |
| Pseudocode | No | The paper describes algorithms (e.g., a heuristic for causal discovery) but does not present them in a structured pseudocode or algorithm block format. |
| Open Source Code | No | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No] Not yet, but we aim to publish them as soon as we obtain approval to do so. |
| Open Datasets | Yes | We test ILCMs on an adaptation of the Causal3DIdent dataset [14] and We use the Causal3DIdent dataset from von Kügelgen et al. [14]. and We introduce a new dataset, which we call Causal Circuit. and We strive to publish our Causal Circuit dataset as soon as possible. |
| Dataset Splits | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] We provide these details in Appendix D. and For each dimension n, we generate three datasets, using linear SCMs with random DAGs... We generate 50,000 training pairs and 10,000 test pairs for each graph. (See the data-generation sketch below the table.) |
| Hardware Specification | No | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No] |
| Software Dependencies | No | The paper mentions PyTorch as the implementation framework, but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | All models are trained for 200 epochs unless stated otherwise (for the Causal Circuit and 2D toy experiment we train models for 1000 epochs). We use the Adam optimizer [36] with β1 = 0.9, β2 = 0.999. We use a learning rate of 1e-4 with a cosine annealing schedule over the number of epochs. We use a batch size of 128 (64 for the dVAE and slot attention baselines). (See the training-loop sketch below the table.) |
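
For context on the Dataset Splits row: the quoted setup (linear SCMs with random DAGs, 50,000 training and 10,000 test pairs per graph) can be illustrated with a minimal sketch of weakly supervised counterfactual-pair generation. This assumes linear-Gaussian SCMs and perfect atomic interventions, consistent with the paper's weak-supervision setting; the function names (`random_dag`, `weakly_supervised_pairs`) are hypothetical and not taken from the paper's code.

```python
import numpy as np

def random_dag(n, p_edge=0.5, rng=None):
    """Sample a random DAG as a strictly lower-triangular weight matrix."""
    rng = rng or np.random.default_rng()
    mask = np.tril(rng.random((n, n)) < p_edge, k=-1)
    return rng.normal(size=(n, n)) * mask

def ancestral_sample(weights, noise):
    """Solve the linear SCM z_i = w_i . z + e_i in topological (index) order."""
    z = np.zeros(len(noise))
    for i in range(len(noise)):
        z[i] = weights[i] @ z + noise[i]
    return z

def weakly_supervised_pairs(weights, n_pairs, rng=None):
    """Generate (z, z_tilde) counterfactual pairs: the second sample reuses
    the same exogenous noise except at one randomly intervened variable."""
    rng = rng or np.random.default_rng()
    n = weights.shape[0]
    pairs = []
    for _ in range(n_pairs):
        noise = rng.normal(size=n)
        z = ancestral_sample(weights, noise)
        target = rng.integers(n)            # atomic intervention target
        noise_tilde = noise.copy()
        noise_tilde[target] = rng.normal()  # resample the intervened mechanism
        w_int = weights.copy()
        w_int[target, :] = 0.0              # perfect intervention: cut incoming edges
        pairs.append((z, ancestral_sample(w_int, noise_tilde), target))
    return pairs

rng = np.random.default_rng(0)
W = random_dag(n=5, rng=rng)
train = weakly_supervised_pairs(W, 50_000, rng=rng)
test = weakly_supervised_pairs(W, 10_000, rng=rng)
```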
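
For the Experiment Setup row: below is a minimal, runnable PyTorch sketch of the quoted training configuration (Adam with β1 = 0.9, β2 = 0.999, learning rate 1e-4, cosine annealing over the epochs, batch size 128). `TinyModelStub` and its dimensions are hypothetical placeholders; the paper's actual ILCM architecture is not reproduced here, and any `nn.Module` exposing a scalar loss would fit this loop.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

class TinyModelStub(nn.Module):
    """Hypothetical stand-in for the paper's model; not the ILCM architecture."""
    def __init__(self, dim=8):
        super().__init__()
        self.net = nn.Linear(dim, dim)

    def loss(self, x):
        return ((self.net(x) - x) ** 2).mean()  # placeholder reconstruction loss

epochs = 200  # 1000 for the Causal Circuit and 2D toy experiments
model = TinyModelStub()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

data = TensorDataset(torch.randn(1024, 8))           # dummy data for the sketch
loader = DataLoader(data, batch_size=128, shuffle=True)  # 64 for the dVAE / slot attention baselines

for epoch in range(epochs):
    for (x,) in loader:
        optimizer.zero_grad()
        loss = model.loss(x)
        loss.backward()
        optimizer.step()
    scheduler.step()  # cosine annealing over the number of epochs
```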