Do causal predictors generalize better to new domains?

Authors: Vivian Nastl, Moritz Hardt

NeurIPS 2024

Each entry below gives a reproducibility variable, the assessed result, and the supporting excerpt from the paper (the LLM response).
Research Type: Experimental. "We study how well machine learning models trained on causal features generalize across domains. We consider 16 prediction tasks on tabular datasets...allowing us to test how well a model trained in one domain performs in another."
Researcher Affiliation: Academia. Vivian Y. Nastl (Max Planck Institute for Intelligent Systems, Tübingen, Germany; Tübingen AI Center; Max Planck ETH Center for Learning Systems; vivian.nastl@tuebingen.mpg.de) and Moritz Hardt (Max Planck Institute for Intelligent Systems, Tübingen, Germany; Tübingen AI Center; hardt@is.mpg.de).
Pseudocode: No. The paper describes its experimental procedures and methods in paragraph text and figures, but it does not include formal pseudocode blocks or algorithm listings.
Open Source Code: Yes. "Our code is based on Gardner et al. [2023], Hardt and Kim [2023] and Gulrajani and Lopez-Paz [2020]. It is available at https://github.com/socialfoundations/causal-features."
Open Datasets: Yes. "We consider 16 prediction tasks on tabular datasets from prior work [Ding et al., 2021, Hardt and Kim, 2023, Gardner et al., 2023]...Table 1: Description of tasks, data sources and number of features in each selection." A sketch of loading one such dataset appears after the entries below.
Dataset Splits: Yes. "We have a train/test/validation split within the in-domain set, and a test/validation split within the out-of-domain set." See the split sketch after the entries.
Hardware Specification: Yes. "Each job was given the same computing resources: 1 CPU. Compute nodes use AMD EPYC 7662 64-core CPUs. Memory was allocated as required for each task: all jobs were allocated at least 128GB of RAM; for the task Public Coverage, jobs were allocated 384GB of RAM."
Software Dependencies: No. The paper mentions several software components and libraries, such as Hyperopt [Bergstra et al., 2013] and machine learning methods (XGBoost, LightGBM, IRM, REx, etc.), but it does not specify their version numbers.
Experiment Setup: Yes. "We conduct a hyperparameter sweep using Hyperopt [Bergstra et al., 2013] on the in-domain validation data. A method is tuned for 50 trials. We exclusively train on the training set." A minimal sketch of such a sweep closes the section.
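
To make the dataset entry concrete, here is a minimal sketch of loading one of the cited task families with the folktables package from Ding et al. [2021]. The choice of task (ACSPublicCoverage), survey year, and states is our own illustration; the paper's exact task definitions and feature selections live in its Table 1 and the linked repository.

```python
# Minimal sketch: loading an ACS task via folktables (Ding et al., 2021).
# Task, year, and states are illustrative choices, not the paper's exact setup.
from folktables import ACSDataSource, ACSPublicCoverage

data_source = ACSDataSource(survey_year="2018", horizon="1-Year", survey="person")

# Treat one state as the in-domain data and another as out-of-domain.
in_domain_df = data_source.get_data(states=["CA"], download=True)
out_domain_df = data_source.get_data(states=["TX"], download=True)

X_id, y_id, _ = ACSPublicCoverage.df_to_numpy(in_domain_df)
X_ood, y_ood, _ = ACSPublicCoverage.df_to_numpy(out_domain_df)
```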
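
The split described under Dataset Splits can be reproduced generically as follows. The split fractions and the random seed are assumptions for illustration; the quoted passage does not state them.

```python
# Sketch of the split scheme: train/val/test in-domain, val/test out-of-domain.
# Fractions and seed are illustrative assumptions.
import numpy as np
from sklearn.model_selection import train_test_split

seed = 0
in_domain = np.arange(1000)   # placeholder for in-domain example indices
out_domain = np.arange(500)   # placeholder for out-of-domain example indices

# In-domain: 60% train, 20% validation, 20% test.
id_train, id_rest = train_test_split(in_domain, test_size=0.4, random_state=seed)
id_val, id_test = train_test_split(id_rest, test_size=0.5, random_state=seed)

# Out-of-domain: validation/test only; no out-of-domain data is used for training.
ood_val, ood_test = train_test_split(out_domain, test_size=0.5, random_state=seed)
```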
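
Finally, a minimal sketch of the experiment setup: a 50-trial Hyperopt sweep scored on in-domain validation data, with training restricted to the training set. The model (XGBoost) and the search space are our assumptions; the paper tunes many methods, each with its own space.

```python
# Sketch of the 50-trial Hyperopt sweep scored on in-domain validation data.
# Model choice, search space, and placeholder data are illustrative assumptions.
import numpy as np
import xgboost as xgb
from hyperopt import fmin, tpe, hp, Trials
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(800, 10)), rng.integers(0, 2, 800)  # placeholders
X_val, y_val = rng.normal(size=(200, 10)), rng.integers(0, 2, 200)

def objective(params):
    model = xgb.XGBClassifier(
        max_depth=int(params["max_depth"]),
        learning_rate=params["learning_rate"],
        n_estimators=100,
    )
    model.fit(X_train, y_train)   # train exclusively on the training set
    val_acc = accuracy_score(y_val, model.predict(X_val))
    return 1.0 - val_acc          # Hyperopt minimizes the objective

space = {
    "max_depth": hp.quniform("max_depth", 2, 10, 1),
    "learning_rate": hp.loguniform("learning_rate", np.log(1e-3), np.log(0.3)),
}

best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=50, trials=Trials())   # 50 trials, as stated in the paper
```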