Transportable Representations for Domain Generalization

Authors: Kasra Jalaldoust, Elias Bareinboim

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We also implemented the described estimator, and the results are depicted in Figure 3 for a set of randomly generated SCMs M1, M2, M*. The red line indicates the loss for a random guess. ERM (cobalt) stands for empirical risk minimization; it simply regresses Y on R using the pooled data. Despite the popularity of ERM, we see that due to the mismatch between the target domain and the sources, ERM performs only slightly better than a random guess, and its performance does not improve with more data. INV (orange) regresses Y on the best invariant representation (that is, X1) using the pooled data; this classifier is what existing work on invariance-based domain generalization suggests. INV outperforms ERM, which is consistent with our theoretical guarantees; however, the transportability-based classifier (green) performs significantly better for larger data. (A minimal code sketch of this comparison appears below the table.)
Researcher Affiliation | Academia | Causal Artificial Intelligence Laboratory, Columbia University; {kasra,eb}@cs.columbia.edu
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper; the methods are described in prose.
Open Source Code | No | The paper does not provide an explicit statement about releasing code or a link to a source-code repository.
Open Datasets | No | The paper uses "randomly generated SCMs" for its synthetic experiments but does not provide concrete access information (link, DOI, or formal citation) for a publicly available dataset.
Dataset Splits | No | The paper discusses "finite-sample performance" and "sample size" but does not provide specific split information (exact percentages, sample counts, or a detailed splitting methodology) for training, validation, and test sets.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used to run its experiments.
Software Dependencies | No | The paper mentions using "random forests" for likelihood models but does not provide specific software names with version numbers needed to replicate the experiment.
Experiment Setup | No | The paper describes using "randomly generated SCMs", "rejection sampling for the generative models", and "random forests" for likelihood models, but does not provide concrete setup details (hyperparameter values, training configurations, or system-level settings). (A generic rejection-sampling sketch appears below.)
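
As a hedged illustration of the ERM vs. INV comparison quoted in the Research Type row, the sketch below assumes a toy two-source SCM in which X1 causes Y through a stable mechanism while a second feature X2 depends on Y through a domain-specific mechanism. The data-generating process, the specific shifts, and the use of scikit-learn random forests are all assumptions for illustration, not the authors' implementation; the transportability-based estimator itself is not reproduced here.

```python
# Hedged sketch of the ERM vs. INV comparison; the SCM below is
# hypothetical and stands in for the paper's randomly generated SCMs.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def sample_domain(n, shift):
    # X1 -> Y is stable across domains (an invariant representation);
    # X2 <- Y uses a domain-specific mechanism, so X2 is not invariant.
    x1 = rng.normal(0.0, 1.0, n)
    y = (x1 + rng.normal(0.0, 0.5, n) > 0).astype(int)
    x2 = shift * y + rng.normal(0.0, 1.0, n)
    return np.column_stack([x1, x2]), y

# Pooled data from two source domains, plus a mismatched target domain.
X_a, y_a = sample_domain(2000, shift=1.0)
X_b, y_b = sample_domain(2000, shift=2.0)
X_src, y_src = np.vstack([X_a, X_b]), np.concatenate([y_a, y_b])
X_tgt, y_tgt = sample_domain(2000, shift=-2.0)

# ERM: regress Y on all observed features R using pooled source data.
erm = RandomForestClassifier(random_state=0).fit(X_src, y_src)
# INV: regress Y on the invariant representation (X1 only).
inv = RandomForestClassifier(random_state=0).fit(X_src[:, [0]], y_src)

print("ERM target accuracy:", erm.score(X_tgt, y_tgt))
print("INV target accuracy:", inv.score(X_tgt[:, [0]], y_tgt))
```

On this toy shift, the invariant classifier should roughly retain its target accuracy while ERM degrades, mirroring the qualitative gap between ERM and INV described in the response above.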
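
Similarly, the Experiment Setup row mentions "rejection sampling for the generative models" with random-forest likelihood models. The following is a minimal sketch of that generic pattern under assumed data and names (the function rejection_sample and the synthetic variables are hypothetical), not the paper's code.

```python
# Hypothetical rejection-sampling pattern: draw proposals and keep each
# with probability given by a fitted random-forest likelihood model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Fit a likelihood model q(E = 1 | x) on synthetic data (assumption).
X = rng.normal(size=(5000, 2))
e = (X[:, 0] + rng.normal(0.0, 1.0, 5000) > 0).astype(int)
likelihood = RandomForestClassifier(random_state=0).fit(X, e)

def rejection_sample(n):
    """Sample x ~ p(x | E = 1), up to normalization, by accepting
    proposals x ~ p(x) with probability q(E = 1 | x)."""
    kept = []
    while len(kept) < n:
        x = rng.normal(size=2)
        if rng.uniform() < likelihood.predict_proba(x.reshape(1, -1))[0, 1]:
            kept.append(x)
    return np.asarray(kept)

samples = rejection_sample(200)
print(samples.mean(axis=0))  # accepted points skew toward x[0] > 0
```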