Towards Characterizing Domain Counterfactuals for Invertible Latent Causal Models

Authors: Zeyu Zhou, Ruqi Bai, Sean Kulinski, Murat Kocaoglu, David I. Inouye

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we show an improvement in counterfactual estimation over baseline methods through extensive simulated and image-based experiments."
Researcher Affiliation | Academia | "Elmore Family School of Electrical and Computer Engineering, Purdue University {zhou1059, bai116, skulinsk, mkocaoglu, dinouye}@purdue.edu"
Pseudocode | No | The paper describes the proposed ILD estimation algorithm in Section 3.3, but it does not provide a formal pseudocode block or an algorithm listing.
Open Source Code | Yes | "Code can be found in https://github.com/inouye-lab/ild-domain-counterfactuals."
Open Datasets | Yes | "Rotated MNIST and Fashion MNIST: We split the MNIST trainset into 90% training data, 10% validation, and for testing we use the MNIST test set. ... 3D Shapes: This is a dataset of 3D shapes that are procedurally generated from 6 independent latent factors: floor hue, wall hue, object hue, scale, shape, and orientation (Burgess and Kim, 2018)."
Dataset Splits | Yes | "We generate 100,000 samples from each domain for the training set and 1,000 samples from each domain in the validation and test set." (see the split sketch below)
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as CPU/GPU models, memory, or cloud instance types.
Software Dependencies | No | The paper mentions using the Adam optimizer and ResNet18 models, but it does not specify version numbers for these or for any other software libraries or frameworks used in the experiments.
Experiment Setup | Yes | "We train each ILD model for 300K, 300K, 300K, 500K, and 200K steps for RMNIST, RFMNIST, CRMNIST, 3D Shapes and Causal3DIdent respectively using the Adam optimizer (Kingma and Ba, 2014) with β1 = 0.5, β2 = 0.999, and a batch size of 1024. The learning rate for g and g+ is 10^-4, and all f models use 10^-3. During training, we calculate two loss terms: a reconstruction loss ℓ_recon = ‖x − x̂‖²₂ and the ℓ_align alignment loss. ... we apply a β_KLD upscaling to the alignment loss such that ℓ_total = ℓ_recon + β_KLD ℓ_align. For all MNIST-like experiments, we use β_KLD = 1000, and for 3D Shapes and Causal3DIdent we found β_KLD = 10." (see the training-step sketch below)
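The Dataset Splits row above describes a 90%/10% train/validation split of the MNIST train set, with the official MNIST test set used for testing. A minimal sketch of that split follows, assuming PyTorch/torchvision (the paper does not name its framework); the seed is a hypothetical choice added here for reproducibility.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Load the official MNIST train set (60,000 images).
mnist_train = datasets.MNIST("data", train=True, download=True,
                             transform=transforms.ToTensor())

# 90% training / 10% validation, per the quoted split.
n_val = len(mnist_train) // 10
n_train = len(mnist_train) - n_val
train_set, val_set = random_split(
    mnist_train, [n_train, n_val],
    generator=torch.Generator().manual_seed(0))  # hypothetical seed

# The official MNIST test set is used as-is for testing.
test_set = datasets.MNIST("data", train=False, download=True,
                          transform=transforms.ToTensor())
```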
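The Experiment Setup row above fully specifies the optimizer hyperparameters and the total loss ℓ_total = ℓ_recon + β_KLD ℓ_align. Below is a minimal sketch of one training step under those settings, again assuming PyTorch. The toy `g`, `f`, and `decoder` modules are hypothetical stand-ins for the paper's g, g+, and f models, and `alignment_loss` is a placeholder for the paper's ℓ_align term, which is not reproduced in this report.

```python
import torch
import torch.nn as nn

BETA_KLD = 1000.0   # paper: 1000 for MNIST-like data, 10 for 3D Shapes / Causal3DIdent
BATCH_SIZE = 1024

# Toy stand-ins; the paper's actual architectures differ.
g = nn.Sequential(nn.Flatten(), nn.Linear(784, 64))
f = nn.Linear(64, 64)
decoder = nn.Linear(64, 784)

# Adam with beta1 = 0.5, beta2 = 0.999; lr 1e-4 for g / g+, 1e-3 for the f models.
opt_g = torch.optim.Adam(list(g.parameters()) + list(decoder.parameters()),
                         lr=1e-4, betas=(0.5, 0.999))
opt_f = torch.optim.Adam(f.parameters(), lr=1e-3, betas=(0.5, 0.999))

def alignment_loss(z):
    # Hypothetical placeholder for the paper's ell_align term.
    return z.pow(2).mean()

def training_step(x):
    z = f(g(x))
    x_hat = decoder(z).view_as(x)
    # Reconstruction loss: squared L2 norm per sample, averaged over the batch.
    loss_recon = (x - x_hat).pow(2).flatten(1).sum(dim=1).mean()
    # Total loss with the beta_KLD-upscaled alignment term.
    loss_total = loss_recon + BETA_KLD * alignment_loss(z)
    opt_g.zero_grad()
    opt_f.zero_grad()
    loss_total.backward()
    opt_g.step()
    opt_f.step()
    return loss_total.item()

x = torch.rand(BATCH_SIZE, 1, 28, 28)  # dummy MNIST-shaped batch
print(training_step(x))
```

The two optimizers reflect the paper's use of different learning rates for the g / g+ models versus the f models; everything else in the step follows directly from the quoted loss definition.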