Neural Networks for Learning Counterfactual G-Invariances from Single Environments

Authors: S Chandra Mouli, Bruno Ribeiro

ICLR 2021

Reproducibility Variable Result LLM Response
Research Type Experimental We now provide empirical results on 12 different tasks to showcase the properties and advantages of our framework. Due to space limitations, our results are only briefly summarized here, with most of the details described in Appendix G.
Researcher Affiliation Academia S Chandra Mouli, Department of Computer Science, Purdue University, chandr@purdue.edu; Bruno Ribeiro, Department of Computer Science, Purdue University, ribeiro@cs.purdue.edu
Pseudocode Yes Appendix D (Pseudocode for Theorem 3): We present the algorithm for Theorem 3 in Algorithm 1.
Open Source Code Yes Public code available at: https://github.com/PurdueMINDS/NN_CGInvariance
Open Datasets Yes Datasets. We consider the standard MNIST dataset and its subset MNIST-34, which contains only the digits 3 and 4.
Dataset Splits Yes To evaluate the models, we use a 5-fold cross-validation procedure as follows. We divide the pre-split training and test sets of the MNIST and MNIST-34 datasets into 5 folds each, and use the above procedure to transform the training data and the test data. In each iteration i of the cross-validation procedure, we leave out the i-th fold of the transformed training data and the i-th fold of the extrapolated test data. Further, we use 20% of the training data as validation data for hyperparameter tuning and early stopping.
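The split procedure above can be sketched in plain Python. This is a minimal illustration, not the authors' code: fold assignment is assumed contiguous, and datasets are modeled as plain lists.

```python
def make_folds(data, k=5):
    """Split data into k roughly equal contiguous folds."""
    n = len(data)
    return [data[i * n // k:(i + 1) * n // k] for i in range(k)]

def cv_splits(train_data, test_data, k=5, val_frac=0.2):
    """Yield (train, val, test) for each of the k CV iterations.

    Iteration i leaves out fold i of the (transformed) training data,
    evaluates on fold i of the (extrapolated) test data, and holds out
    val_frac of the remaining training data for hyperparameter tuning
    and early stopping, as described in the paper.
    """
    train_folds = make_folds(train_data, k)
    test_folds = make_folds(test_data, k)
    for i in range(k):
        kept = [x for j, f in enumerate(train_folds) if j != i for x in f]
        n_val = int(len(kept) * val_frac)
        yield kept[n_val:], kept[:n_val], test_folds[i]
```

How the transformed/extrapolated data are produced is the paper's "above procedure" and is not reproduced here.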
Hardware Specification No The paper does not provide specific details about the hardware used for experiments, such as GPU/CPU models or cloud instance types.
Software Dependencies No The paper mentions optimizers like 'SGD with momentum' and 'Adam' but does not specify any software libraries (e.g., PyTorch, TensorFlow) or their version numbers.
Experiment Setup Yes We optimize all models using SGD with momentum with a learning rate in {10⁻², 10⁻³, 10⁻⁴} and a batch size of 64. We use early stopping on validation loss to select the best model.
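The reported protocol (a small learning-rate grid plus early stopping on validation loss) can be sketched framework-agnostically, since the paper names no library. The training and validation callbacks below are placeholders (assumptions), and the patience value is illustrative.

```python
def early_stopping_search(train_fn, val_loss_fn,
                          lrs=(1e-2, 1e-3, 1e-4),
                          patience=5, max_epochs=100):
    """Return (best_lr, best_val_loss) over the learning-rate grid.

    train_fn(lr, epoch) advances training by one epoch at rate lr;
    val_loss_fn(lr) reports the current validation loss for that run.
    Training stops once validation loss fails to improve for
    `patience` consecutive epochs.
    """
    best = (None, float("inf"))
    for lr in lrs:
        bad_epochs, best_here = 0, float("inf")
        for epoch in range(max_epochs):
            train_fn(lr, epoch)
            loss = val_loss_fn(lr)
            if loss < best_here:
                best_here, bad_epochs = loss, 0
            else:
                bad_epochs += 1
                if bad_epochs >= patience:  # validation loss stalled
                    break
        if best_here < best[1]:
            best = (lr, best_here)
    return best
```

In practice `train_fn` would run SGD with momentum over mini-batches of 64, per the quoted setup.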