Stabilizing Adversarial Invariance Induction from Divergence Minimization Perspective

Authors: Yusuke Iwasawa, Kei Akuzawa, Yutaka Matsuo

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our method consistently achieves near-optimal invariance in toy datasets with various configurations in which the original AII is catastrophically unstable. Extensive experiments on four real-world datasets also support the superior performance of the proposed method, leading to improved user anonymization and domain generalization.
Researcher Affiliation | Academia | Yusuke Iwasawa, Kei Akuzawa and Yutaka Matsuo, The University of Tokyo, Japan. {iwasawa, akuzawa-kei, matsuo}@weblab.t.u-tokyo.ac.jp
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper states 'We used the dataset distributed at https://github.com/ghif/mtae.' This refers to a third-party dataset, not a release of the authors' own code; no other explicit statement or link to the authors' source code is provided.
Open Datasets | Yes | We provide experimental results on the synthesized dataset and two real-world tasks (four datasets) relevant to invariant feature learning: (1) user anonymization (Opportunity and USC datasets), and (2) domain generalization (MNISTR and PACS datasets). ... The Opp dataset [Sagha et al., 2011] ... The USC-HAD dataset ... [Zhang and Sawchuk, 2012]. ... The MNISTR dataset, derived from MNIST, was introduced by [Ghifary et al., 2015]. ... We used the dataset distributed at https://github.com/ghif/mtae. ... The PACS dataset is a relatively new benchmark dataset designed for cross-domain recognition [Li et al., 2017].
Dataset Splits | Yes | In all the experiments, we selected the data of one or several domains for the test set and used the data of a disjoint domain as the training/validation data. We split the data of the disjoint domain into groupings of 80% and 20%. We denote the test domain by a suffix (e.g., MNISTR-M0 denotes that the model is trained with the data from M15, M30, M45, M60, and M75 and evaluated on M0). We conducted 20 validations during training at equal intervals.
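For concreteness, the leave-one-domain-out protocol quoted above can be sketched as follows. This is a minimal illustration assuming per-domain NumPy arrays; the helper name `leave_one_domain_out`, the data layout, and the seed handling are hypothetical, not taken from the paper or its code.

```python
# Sketch of the quoted split protocol (hypothetical helper, not the authors'
# code): hold out one domain for testing and split the remaining domains
# 80%/20% into training and validation data.
import numpy as np

def leave_one_domain_out(data_by_domain, test_domain, val_ratio=0.2, seed=0):
    rng = np.random.default_rng(seed)
    train, val = [], []
    for name, (x, y) in data_by_domain.items():
        if name == test_domain:
            continue  # the test domain is never seen during training
        idx = rng.permutation(len(x))
        n_val = int(len(x) * val_ratio)  # 20% of each disjoint domain
        val.append((x[idx[:n_val]], y[idx[:n_val]]))
        train.append((x[idx[n_val:]], y[idx[n_val:]]))
    return train, val, data_by_domain[test_domain]

# Example: MNISTR-M0 trains/validates on M15..M75 and tests on M0, e.g.
# train, val, test = leave_one_domain_out(domains, test_domain="M0")
```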
Hardware Specification | Yes | All experiments were implemented in PyTorch and were run on either a GTX 1080 or Tesla V100.
Software Dependencies | No | The paper states 'All experiments were implemented in PyTorch' but does not specify a version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | For all datasets and methods, we used RMSprop for optimization. For all datasets except PACS, we set the learning rate to 0.001 and the batch size to 128. For PACS, we set the learning rate to 5e-5 and the batch size to 64. The number of iterations was 10k, 5k, 20k, and 30k for MNISTR, PACS, Opp, and USC, respectively. For the adversarial-training-based method, we optimized weighting parameter λ from {0.001, 0.01, 0.1, 1.0}, except for MNISTR, for which it was optimized from {0.01, 0.1, 1.0, 10.0}. The value of α for CrossGrad was selected from {0.1, 0.25, 0.5, 0.75, 0.9}. Unless mentioned otherwise, we set the decay rate γ to 0.7.
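Collected in one place, the quoted hyperparameters suggest a configuration along the following lines. The `CONFIG` table and `make_optimizer` helper are illustrative names only; the numeric values come from the quoted text, and everything else (model construction, data loading) is an assumption omitted here.

```python
# Hedged sketch of the reported training setup; all identifiers here are
# hypothetical, not the authors' released code.
import torch

CONFIG = {
    #              learning rate, batch size, iterations, λ search grid
    "MNISTR": dict(lr=1e-3, batch_size=128, iters=10_000,
                   lambda_grid=(0.01, 0.1, 1.0, 10.0)),
    "PACS":   dict(lr=5e-5, batch_size=64,  iters=5_000,
                   lambda_grid=(0.001, 0.01, 0.1, 1.0)),
    "Opp":    dict(lr=1e-3, batch_size=128, iters=20_000,
                   lambda_grid=(0.001, 0.01, 0.1, 1.0)),
    "USC":    dict(lr=1e-3, batch_size=128, iters=30_000,
                   lambda_grid=(0.001, 0.01, 0.1, 1.0)),
}
GAMMA = 0.7                                      # default decay rate γ
CROSSGRAD_ALPHAS = (0.1, 0.25, 0.5, 0.75, 0.9)   # α candidates for CrossGrad

def make_optimizer(model: torch.nn.Module, dataset: str):
    """RMSprop with the dataset-specific learning rate reported above."""
    return torch.optim.RMSprop(model.parameters(), lr=CONFIG[dataset]["lr"])
```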