Fighting Fire with Fire: Avoiding DNN Shortcuts through Priming

Authors: Chuan Wen, Jianing Qian, Jierui Lin, Jiaye Teng, Dinesh Jayaraman, Yang Gao

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "On NICO image classification, MuJoCo continuous control, and CARLA autonomous driving, our priming strategy works significantly better than several popular state-of-the-art approaches for feature selection and data augmentation." and "We conduct experiments on three sets of tasks: a toy regression experiment, an image classification task and two behavioral cloning tasks on autonomous driving and robotic control to verify our arguments and the proposed method."
Researcher Affiliation | Academia | "1 Institute for Interdisciplinary Information Sciences, Tsinghua University; 2 University of Pennsylvania; 3 University of Texas at Austin; 4 Shanghai Qi Zhi Institute."
Pseudocode | No | The paper describes the proposed method 'PrimeNet' and its architectures using text and figures, but it does not include any formal pseudocode or algorithm blocks.
Open Source Code | No | The paper provides a project website (https://sites.google.com/view/icml22-fighting-fire-with-fire/), but it does not explicitly state that the source code for the methodology is available there, nor does it provide a direct link to a code repository.
Open Datasets | Yes | "NICO (He et al., 2021) is an image classification dataset designed for O.O.D. settings.", "CARLA is a photorealistic urban driving simulator (Dosovitskiy et al., 2017)", and "We evaluate our method in three standard OpenAI Gym MuJoCo continuous control environments: Hopper, Ant and HalfCheetah."
Dataset Splits | Yes | "We select the best hyper-parameters according to the validation accuracy and then test the models on the test set.", "Thus, it is convenient to design the distribution of data by adjusting the proportions of specific contexts for training and testing images.", and "When testing the models in the O.O.D. region [1, 2], the model fits f2 better when primed on x4, and fits f1 better if primed on x5."
Hardware Specification | No | The paper does not provide specific details on the hardware used for running experiments, such as GPU or CPU models, or cloud computing instance types.
Software Dependencies | No | The paper mentions software components and libraries such as BASNet, the SGD and Adam optimizers, and PPO, but it does not specify their version numbers.
Experiment Setup | Yes | "We use Cross-Entropy loss and SGD optimizer to jointly train our model for 200 epochs. The initial learning rate is set to 0.05 and decays by 0.2 at 80, 120 and 160 epochs. We set the minibatch size to 128." and "We use the L1 loss to train all the models. We use Adam optimizer, set the initial learning rate to 2×10⁻⁴ and decay the learning rate by 0.1 whenever the loss value no longer decreases for 5000 gradient steps. We set the minibatch size to 160 and train all the models until convergence."
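
The two quoted setups map onto fairly standard training loops. The sketch below is a minimal reconstruction assuming PyTorch (the framework is not named in the quoted text); `model`, `train_loader`, and the `train_classifier` / `train_bc` function names are placeholders, and the `max_steps` cutoff in the behavioral-cloning loop stands in for the paper's "until convergence" criterion.

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import MultiStepLR, ReduceLROnPlateau

# Classification setup quoted above: Cross-Entropy loss, SGD with initial
# lr 0.05 decayed by 0.2 at epochs 80/120/160, minibatch size 128, 200 epochs.
def train_classifier(model, train_loader, epochs=200, device="cuda"):
    model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.05)
    scheduler = MultiStepLR(optimizer, milestones=[80, 120, 160], gamma=0.2)
    for _ in range(epochs):
        for images, labels in train_loader:  # batch size 128 set in the loader
            optimizer.zero_grad()
            loss = criterion(model(images.to(device)), labels.to(device))
            loss.backward()
            optimizer.step()
        scheduler.step()

# Behavioral-cloning setup quoted above: L1 loss, Adam with initial lr 2e-4
# decayed by 0.1 when the loss stops improving for 5000 gradient steps,
# minibatch size 160. max_steps is a hypothetical stand-in for "convergence".
def train_bc(model, train_loader, max_steps=1_000_000, device="cuda"):
    model.to(device)
    criterion = nn.L1Loss()
    optimizer = optim.Adam(model.parameters(), lr=2e-4)
    scheduler = ReduceLROnPlateau(optimizer, factor=0.1, patience=5000)
    step = 0
    while step < max_steps:
        for obs, actions in train_loader:  # batch size 160 set in the loader
            if step >= max_steps:
                break
            optimizer.zero_grad()
            loss = criterion(model(obs.to(device)), actions.to(device))
            loss.backward()
            optimizer.step()
            scheduler.step(loss.item())  # plateau counted per gradient step
            step += 1
```

`ReduceLROnPlateau` counts its patience in calls to `step()`, so invoking it once per gradient step with `patience=5000` matches the quoted "5000 gradient steps" criterion; this scheduler choice is an assumption, as the paper only describes the decay rule in prose.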