Robust Attribution Regularization

Authors: Jiefeng Chen, Xi Wu, Vaibhav Rastogi, Yingyu Liang, Somesh Jha

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate the effectiveness of our method, and also point to intriguing problems which hint at the need for better optimization techniques or better neural network architectures for robust attribution training. Through detailed experiments we study the effect of our method in robustifying attributions. On MNIST, Fashion-MNIST, GTSRB and Flower datasets, we report encouraging improvement in attribution robustness.
Researcher Affiliation | Collaboration | Jiefeng Chen (University of Wisconsin-Madison), Xi Wu (Google), Vaibhav Rastogi (Google), Yingyu Liang (University of Wisconsin-Madison), Somesh Jha (University of Wisconsin-Madison and XaiPient)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code for this paper is publicly available at https://github.com/jfc43/robust-attribution-regularization
Open Datasets | Yes | On MNIST [LCB98], Fashion-MNIST [XRV17], GTSRB [SSSI12], and Flower [NZ06] datasets
Dataset Splits | No | The paper mentions using specific datasets (MNIST, Fashion-MNIST, GTSRB, Flower) but does not explicitly provide details about training, validation, and test splits (e.g., percentages, sample counts, or citations to standard splits).
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running its experiments.
Software Dependencies | No | The paper mentions using TensorFlow [ABC+16] but does not provide a specific version number for TensorFlow or any other software dependency.
Experiment Setup | Yes | We propose the following gradient descent framework to optimize the objectives. The framework is parameterized by an adversary A which is supposed to solve the inner max by finding a point x* that changes attribution significantly. Table 1 (optimization parameters):
- Adversary A: the adversary used to find x*. Since the goal is simply to maximize the inner term in a neighborhood, the paper chooses Projected Gradient Descent for this purpose.
- m in the attack step: to differentiate IG in the attack step, the summation approximation of IG is used; this is the number of segments in the approximation.
- m in the gradient step: same as above, but in the gradient step; a separate value is kept for efficiency reasons.
- λ: regularization parameter for IG-NORM.
- β: regularization parameter for IG-SUM-NORM.
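To make the setup above concrete, here is a minimal sketch of the inner-max / outer-min loop it describes: a summation approximation of IG with m segments, a PGD adversary that searches an l_inf neighborhood for a point whose attribution differs most from the original, and an IG-NORM-style regularized loss with weight λ. The authors' released code uses TensorFlow; this sketch is written in PyTorch purely for illustration, and the function names, the zero baseline, and all hyperparameter values are our own assumptions, not the paper's exact settings.

```python
# Illustrative PyTorch sketch of the framework described above (the authors'
# released code uses TensorFlow). Names, the zero baseline, and all
# hyperparameters here are assumptions, not the paper's settings.
import torch
import torch.nn.functional as F


def integrated_gradients(model, x, baseline, target, m):
    """Summation approximation of IG with m segments:
    IG_i(x) ~ (x_i - x0_i) * (1/m) * sum_{k=1..m} dF_y/dx_i evaluated at
    x0 + (k/m)(x - x0). `x` must require grad so the result stays
    differentiable (needed for both the attack and the gradient step)."""
    grad_sum = torch.zeros_like(x)
    for k in range(1, m + 1):
        point = baseline + (k / m) * (x - baseline)
        score = model(point)[torch.arange(x.size(0)), target].sum()
        # create_graph=True keeps the graph so we can differentiate through IG.
        grad_sum = grad_sum + torch.autograd.grad(score, point, create_graph=True)[0]
    return (x - baseline) * grad_sum / m


def pgd_attribution_attack(model, x, target, eps, step_size, steps, m_attack):
    """Adversary A: projected gradient descent searching the l_inf ball
    N(x, eps) for a point x* whose attribution differs most from IG(x)."""
    baseline = torch.zeros_like(x)
    x_ref = x.clone().requires_grad_(True)
    ig_clean = integrated_gradients(model, x_ref, baseline, target, m_attack).detach()
    x_adv = (x + eps * (2 * torch.rand_like(x) - 1)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        ig_adv = integrated_gradients(model, x_adv, baseline, target, m_attack)
        attack_loss = (ig_adv - ig_clean).abs().sum()
        grad = torch.autograd.grad(attack_loss, x_adv)[0]
        # Ascent step on the attribution change, then projection onto the ball.
        x_adv = (x_adv + step_size * grad.sign()).detach()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv


def ig_norm_loss(model, x, y, lam, eps, m_attack, m_grad):
    """IG-NORM-style objective: natural cross-entropy plus lam times the
    attribution change achieved by the adversary. m_attack and m_grad are
    the two separate segment counts mentioned in Table 1."""
    baseline = torch.zeros_like(x)
    x_adv = pgd_attribution_attack(model, x, y, eps, step_size=eps / 4,
                                   steps=10, m_attack=m_attack)
    x_req = x.clone().requires_grad_(True)
    x_adv_req = x_adv.clone().requires_grad_(True)
    ig_clean = integrated_gradients(model, x_req, baseline, y, m_grad)
    ig_adv = integrated_gradients(model, x_adv_req, baseline, y, m_grad)
    return F.cross_entropy(model(x), y) + lam * (ig_adv - ig_clean).abs().sum()
```

The sketch uses the smaller m_attack inside the PGD loop and the larger m_grad in the outer gradient step, reflecting the efficiency consideration noted in Table 1. β plays the analogous role for the IG-SUM-NORM variant; consult the repository linked above for the authors' actual objectives and hyperparameters.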