Bort: Towards Explainable Neural Networks with Bounded Orthogonal Constraint

Authors: Borui Zhang, Wenzhao Zheng, Jie Zhou, Jiwen Lu

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform reconstruction and backtracking on the model representations optimized by Bort and observe a clear improvement in model explainability. Based on Bort, we are able to synthesize explainable adversarial samples without additional parameters or training. Surprisingly, we find Bort consistently improves the classification accuracy of various architectures, including ResNet and DeiT, on MNIST, CIFAR-10, and ImageNet. Code: https://github.com/zbr17/Bort.
Researcher Affiliation | Academia | Borui Zhang, Wenzhao Zheng, Jie Zhou, Jiwen Lu. Department of Automation, Tsinghua University, China; Beijing National Research Center for Information Science and Technology, China. {zhang-br21, zhengwz18}@mails.tsinghua.edu.cn; {jzhou, lujiwen}@tsinghua.edu.cn
Pseudocode | Yes | Algorithm 1: The SAT algorithm. Input: the top feature map Z, the backtracking mapping g, number k, constant B, and threshold γ. Output: saliency map A.
Open Source Code | Yes | Code: https://github.com/zbr17/Bort.
Open Datasets | Yes | We conduct classification experiments on MNIST, CIFAR-10, and ImageNet... To begin with, we test Bort on MNIST (Deng, 2012) and CIFAR-10 (Krizhevsky et al., 2009). We evaluate Bort on the large-scale ImageNet (Deng et al., 2009)...
Dataset Splits | No | The paper references standard datasets (MNIST, CIFAR-10, ImageNet) but does not explicitly provide train/validation/test splits by percentage, by absolute sample counts, or by citation of predefined splits.
Hardware Specification | Yes | "All experiments are conducted on one NVIDIA 3090 card." / "All experiments are conducted on 8 A100 cards." (The paper makes both statements, for different sets of experiments.)
Software Dependencies | No | The paper mentions software such as "PyTorch image models" but does not provide version numbers for any software components, libraries, or solvers used in the experiments.
Experiment Setup | Yes | We set the learning rate to 0.01 without any learning rate adjustment schedule and train each model for 40 epochs with batch size fixed to 256. No data augmentation strategy is utilized. The constraint coefficient is set to 0.1, and the weight decay is set to 0.01. For training CNN-type models (i.e., VGG16 and ResNet50), we follow the recipe in public codes (Wightman, 2019). We set the learning rate to 0.05 for SGD, 0.001 for AdamW, and 0.005 for LAMB. We utilize 3-split data augmentation including RandAugment (Cubuk et al., 2020) and Random Erasing. We train the model for 300 epochs with the batch size set to 1024 for SGD and AdamW and 2048 for LAMB. For LAMB, the weight decay is 0.002 and the λ coefficient is 0.00002; for SGD and AdamW, we set the weight decay to 0.00002 and the λ coefficient to 0.0001.
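The title's "bounded orthogonal constraint" suggests a weight regularizer that pushes the rows of each weight matrix toward mutual orthogonality while bounding their norms. The paper defines the exact form; the snippet below is only a minimal NumPy sketch under that reading, and the function name `bort_penalty` and the penalty's precise shape (squared off-diagonal Gram entries plus a hinge on the diagonal) are illustrative assumptions, not the paper's formula.

```python
import numpy as np

def bort_penalty(W, B=1.0):
    """Sketch of a bounded-orthogonality regularizer (assumed form, not the
    paper's exact definition): penalize non-zero off-diagonal entries of the
    row Gram matrix (non-orthogonality) and diagonal entries above B**2
    (row norms exceeding the bound B)."""
    G = W @ W.T                            # Gram matrix of the rows of W
    off_diag = G - np.diag(np.diag(G))     # orthogonality residual
    ortho = np.sum(off_diag ** 2)
    bound = np.sum(np.maximum(np.diag(G) - B ** 2, 0.0) ** 2)
    return ortho + bound
```

An orthonormal matrix such as the identity incurs zero penalty, while correlated, over-long rows are penalized; in training this term would be scaled by a coefficient (the quoted setup mentions a constraint coefficient of 0.1) and added to the task loss.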
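The pseudocode row only quotes the header of Algorithm 1 (SAT), which takes the top feature map Z, a backtracking mapping g, a number k, a constant B, and a threshold γ, and outputs a saliency map A. A hedged NumPy sketch of that general shape follows: keep the k strongest activations of Z, backtrack them through g, and binarize at γ. The function name `sat_saliency` and the specific masking and thresholding steps are assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

def sat_saliency(Z, g, k=5, B=1.0, gamma=0.5):
    """Sketch of a SAT-style backtracking saliency map (assumed steps):
    Z      -- top feature map
    g      -- backtracking mapping from feature space to input space
    k      -- number of activations to keep
    B      -- bound clamped onto the kept activations
    gamma  -- relative threshold for binarizing the saliency map"""
    flat = np.abs(Z).ravel()
    kth = np.sort(flat)[-k]                          # k-th largest magnitude
    Z_top = np.where(np.abs(Z) >= kth, Z, 0.0)       # keep top-k activations
    X = g(np.clip(Z_top, -B, B))                     # backtrack to input space
    A = (np.abs(X) >= gamma * np.abs(X).max()).astype(float)
    return A
```

With g as the identity, the map simply highlights the top-k activations; in the paper's setting g would invert the network's forward mappings back to pixel space.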
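The quoted setup interleaves two recipes (a small-scale MNIST/CIFAR-10 recipe and a per-optimizer ImageNet recipe). Collecting the reported hyperparameters in one place makes them easier to scan; the values below are transcribed from the quote, while the dictionary and key names are my own shorthand.

```python
# Hyperparameters as reported in the quoted experiment setup.
# Key names are shorthand for this sketch, not terms from the paper.
mnist_cifar_recipe = {
    "lr": 0.01, "epochs": 40, "batch_size": 256,
    "constraint_coeff": 0.1, "weight_decay": 0.01,
    "augmentation": None,  # "No data augmentation strategy is utilized."
}

imagenet_recipes = {  # 300 epochs, 3-split augmentation for all optimizers
    "SGD":   {"lr": 0.05,  "weight_decay": 2e-5,  "lambda": 1e-4, "batch_size": 1024},
    "AdamW": {"lr": 0.001, "weight_decay": 2e-5,  "lambda": 1e-4, "batch_size": 1024},
    "LAMB":  {"lr": 0.005, "weight_decay": 0.002, "lambda": 2e-5, "batch_size": 2048},
}
```

Note the inversion between the LAMB and SGD/AdamW rows: LAMB pairs the larger weight decay (0.002) with the smaller λ (0.00002), and vice versa.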