Improving Calibration through the Relationship with Adversarial Robustness

Authors: Yao Qin, Xuezhi Wang, Alex Beutel, Ed Chi

NeurIPS 2021

Variable: Research Type
Result: Experimental
LLM Response: "In this paper, we study the connection between adversarial robustness and calibration and find that the inputs for which the model is sensitive to small perturbations (i.e., are easily attacked) are more likely to have poorly calibrated predictions. ... We perform experiments on the clean test set across three datasets: CIFAR-10 (Krizhevsky, 2009), CIFAR-100 (Krizhevsky, 2009) and ImageNet (Russakovsky et al., 2015) with different networks, whose architecture and accuracy are shown in Table 1."

Variable: Researcher Affiliation
Result: Industry
LLM Response: "Yao Qin, Xuezhi Wang, Alex Beutel, Ed H. Chi (Google Research); {yaoqin, xuezhiw, alexbeutel, edchi}@google.com"

Variable: Pseudocode
Result: Yes
LLM Response: "Algorithm 1: Training procedure for AR-AdaLS"

Variable: Open Source Code
Result: No
LLM Response: The paper does not provide any statement about releasing source code or a link to a code repository.

Variable: Open Datasets
Result: Yes
LLM Response: "We perform experiments on the clean test set across three datasets: CIFAR-10 (Krizhevsky, 2009), CIFAR-100 (Krizhevsky, 2009) and ImageNet (Russakovsky et al., 2015)..."

Variable: Dataset Splits
Result: Yes
LLM Response: "To find the best hyperparameter ϵ for label smoothing, previous methods (Szegedy et al., 2016; Thulasidasan et al., 2019) sweep ϵ in a range and choose the one that has the best validation performance. ... Specifically, we first rank the adversarial robustness of the validation data and split the validation set into R equally-sized subsets."

Variable: Hardware Specification
Result: No
LLM Response: The paper mentions "computational intensity" for the ImageNet experiments but does not provide any specific details about the hardware used, such as GPU models, CPU types, or cloud computing instances.

Variable: Software Dependencies
Result: No
LLM Response: The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).

Variable: Experiment Setup
Result: Yes
LLM Response: "All the methods are trained with the same network architecture, i.e., WRN-28-10 (Zagoruyko & Komodakis, 2016), on both CIFAR-10 and CIFAR-100, and with the same training hyperparameters (e.g., learning rate, batch size, number of training epochs) for fair comparison. ... Please refer to Appendix A for all the training details and hyperparameters."
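The validation-split procedure quoted under Dataset Splits (rank validation examples by adversarial robustness, then partition into R equally-sized subsets) can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name `split_by_robustness` and the use of a precomputed per-example robustness score are assumptions; the paper defines its own robustness measure.

```python
import numpy as np

def split_by_robustness(robustness_scores, num_subsets):
    """Rank examples by a precomputed adversarial-robustness score
    (hypothetical input; smaller = less robust) and partition their
    indices into `num_subsets` equally-sized subsets.

    Returns a list of index arrays, ordered from least to most robust.
    """
    order = np.argsort(robustness_scores)  # least robust first
    return np.array_split(order, num_subsets)

# Toy usage: 4 validation examples, split into R = 2 subsets.
subsets = split_by_robustness([0.9, 0.1, 0.5, 0.3], num_subsets=2)
```

Each subset could then be assigned its own smoothing strength, in the spirit of the paper's AR-AdaLS training procedure (Algorithm 1).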