Great Minds Think Alike: The Universal Convergence Trend of Input Salience

Authors: Yipei Wang, Jeffrey Siskind, Xiaoqian Wang

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments shed light on the significant implications of our hypotheses in various application domains, including black-box attacks, deep ensembles, etc. These findings not only enhance our understanding of DNN behaviors but also offer valuable insights for their practical application in diverse areas of deep learning. |
| Researcher Affiliation | Academia | Yipei Wang, Jeffrey Mark Siskind, Xiaoqian Wang; Elmore Family School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907; {wang4865,qobi,joywang}@purdue.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is provided as the supplementary material in the submission. The repository will be publicized upon acceptance. |
| Open Datasets | Yes | Here we mainly follow the setups of the benign overfitting (Nakkiran et al., 2021), which also present a comprehensive study of optimized DNNs through CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009). Besides, we also include Tiny ImageNet-200 (Le and Yang, 2015) as a compromise between the computational efficiency and the dataset complexity. |
| Dataset Splits | No | The paper states 'experiments are carried out over the test set X = X_test, Y = Y_test' but does not provide specific train/validation/test dataset splits (percentages, counts, or explicit standard split references for all three parts). |
| Hardware Specification | Yes | They are carried out on Intel(R) Core(TM) i9-9960X CPU @ 3.10GHz with Quadro RTX 6000 GPUs. |
| Software Dependencies | No | The paper mentions using 'stochastic gradient descent (SGD) as the solver' but does not specify key software components with version numbers (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | As for the training process, following Nakkiran et al. (2021), we use stochastic gradient descent (SGD) as the solver, with a batch size of 128. The input data are normalized, but not augmented. We start with the initial learning rate γ0 = 0.1 and update it with γt = γ0/√(1 + t), where t is the epoch. Please refer to Appendix B for more experimental details. |
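
The training recipe quoted in the Experiment Setup row maps onto a standard image-classification training loop. Below is a minimal sketch of that configuration, assuming PyTorch/torchvision (the paper does not name its framework, per the Software Dependencies row); the ResNet-18 backbone, the CIFAR-10 normalization statistics, and the 100-epoch budget are illustrative assumptions, not values taken from the paper.

```python
# Hedged sketch of the quoted setup: SGD, batch size 128, normalized (not
# augmented) inputs, initial learning rate 0.1 decayed as gamma_0 / sqrt(1 + t).
import math

import torch
import torchvision
import torchvision.transforms as T

# Normalized but not augmented inputs (normalization statistics are assumed).
transform = T.Compose([
    T.ToTensor(),
    T.Normalize(mean=(0.4914, 0.4822, 0.4465), std=(0.2470, 0.2435, 0.2616)),
])

train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False,
                                        download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

model = torchvision.models.resnet18(num_classes=10)  # architecture is illustrative
criterion = torch.nn.CrossEntropyLoss()

# SGD with gamma_0 = 0.1; LambdaLR multiplies the initial rate by 1/sqrt(1 + t)
# each epoch, giving gamma_t = gamma_0 / sqrt(1 + t) as in the row above.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda t: 1.0 / math.sqrt(1 + t))

for epoch in range(100):  # epoch budget is an assumption
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

The `test_set` loaded above corresponds to the Dataset Splits row, which notes that the paper's evaluations are carried out over the test set; any validation split would have to be carved out separately, since the paper does not specify one.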