Great Minds Think Alike: The Universal Convergence Trend of Input Salience
Authors: Yipei Wang, Jeffrey Siskind, Xiaoqian Wang
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments shed light on the significant implications of our hypotheses in various application domains, including black-box attacks, deep ensembles, etc. These findings not only enhance our understanding of DNN behaviors but also offer valuable insights for their practical application in diverse areas of deep learning. |
| Researcher Affiliation | Academia | Yipei Wang, Jeffrey Mark Siskind, Xiaoqian Wang; Elmore Family School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907; {wang4865,qobi,joywang}@purdue.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is provided as the supplementary material in the submission. The repository will be publicized upon acceptance. |
| Open Datasets | Yes | Here we mainly follow the setup of benign overfitting (Nakkiran et al., 2021), which also presents a comprehensive study of optimized DNNs on CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009). Besides these, we also include Tiny ImageNet-200 (Le and Yang, 2015) as a compromise between computational efficiency and dataset complexity. |
| Dataset Splits | No | The paper states 'experiments are carried out over the test set X = X_test, Y = Y_test.' but does not provide specific train/validation/test dataset splits (percentages, counts, or explicit standard split references for all three parts). |
| Hardware Specification | Yes | They are carried out on an Intel(R) Core(TM) i9-9960X CPU @ 3.10 GHz with Quadro RTX 6000 GPUs. |
| Software Dependencies | No | The paper mentions using 'stochastic gradient descent (SGD) as the solver' but does not specify key software components with version numbers (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | As for the training process, following Nakkiran et al. (2021), we use stochastic gradient descent (SGD) as the solver, with a batch size of 128. The input data are normalized, but not augmented. We start with the initial learning rate γ_0 = 0.1 and update it with γ_t = γ_0/√(1 + t), where t is the epoch. Please refer to Appendix B for more experimental details. A minimal sketch of this training setup appears below the table. |
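
To make the reported setup concrete, the following is a minimal sketch of the training configuration, assuming PyTorch/torchvision (the paper does not name its framework), a ResNet-18 backbone, standard CIFAR-10 normalization statistics, and a 100-epoch budget. Only the SGD solver, the batch size of 128, normalization without augmentation, and the γ_t = γ_0/√(1 + t) decay come from the paper; everything else is an illustrative assumption.

```python
# Minimal sketch of the reported training setup; not the authors' code.
# Assumptions (not stated in the paper): PyTorch/torchvision, ResNet-18,
# CIFAR-10 normalization statistics, and a 100-epoch budget.
import torch
from torch import nn, optim
from torchvision import datasets, transforms, models

# Inputs are normalized but not augmented, as reported.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),   # CIFAR-10 channel means (assumed)
                         (0.2470, 0.2435, 0.2616)),  # CIFAR-10 channel stds (assumed)
])
train_set = datasets.CIFAR10("data", train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

model = models.resnet18(num_classes=10)            # placeholder architecture (assumed)
optimizer = optim.SGD(model.parameters(), lr=0.1)  # initial learning rate gamma_0 = 0.1

for epoch in range(100):                           # epoch budget is an assumption
    # Reported schedule: gamma_t = gamma_0 / sqrt(1 + t), with t the epoch index.
    for group in optimizer.param_groups:
        group["lr"] = 0.1 / (1.0 + epoch) ** 0.5
    model.train()
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()
```

The schedule is applied by mutating the optimizer's parameter groups once per epoch, which mirrors the per-epoch update γ_t = γ_0/√(1 + t) quoted in the Experiment Setup row.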