Reproducibility Index

Notice: The reproducibility variables underlying each score are classified by an automated LLM-based pipeline that was validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

AdaptGrad: Adaptive Sampling to Reduce Noise

Authors: Linjiang Zhou, Chao Ma, Zepeng Wang, Libing Wu, Xiaochuan Shi

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Comprehensive experiments, both qualitative and quantitative, demonstrate that AdaptGrad could effectively reduce almost all the noise in vanilla gradients compared to baseline methods. AdaptGrad is simple and universal, making it a practical solution to enhance gradient-based interpretability methods to achieve clearer visualization. All code would be found in https://github.com/AiShare-WHU/AdaptGrad.
Researcher Affiliation	Academia	(1) Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China; (2) School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China
Pseudocode	No	The paper describes methods using equations like Equation 10, 11, and 12, but does not present them in a structured pseudocode block or algorithm environment.
Open Source Code	Yes	All code would be found in https://github.com/AiShare-WHU/AdaptGrad. ... All experimental codes and detailed results can be found in the Supplementary Material, and will be released on the public code platform under the anonymous policy. ... We will publicly release all of our code, as well as the methods for obtaining the data, on open platforms github.com.
Open Datasets	Yes	To apply these metrics for comprehensive evaluation, following the experimental setup in [25, 2], we choose MNIST [27] for experiments on Consistency and Invariance, ILSVRC2012 (ImageNet) [26] for experiments on Sparseness and Faithfulness.
Dataset Splits	No	In the quantitative evaluation, due to the time-consuming computation of the experiment, we randomly sampled 1,000 samples instead of all validation or test set samples for the comparison experiments. This also led to difficulties in reporting the statistical significance of our experimental results.
Hardware Specification	Yes	Our experiments were conducted on a server with 4 NVIDIA RTX 4090 and 2 Intel Xeon Gold 6128.
Software Dependencies	No	VGG16, ResNet50, and Inception V3 are constructed by pre-trained models released in Torchvision (footnote 2). The footnote links to "https://pytorch.org/vision/stable/index.html", implying PyTorch, but specific version numbers for PyTorch or Torchvision are not explicitly stated in the text.
Experiment Setup	Yes	All examples and experiments in this article use the settings of N = 50, c = 0.95, α = 0.2. ... The MLP was trained on MNIST by the SGD optimizer with 20 epochs, and the learning rate was set to 0.01. ... The number of Riemann integration samples in the IG was set to 50, and the number of parameter perturbations in the NG was set to 50. The variance of the corresponding perturbation noise was set to 0.2... The values reported in Table 3, Table 4 and Table 5 are the average of 1000 samples tested 5 times. And its hyperparameter, the saliency threshold, was set to [0.05, 0.1, 0.15, 0.2, 0.25, 0.3] according to [23].
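
The sampling protocol quoted in the Dataset Splits and Experiment Setup rows (1,000 randomly sampled test items, evaluated 5 times and averaged) can be sketched as follows. This is a minimal illustration and not the authors' code. The function names, the dictionary keys, the use of a fresh seeded subset for each repetition, and the reported standard deviation are all assumptions, since the quoted text does not say whether the same 1,000 samples were reused across the 5 runs.

```python
import random
import statistics

# Values quoted in the Experiment Setup row; the key names are illustrative.
REPORTED_SETTINGS = {
    "N": 50,
    "c": 0.95,
    "alpha": 0.2,
    "ig_riemann_samples": 50,
    "ng_parameter_perturbations": 50,
    "perturbation_noise_variance": 0.2,
    "saliency_thresholds": [0.05, 0.1, 0.15, 0.2, 0.25, 0.3],
}


def sample_indices(dataset_size, n_samples=1000, seed=0):
    """Draw a reproducible subset of indices without replacement."""
    rng = random.Random(seed)
    return rng.sample(range(dataset_size), n_samples)


def repeated_mean(metric_fn, dataset_size, n_samples=1000, n_repeats=5, base_seed=0):
    """Average a per-sample metric over several seeded random subsets.

    Assumption: each repetition draws a new subset with its own seed.
    Returns the mean of the per-run means and their standard deviation.
    """
    run_means = []
    for run in range(n_repeats):
        indices = sample_indices(dataset_size, n_samples, seed=base_seed + run)
        run_means.append(statistics.fmean(metric_fn(i) for i in indices))
    return statistics.fmean(run_means), statistics.stdev(run_means)
```

Here metric_fn stands in for whatever per-sample score is being reported (a hypothetical callable taking a dataset index). Recording the seeds and the spread across runs is one way to address the difficulty with statistical significance that the Dataset Splits row mentions.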