Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].

Provable Gradient Editing of Deep Neural Networks

Authors: Zhe Tao, Aditya V Thakur

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experimentally evaluated ProGrad by enforcing (i) hard Grad-CAM constraints on IMAGENET ResNet DNNs; (ii) hard Integrated Gradients constraints on Llama 3 and Qwen 3 LLMs; (iii) hard gradient constraints in training a function-approximation DNN as a proxy for safety constraints in control systems and physical invariants in scientific applications. The results highlight the unique capability of ProGrad in enforcing hard constraints on DNN gradients.
Researcher Affiliation | Academia | Zhe Tao, University of California, Davis, Davis, CA 95616, USA; Aditya V. Thakur, University of California, Davis, Davis, CA 95616, USA
Pseudocode | No | The paper describes the methodology and definitions (e.g., Definition 4.4 for conditional variable gradient construction) and outlines theoretical results (Theorems 4.3 and 4.5), but it does not present these in a structured pseudocode or algorithm block format.
Open Source Code | No | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: The experiments in Sections 5.1 and 5.2 use open-access benchmarks. For the implementation of our tool, we could not make the code open access due to ongoing IP restrictions.
Open Datasets | Yes | We evaluate ProGrad by enforcing (i) hard Grad-CAM constraints on IMAGENET ResNet DNNs; (ii) hard Integrated Gradients constraints on Llama 3 and Qwen 3 LLMs; (iii) hard gradient constraints in training a function-approximation DNN, which acts as a proxy for safety constraints in control systems and physical invariants in scientific applications. The results highlight the unique capability of ProGrad in enforcing hard constraints on DNN gradients and gradient-based explanations. ... The edit set consists of misclassified images that have deviated Grad-CAM attributions for their expected class. These images are from the IMAGENET-C dataset [11] ... These sentences are misclassified samples from the Stanford Sentiment Treebank 2 (SST-2) [36] training set ... The ResNet152 DNN is from torchvision [18] with the pre-trained weights ResNet152_Weights.IMAGENET1K_V1 ... The Llama-3.2-1b-Instruct LLM for editing is from https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct; the teacher Llama-3.1-8b-Instruct LLM for computing the expected IG is from https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct. The Qwen3-1.7B LLM for editing is from https://huggingface.co/Qwen/Qwen3-1.7B; the teacher Qwen3-8B LLM for computing the expected IG is from https://huggingface.co/Qwen/Qwen3-8B.
Dataset Splits | Yes | For each DNN, we construct two such edit sets: (i) 100 images from the first 50 classes with ε=1e-2; (ii) 1,000 images from the first 200 classes with ε=5e-2. ... For each LLM, we construct two such edit sets: (i) the first 100 misclassified sentences from SST-2 with ε=5e-2; (ii) the first 200 misclassified sentences from SST-2 with ε=1e-1. ... Over the domain [0, 3π], we uniformly sample 2,048 points as the dataset D.
Hardware Specification | Yes | All experiments were run on a machine with dual Intel Xeon Platinum 8362 Processors, 32-Core 2.8GHz with 1.5 TB of memory, SSD, and NVIDIA H100 GPU with 80 GB of GPU memory running Ubuntu 22.04.
Software Dependencies | No | We have implemented ProGrad in PyTorch [26] and use Gurobi [10] as the LP solver. ... We use trl [42] with a learning rate from {1e-3, 1e-4}, epochs from {1, 2, 3, 4, 5}, bfloat16 precision, and trainable parameters from the language modelling head (lm_head) only or all layers.
Experiment Setup | Yes | Setup details. For the two edit sets, (i) the 100 images with ε=1e-2 edit set consists of the first 100 images from the first 50 classes of the IMAGENET-C dataset with Gaussian noise and severity 1 ... For each gradient-descent-based baseline, we perform a grid search over the learning rate, batch size, number of epochs, and transfer weight hyperparameters, with a time limit of 12 hours, and report the best results with the best constraint satisfaction rate. ... Setup for SFT baselines. We use trl [42] with a learning rate from {1e-3, 1e-4}, epochs from {1, 2, 3, 4, 5}, bfloat16 precision, and trainable parameters from the language modelling head (lm_head) only or all layers. ... Setup for GD baseline. We use the Adam optimizer with learning rate 0.01, and exponential learning rate decay with γ = 0.997. We train the DNN for 300 epochs with a batch size of 64, using the following loss function to learn both the DNN output and the gradient: $L(x) = \mathrm{MSE}(N(x), f(x)) + \mathrm{MSE}\left(\frac{d}{dx} N(x), \frac{d}{dx} f(x)\right)$
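
The GD baseline described in the Experiment Setup row can be sketched in PyTorch roughly as follows. This is an illustrative reading of the quoted setup, not the authors' code (which is not released). The optimizer, decay factor, epoch count, batch size, domain, and sample count come from the quoted text. Everything else is an assumption made for the sketch: the target f(x) = sin(x), the small tanh MLP, random (rather than evenly spaced) uniform sampling, decaying the learning rate once per epoch, and reading the second loss term as the MSE between the network's input gradient and the target's gradient.

```python
import math

import torch
from torch import nn


def target(x):
    # Hypothetical target function; the quoted text does not say what f is.
    return torch.sin(x)


def output_and_gradient_loss(model, x, f=target):
    """MSE on outputs plus MSE on input gradients (assumed reading of the loss)."""
    x = x.clone().requires_grad_(True)
    y_pred = model(x)
    y_true = f(x)
    # Each output depends only on its own input row, so the gradient of the
    # summed outputs gives the per-sample derivative d/dx. create_graph=True
    # keeps the graph so the gradient term can itself be backpropagated.
    (dy_pred,) = torch.autograd.grad(y_pred.sum(), x, create_graph=True)
    (dy_true,) = torch.autograd.grad(y_true.sum(), x)
    mse = nn.functional.mse_loss
    return mse(y_pred, y_true.detach()) + mse(dy_pred, dy_true)


def train(epochs=300, batch_size=64, n_points=2048, seed=0):
    torch.manual_seed(seed)
    # 2,048 points on [0, 3*pi]; random uniform draws are an assumption.
    data = torch.rand(n_points, 1) * 3 * math.pi
    # Hypothetical architecture: the quoted text does not describe the DNN.
    model = nn.Sequential(
        nn.Linear(1, 64), nn.Tanh(),
        nn.Linear(64, 64), nn.Tanh(),
        nn.Linear(64, 1),
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.997)
    history = []
    for _ in range(epochs):
        perm = torch.randperm(n_points)
        total = 0.0
        for start in range(0, n_points, batch_size):
            batch = data[perm[start:start + batch_size]]
            loss = output_and_gradient_loss(model, batch)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            total += loss.item() * batch.shape[0]
        scheduler.step()  # decay once per epoch (assumed granularity)
        history.append(total / n_points)  # mean training loss for this epoch
    return model, history
```

Calling `train()` runs the full 300-epoch configuration; `train(epochs=3)` is enough for a quick smoke test. Such a baseline only penalizes gradient error in the loss, so it gives no guarantee that a gradient constraint holds after training, which is the contrast with hard constraints that the summary draws.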