Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Reparameterization through Spatial Gradient Scaling
Authors: Alexander Detkov, Mohammad Salameh, Muhammad Fetrat, Jialin Zhang, Robin Luwei, Shangling Jui, Di Niu
ICLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on CIFAR-10, CIFAR-100, and ImageNet show that without searching for reparameterized structures, our proposed scaling method outperforms the state-of-the-art reparameterization strategies at a lower computational cost. The code is available at https://github.com/Ascend-Research/Reparameterization. |
| Researcher Affiliation | Collaboration | Alexander Detkov¹, Mohammad Salameh², Muhammad Fetrat Qharabagh¹,*, Jialin Zhang³, Wei Lui², Shangling Jui³, Di Niu¹; ¹University of Alberta, ²Huawei Technologies, ³Huawei Kirin Solutions |
| Pseudocode | Yes | An overview of the SGS framework is given as pseudo-code in Appendix A.5, and details can be found in the corresponding open-source code. |
| Open Source Code | Yes | The code is available at https://github.com/Ascend-Research/Reparameterization. |
| Open Datasets | Yes | Experiments on CIFAR-10, CIFAR-100, and ImageNet show that without searching for reparameterized structures, our proposed scaling method outperforms the state-of-the-art reparameterization strategies at a lower computational cost. |
| Dataset Splits | Yes | We search for k on CIFAR100 and use the optimal for experiments on CIFAR10 and ImageNet. We perform a grid search on CIFAR100 and VGG-16 over k ∈ {2, 3, 4, 5, 6, 7} using 20% of the training set for validation. |
| Hardware Specification | Yes | Training is done on a single NVIDIA Tesla V100 GPU. ... on 8 NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions 'PyTorch defaults' for optimizer settings but does not specify version numbers for PyTorch or any other software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | We train VGG-16 on CIFAR-{10,100} for 600 epochs with a batch size of 128, cosine annealing scheduler with an initial learning rate of 0.1, and SGD optimizer with momentum 0.9 and weight decay 1e-4. We update our spatial gradient scalings every 30 epochs using 20 random batches from the training set. We add a 1 epoch warm-up period at the start of training before generating our gradient scalings. |
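
The schedule quoted in the Experiment Setup row can be sketched in a few lines of plain Python. This is only an illustration, not the authors' code: the function names are hypothetical, the learning rate is assumed to be annealed once per epoch down to a minimum of 0, and the scaling updates are assumed to start right after the 1-epoch warm-up and then recur every 30 epochs. The exact offsets may differ in the released repository.

```python
import math

# Values quoted in the "Experiment Setup" row (CIFAR, VGG-16).
EPOCHS = 600          # total training epochs
BASE_LR = 0.1         # initial learning rate
WARMUP_EPOCHS = 1     # warm-up before the first gradient scalings
UPDATE_EVERY = 30     # epochs between scaling updates


def cosine_lr(epoch, base_lr=BASE_LR, total=EPOCHS, eta_min=0.0):
    """Cosine-annealed learning rate at the start of a 0-indexed epoch.

    Assumption: per-epoch stepping and eta_min = 0.
    """
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / total))


def scaling_update_epochs(total=EPOCHS, warmup=WARMUP_EPOCHS, every=UPDATE_EVERY):
    """Epochs at which the spatial gradient scalings would be regenerated.

    Assumption: the first update happens right after warm-up,
    then once every `every` epochs.
    """
    return list(range(warmup, total, every))


if __name__ == "__main__":
    for epoch in (0, 150, 300, 450, 599):
        print(f"epoch {epoch:3d}: lr = {cosine_lr(epoch):.5f}")
    print("scaling updates at epochs:", scaling_update_epochs()[:5], "...")
```

Under these assumptions the learning rate falls from 0.1 to roughly half that value at epoch 300, and the scalings are regenerated 20 times over the run.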