Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling
Authors: Mingze Wang, Zeping Min, Lei Wu
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate our theoretical findings, we present both synthetic and real-world experiments. Notably, PRGD also shows promise in enhancing the generalization performance when applied to linearly non-separable datasets and deep neural networks. |
| Researcher Affiliation | Academia | (1) School of Mathematical Sciences, Peking University, Beijing, China; (2) Center for Machine Learning Research, Peking University, Beijing, China. |
| Pseudocode | Yes | Algorithm 1 Progressive Rescaling Gradient Descent (PRGD) |
| Open Source Code | No | The paper does not provide a specific link or explicit statement about releasing the source code for the methodology described. |
| Open Datasets | Yes | Specifically, we employ the digit datasets from Sklearn, which are image classification tasks with d = 64, n = 300. ... VGG-16 network (Simonyan & Zisserman, 2015) on the full CIFAR-10 dataset (Krizhevsky & Hinton, 2009) |
| Dataset Splits | No | The paper mentions training and testing datasets, but it does not explicitly specify a validation dataset split or how validation was performed. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'VGG architecture implemented in PyTorch' but does not specify version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | The network was trained using a batch size of 64... We used a base learning rate of 1e-3, a momentum of 0.9, and a weight decay of 5e-4. ... We configured PRGD with Tk = 3000 * 2 + k * 3000 * 3, Rk = min(R0 * 2 + k * 3000 * 0.2, 1000). |
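
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. The snippet below is only an illustration, not the authors' code: the names `TrainConfig` and `prgd_schedule` are hypothetical, the value passed for `r0` is a placeholder (the table does not state R0), and the schedule formulas are evaluated literally as they are printed in the table, so any exponents or grouping lost in extraction are not reconstructed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as quoted in the Experiment Setup row (hypothetical container)."""
    batch_size: int = 64
    base_lr: float = 1e-3
    momentum: float = 0.9
    weight_decay: float = 5e-4


def prgd_schedule(k: int, r0: float) -> tuple[int, float]:
    """Return (T_k, R_k) for phase k, reading the quoted formulas literally.

    Assumption: the expressions are taken exactly as printed in the table;
    r0 is not given there and must be supplied by the caller.
    """
    t_k = 3000 * 2 + k * 3000 * 3
    r_k = min(r0 * 2 + k * 3000 * 0.2, 1000)
    return t_k, r_k


if __name__ == "__main__":
    print(TrainConfig())
    for k in range(3):
        # r0=1.0 is a placeholder chosen only for illustration.
        print(k, prgd_schedule(k, r0=1.0))
```

Under this literal reading, R_k is capped at 1000 by the `min`, so with the placeholder `r0` the cap is already active by the third phase.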