Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling

Authors: Mingze Wang, Zeping Min, Lei Wu

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To validate our theoretical findings, we present both synthetic and real-world experiments. Notably, PRGD also shows promise in enhancing the generalization performance when applied to linearly non-separable datasets and deep neural networks.
Researcher Affiliation | Academia | (1) School of Mathematical Sciences, Peking University, Beijing, China; (2) Center for Machine Learning Research, Peking University, Beijing, China.
Pseudocode | Yes | Algorithm 1: Progressive Rescaling Gradient Descent (PRGD)
Open Source Code | No | The paper does not provide a specific link or explicit statement about releasing the source code for the methodology described.
Open Datasets | Yes | Specifically, we employ the digit datasets from Sklearn, which are image classification tasks with d = 64, n = 300. ... VGG-16 network (Simonyan & Zisserman, 2015) on the full CIFAR-10 dataset (Krizhevsky & Hinton, 2009)
Dataset Splits | No | The paper mentions training and testing datasets, but it does not explicitly specify a validation dataset split or how validation was performed.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions 'VGG architecture implemented in PyTorch' but does not specify version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | The network was trained using a batch size of 64... We used a base learning rate of 1e-3, a momentum of 0.9, and a weight decay of 5e-4. ... We configured PRGD with Tk = 3000 * 2 + k * 3000 * 3, Rk = min(R0 * 2 + k * 3000 * 0.2, 1000).
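To make the Pseudocode and Experiment Setup rows concrete, here is a minimal, hypothetical sketch of a PRGD-style loop. The paper's actual Algorithm 1 is not reproduced here: this sketch assumes PRGD runs ordinary gradient descent and, at the end of each phase k (step T_k), rescales the parameter norm to R_k. The function name `prgd`, the argument `grad_fn`, the phase count, and the literal reading of the quoted schedule formulas are all assumptions, not details confirmed by the excerpt.

```python
# Hypothetical sketch of a PRGD-style loop: plain gradient descent with the
# parameter norm periodically rescaled. Not the paper's Algorithm 1.
import numpy as np

def prgd(grad_fn, w0, num_phases=3, lr=1e-3, R0=1.0):
    """Run GD, rescaling the parameter norm to R_k at the end of phase k."""
    w = np.asarray(w0, dtype=float)
    t = 0
    for k in range(num_phases):
        # Schedule as quoted in the Experiment Setup row, read literally;
        # the quoted formulas may be garbled in the source (assumption).
        T_k = 3000 * 2 + k * 3000 * 3
        R_k = min(R0 * 2 + k * 3000 * 0.2, 1000)
        while t < T_k:
            w -= lr * grad_fn(w)          # ordinary gradient-descent step
            t += 1
        w *= R_k / np.linalg.norm(w)      # progressive norm rescaling
    return w
```

Under this reading, the direction of `w` is shaped by the gradient steps while its norm is driven by the explicit schedule, which is the "progressive norm rescaling" the title refers to.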