Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced

Authors: Simon S. Du, Wei Hu, Jason D. Lee

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In Section 4, we empirically verify the theoretical result in Section 2. We perform experiments to verify the auto-balancing properties of gradient descent in neural networks with ReLU activation. (The balancedness invariant being verified is restated below the table.)
Researcher Affiliation | Academia | Simon S. Du: Machine Learning Department, School of Computer Science, Carnegie Mellon University. Email: ssdu@cs.cmu.edu; Wei Hu: Computer Science Department, Princeton University. Email: huwei@cs.princeton.edu; Jason D. Lee: Department of Data Sciences and Operations, Marshall School of Business, University of Southern California. Email: jasonlee@marshall.usc.edu
Pseudocode | No | The paper describes algorithms but does not include any labeled 'Pseudocode' or 'Algorithm' blocks, nor are the steps formatted in a structured, code-like manner.
Open Source Code | No | The paper does not provide any statement about releasing source code, nor does it include a link to a code repository for the described methodology.
Open Datasets | No | The paper mentions 'Given a training dataset {(x_i, y_i)}_{i=1}^m ⊂ R^d × R^p' and 'We use 1,000 data points', but it does not specify the name of a publicly available dataset, nor does it provide any link, DOI, or formal citation for accessing it.
Dataset Splits | No | The paper refers to a 'training dataset' and '1,000 data points' but does not provide any information about training, validation, or test dataset splits (e.g., percentages, sample counts, or references to standard splits).
Hardware Specification | No | The paper does not provide any hardware details such as GPU/CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper does not list software dependencies or version numbers (e.g., programming languages, libraries, or frameworks) needed to replicate the experiments.
Experiment Setup | Yes | We consider a 3-layer fully connected network of the form f(x) = W3 φ(W2 φ(W1 x)), where x ∈ R^1000 is the input, W1 ∈ R^{100×1000}, W2 ∈ R^{100×100}, W3 ∈ R^{10×100}, and φ(·) is the ReLU activation. We use 1,000 data points and the quadratic loss function, and run GD. We first test a balanced initialization: W1[i,j] ~ N(0, 10^{-4}/100), W2[i,j] ~ N(0, 10^{-4}/10) and W3[i,j] ~ N(0, 10^{-4}). We then test an unbalanced initialization: W1[i,j] ~ N(0, 10^{-4}), W2[i,j] ~ N(0, 10^{-4}) and W3[i,j] ~ N(0, 10^{-4}). After 10,000 iterations we have... and step sizes η_t = 100/((t+1)·||M||_F^{3/2}). (A code sketch of this setup follows the table.)
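
Note on the property under test: the theoretical result verified in Section 4 is the paper's gradient-flow result (Section 2), namely that the differences between squared Frobenius norms of consecutive layers stay invariant throughout training. In LaTeX, for an N-layer homogeneous network with weights W_1, ..., W_N:

    \frac{d}{dt}\left( \|W_{h+1}(t)\|_F^2 - \|W_h(t)\|_F^2 \right) = 0,
    \qquad h = 1, \ldots, N - 1.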
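
Below is a minimal PyTorch sketch, not the authors' code (none was released, per the table), of the quoted neural-network setup. The synthetic inputs and targets, the random seed, and the constant learning rate lr are assumptions; the quoted step-size schedule involving ||M||_F appears to belong to the paper's matrix-factorization experiment, not this network. The layer shapes, initialization variances, quadratic loss, and 10,000 full-batch GD iterations follow the quoted setup.

    # Minimal sketch of the Section 4 neural-network experiment (assumptions noted).
    import torch

    torch.manual_seed(0)                  # assumption: seed not given in the paper
    n, d_in, h, d_out = 1000, 1000, 100, 10

    X = torch.randn(n, d_in)              # 1,000 synthetic data points
    Y = torch.randn(n, d_out)             # synthetic targets (assumption)

    def init(balanced):
        # Entry variances; the balanced choice equalizes E||W_h||_F^2 across layers.
        var = [1e-4 / 100, 1e-4 / 10, 1e-4] if balanced else [1e-4, 1e-4, 1e-4]
        shapes = [(h, d_in), (h, h), (d_out, h)]
        return [(torch.randn(*s) * v ** 0.5).requires_grad_()
                for s, v in zip(shapes, var)]

    def loss(Ws):
        W1, W2, W3 = Ws
        out = torch.relu(torch.relu(X @ W1.T) @ W2.T) @ W3.T  # f(x) = W3 φ(W2 φ(W1 x))
        return 0.5 * ((out - Y) ** 2).mean()                  # quadratic loss

    def run(balanced, steps=10_000, lr=1e-2):                 # lr is an assumed value
        Ws = init(balanced)
        for _ in range(steps):
            grads = torch.autograd.grad(loss(Ws), Ws)
            with torch.no_grad():                             # plain gradient descent
                for W, g in zip(Ws, grads):
                    W -= lr * g
        return [W.norm().item() ** 2 for W in Ws]             # ||W_h||_F^2 per layer

    for balanced in (True, False):
        tag = "balanced  " if balanced else "unbalanced"
        print(tag, "||W_h||_F^2 after GD:", run(balanced))

By the invariant above, the gaps ||W_{h+1}||_F^2 - ||W_h||_F^2 should stay approximately constant over the run in both cases; with the balanced initialization all three squared norms also start, and hence remain, close to one another.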