Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization
Authors: Hancheng Min, Enrique Mallada, René Vidal
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments on the MNIST dataset illustrate our theoretical findings. |
| Researcher Affiliation | Academia | Hancheng Min, University of Pennsylvania (hanchmin@seas.upenn.edu); Enrique Mallada, Johns Hopkins University (mallada@jhu.edu); René Vidal, University of Pennsylvania (vidalr@seas.upenn.edu) |
| Pseudocode | No | The paper does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | Numerical experiments on the MNIST dataset illustrate our theoretical findings. |
| Dataset Splits | No | The paper mentions using the MNIST dataset for numerical experiments but does not provide specific details on training, validation, or test splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU, CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers (e.g., programming languages, libraries, frameworks, or solvers) used for the experiments. |
| Experiment Setup | Yes | We build a two-layer ReLU network with h = 50 neurons and initialize all entries of the weights as [W]_ij ~ i.i.d. N(0, α²), v_j ~ i.i.d. N(0, α²), for i ∈ [n], j ∈ [h], with α = 10⁻⁶. Then we run gradient descent on both W and v with step size η = 2×10⁻³. |
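
The experiment setup quoted above can be sketched in code. The paper does not name its software stack, so the sketch below assumes PyTorch; the input dimension (784 for flattened MNIST), the square loss, and the placeholder data are illustrative assumptions rather than details stated in the paper. Only h = 50, the initialization scale α = 10⁻⁶, and the step size η = 2×10⁻³ come from the quoted setup.

```python
# Minimal sketch of the reported setup, assuming PyTorch (the paper does not
# name its framework). Input dimension, loss, and data are placeholders.
import torch

n, h = 784, 50          # n = input dimension (784 assumed for flattened MNIST images)
alpha = 1e-6            # initialization scale alpha from the paper
eta = 2e-3              # gradient-descent step size eta from the paper

# First-layer weights W (h x n) and second-layer weights v (h,), entries i.i.d. N(0, alpha^2)
W = (alpha * torch.randn(h, n)).requires_grad_(True)
v = (alpha * torch.randn(h)).requires_grad_(True)

def forward(x):
    """Two-layer ReLU network: f(x) = v^T relu(W x), applied row-wise to a batch."""
    return torch.relu(x @ W.T) @ v

def gd_step(x, y):
    """One full-batch gradient-descent step on both W and v (square loss assumed)."""
    loss = 0.5 * ((forward(x) - y) ** 2).mean()
    loss.backward()
    with torch.no_grad():
        W.sub_(eta * W.grad)   # in-place updates keep W and v as leaf tensors
        v.sub_(eta * v.grad)
        W.grad.zero_()
        v.grad.zero_()
    return loss.item()

# Usage with placeholder data (MNIST loading and label encoding omitted):
x = torch.randn(256, n)           # stand-in for flattened, preprocessed MNIST images
y = torch.sign(torch.randn(256))  # stand-in for +/-1 labels of a binary task
for _ in range(1000):
    gd_step(x, y)
```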