Principled Weight Initialization for Hypernetworks
Authors: Oscar Chang, Lampros Flokas, Hod Lipson
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We derive novel weight initialization formulae for hypernetworks in Section 4, empirically evaluate our proposed methods in Section 5, and finally conclude in Section 6. |
| Researcher Affiliation | Academia | Oscar Chang, Lampros Flokas, Hod Lipson; Columbia University, New York, NY 10027; {oscar.chang, lf2540, hod.lipson}@columbia.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., specific repository link, explicit code release statement) for source code related to the methodology described. |
| Open Datasets | Yes | As an illustrative first experiment, we train a feedforward network with five hidden layers (500 hidden units), a hyperbolic tangent activation function, and a softmax output layer, on MNIST across four different settings |
| Dataset Splits | No | The paper specifies training and testing, but does not provide explicit details about dataset validation splits (e.g., percentages, sample counts, or methodology for a separate validation set). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions "PyTorch and Chainer" but does not provide specific version numbers for these or any other ancillary software components. |
| Experiment Setup | Yes | The networks were trained on MNIST for 30 epochs with batch size 10 using a learning rate of 0.0005 for the hypernets and 0.01 for the classical network. |
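The architecture and training details quoted in the table above (five hidden layers of 500 units, tanh activations, softmax output, MNIST, 30 epochs, batch size 10, learning rate 0.01 for the classical network) are enough to reconstruct the classical-network baseline. Below is a minimal PyTorch sketch of that setup; it assumes a flattened 784-dimensional MNIST input, folds the softmax output into `CrossEntropyLoss`, and uses plain SGD, since the excerpt does not name the optimizer. The hypernetwork variants (trained at learning rate 0.0005) and the paper's proposed initialization formulae are not reproduced here.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Classical baseline described in the table: five hidden layers of 500 units
# with tanh activations, followed by an output layer over 10 classes
# (softmax is folded into CrossEntropyLoss below).
layers = []
in_dim = 28 * 28
for _ in range(5):
    layers += [nn.Linear(in_dim, 500), nn.Tanh()]
    in_dim = 500
layers += [nn.Linear(in_dim, 10)]
model = nn.Sequential(*layers)

# Hyperparameters quoted from the paper: 30 epochs, batch size 10,
# learning rate 0.01 for the classical network (0.0005 for hypernets).
# The choice of SGD is an assumption -- the excerpt does not specify it.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

train_data = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())
loader = DataLoader(train_data, batch_size=10, shuffle=True)

for epoch in range(30):
    for images, labels in loader:
        optimizer.zero_grad()
        logits = model(images.view(images.size(0), -1))
        loss = criterion(logits, labels)
        loss.backward()
        optimizer.step()
```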