On the Modularity of Hypernetworks
Authors: Tomer Galanti, Lior Wolf
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 experiments. "In Fig. 2 we observe that the distance between f1 and f2 tends to be significantly smaller than their distances from y." "Synthetic Experiments: We experimented with the following class of target functions." "Experiments on Real-world Datasets: To validate the prediction in Sec. 4.1, we experimented with comparing the ability of hypernetworks and embedding methods of similar complexities in approximating the target function. We experimented with the MNIST [19] and CIFAR10 datasets [17] on two self-supervised learning tasks: predicting image rotations, described below, and image colorization (Sec. 1.3 in the appendix)." |
| Researcher Affiliation | Collaboration | Tomer Galanti, School of Computer Science, Tel Aviv University (tomerga2@tauex.tau.ac.il); Lior Wolf, Facebook AI Research (FAIR) & Tel Aviv University (wolf@fb.com) |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide an explicit statement that the code for the described methodology is open source, nor a link to a code repository. |
| Open Datasets | Yes | "Three input spaces are considered: (i) the CIFAR10 dataset, (ii) the MNIST dataset and (iii) the set [−1, 1]^{28×28}." "We experimented with the MNIST [19] and CIFAR10 datasets [17]." Cited dataset references: Yann LeCun and Corinna Cortes. MNIST handwritten digit database. http://yann.lecun.com/exdb/mnist/, 2010; Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009. |
| Dataset Splits | No | The paper mentions training on 30000 samples and evaluating on test data, but does not specify explicit train/validation/test splits with percentages or counts, nor does it reference predefined splits from cited sources. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions software components and methods like the SGD method, ReLU activation, MSE loss, and negative log loss, but does not provide specific version numbers for any libraries or frameworks used (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | "The training was done using the SGD method with a learning rate µ = 0.01 and momentum γ = 0.5, for 50 epochs." "The samples are divided into batches of size 200 and the optimization is done using the SGD method with a learning rate µ = 0.01." "The networks are trained with the negative log loss for 10 epochs using SGD with a learning rate of µ = 0.01." "primary networks g and q to be neural networks with two layers of dimensions d_in → 10 → 1 and ReLU activation within the hidden layer." |
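
The quoted setup translates into a small, concrete training configuration. Below is a minimal PyTorch sketch of one such run, assuming a two-layer primary network with dimensions d_in → 10 → 1 and a ReLU hidden activation, trained with MSE loss using SGD (learning rate 0.01, momentum 0.5) on batches of 200 samples for 50 epochs; the input dimension, data, and target function are placeholders, not the authors' code.

```python
# Hedged sketch of the reported training setup. Hyperparameters follow the
# quotes above; the data and target function are synthetic placeholders.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

d_in = 28 * 28  # assumed input dimension (e.g., flattened 28x28 images)

# Two-layer primary network: d_in -> 10 -> 1 with a ReLU hidden layer.
primary = nn.Sequential(
    nn.Linear(d_in, 10),
    nn.ReLU(),
    nn.Linear(10, 1),
)

# Placeholder data: 30000 samples in [-1, 1]^{d_in} with a synthetic scalar target.
x = torch.rand(30000, d_in) * 2 - 1
y = x.mean(dim=1, keepdim=True)  # hypothetical target function, for illustration only
loader = DataLoader(TensorDataset(x, y), batch_size=200, shuffle=True)

# SGD with learning rate 0.01 and momentum 0.5, MSE loss, 50 epochs.
opt = torch.optim.SGD(primary.parameters(), lr=0.01, momentum=0.5)
loss_fn = nn.MSELoss()

for epoch in range(50):
    for xb, yb in loader:
        opt.zero_grad()
        loss = loss_fn(primary(xb), yb)
        loss.backward()
        opt.step()
```

For the classification-style experiments quoted above (negative log loss, 10 epochs), the same skeleton would apply with `nn.NLLLoss()` or `nn.CrossEntropyLoss()` in place of the MSE loss and class labels as targets.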