On The Specialization of Neural Modules
Authors: Devon Jarvis, Richard Klein, Benjamin Rosman, Andrew M Saxe
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we confirm that the theoretical results in our tractable setting generalize to more complex datasets and non-linear architectures. |
| Researcher Affiliation | Academia | ¹School of Computer Science and Applied Mathematics, University of the Witwatersrand; ²Gatsby Computational Neuroscience Unit & Sainsbury Wellcome Centre, UCL; ³CIFAR Azrieli Global Scholar, CIFAR. {devon.jarvis,richard.klein,benjamin.rosman1}@wits.ac.za, a.saxe@ucl.ac.uk |
| Pseudocode | No | The paper contains mathematical derivations and equations but no explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Full code for reproducing all figures can be found at: https://github.com/raillab/specialization_of_neural_modules. |
| Open Datasets | Yes | To evaluate how well our results generalize to non-linear networks and more complex datasets, in this section we train a deep Convolutional Neural Network (CNN) to learn a compositional variant of MNIST (CMNIST) shown in Figure 4a. |
| Dataset Splits | No | The paper mentions training and testing sets, but does not explicitly describe a validation set or its split. For example, in Section 7, it discusses "normalized training loss (b) and test loss (c)". |
| Hardware Specification | No | The paper states, "All experiments are run using the Jax library (Bradbury et al., 2018)," which indicates software used but provides no specific hardware details like GPU/CPU models. |
| Software Dependencies | No | The paper mentions the "Jax library (Bradbury et al., 2018)" and "Python+NumPy programs" but does not specify version numbers for these software components. |
| Experiment Setup | Yes | Table 3 lists the hyper-parameters used for the CMNIST experiments: step size 2e-3, batch size 16, initialization variance 0.01 (a minimal JAX sketch using these values follows the table). |
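
For orientation, the sketch below shows how the reported hyper-parameters (step size 2e-3, batch size 16, initialization variance 0.01) might be wired together in JAX, the library the paper states it uses. The two-layer network, plain SGD update, squared-error loss, and random stand-in batch are illustrative assumptions and are not taken from the paper's released code at https://github.com/raillab/specialization_of_neural_modules.

```python
# Minimal JAX sketch using the Table 3 hyper-parameters.
# Architecture, loss, and data are placeholders, not the paper's setup.
import jax
import jax.numpy as jnp

STEP_SIZE = 2e-3       # "Step Size" from Table 3
BATCH_SIZE = 16        # "Batch Size" from Table 3
INIT_VARIANCE = 0.01   # "Initialization Variance" from Table 3

def init_params(key, in_dim=784, hidden_dim=128, out_dim=10):
    # Gaussian initialization with the reported variance (std = sqrt(variance)).
    k1, k2 = jax.random.split(key)
    std = jnp.sqrt(INIT_VARIANCE)
    return {
        "W1": std * jax.random.normal(k1, (in_dim, hidden_dim)),
        "W2": std * jax.random.normal(k2, (hidden_dim, out_dim)),
    }

def forward(params, x):
    h = jax.nn.relu(x @ params["W1"])
    return h @ params["W2"]

def loss_fn(params, x, y):
    # Squared error as a placeholder objective.
    return jnp.mean((forward(params, x) - y) ** 2)

@jax.jit
def sgd_step(params, x, y):
    # One plain gradient-descent update with the reported step size.
    grads = jax.grad(loss_fn)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - STEP_SIZE * g, params, grads)

key = jax.random.PRNGKey(0)
params = init_params(key)
x = jax.random.normal(key, (BATCH_SIZE, 784))          # stand-in for a CMNIST batch
y = jax.nn.one_hot(jnp.arange(BATCH_SIZE) % 10, 10)    # stand-in labels
params = sgd_step(params, x, y)
```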