Neurocoder: General-Purpose Computation Using Stored Neural Programs

Authors: Hung Le, Svetha Venkatesh

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To demonstrate the flexibility of Neurocoder framework, we consider different learning paradigms: instance-based, sequential, multi-task and continual learning. We do not focus on breaking performance records by augmenting state-of-the-art models with Neurocoder. Instead, our inquiry is on re-coding feed-forward layers with the Neurocoder's programs and testing on varied data types to demonstrate its intrinsic properties, showing consistent improvement over standard backbones and methods. Our contributions are: (i) we provide a novel and efficient way to store programs/weights of the neural networks in an external memory, (ii) thanks to our general design of program memory, we can equip current neural networks with a new capability of conditional and modular computing, and (iii) we conduct experiments on various tasks, confirming the general-purpose property of our model. (An illustrative sketch of the program-memory idea is given after this table.)
Researcher Affiliation | Academia | Hung Le, Svetha Venkatesh. Applied AI Institute, Deakin University, Geelong, Australia. Correspondence to: Hung Le <thai.le@deakin.edu.au>.
Pseudocode | No | The paper includes diagrams and high-level descriptions of processes (e.g., Figure 4, 'Active program coding'), but it does not contain a formally labeled 'Pseudocode' or 'Algorithm' block with structured steps.
Open Source Code | No | The paper does not contain any explicit statements about releasing source code for its methodology or links to a code repository.
Open Datasets | Yes | We tested Neurocoder on instance-based learning through classical image classification tasks using MNIST (Le Cun et al., 1998) and CIFAR (Krizhevsky et al., 2009) datasets. ... We used the standard training and testing set of MNIST dataset. ... We used the standard training and testing sets of CIFAR datasets.
Dataset Splits | No | The paper mentions 'standard training and testing sets' for datasets like MNIST and CIFAR but does not provide specific details on how validation splits were generated (e.g., percentages, counts, or explicit references to standard validation splits).
Hardware Specification | Yes | We trained all the models using single GPU NVIDIA V100-SXM2.
Software Dependencies | No | The paper mentions software components like 'A3C agent' and 'LSTM' but does not provide specific version numbers for these or other underlying software dependencies (e.g., Python, PyTorch, TensorFlow versions) required for reproduction.
Experiment Setup | Yes | For most experiments, we use Adam optimiser with a batch size of 128. ... We controlled the number of parameters of Neurocoder, which included parameters for the Program Memory and the Program Controller, by reducing the input dimension using random projection z_t = x_t U with U ∈ R^{768×200} initialised randomly and fixed during the training. ... We trained the models with RMSProp optimiser with learning rate of 10^-4 and batch size of 64 to minimise the cross-entropy loss of the ground truth output and the predicted one. ... We report details of best hyper-parameters and model size for each task in Tables 8 and 9, respectively. (A minimal configuration sketch follows this table.)
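The Research Type row quotes the paper's central idea: storing programs/weights in an external memory so that a network gains conditional, modular computation. The block below is a toy illustration of that general idea only, not the authors' Neurocoder implementation (no code release is reported); the framework (PyTorch), all names, shapes, and the attention-based mixing scheme are assumptions made for the sketch.

```python
# Toy sketch (assumed PyTorch; not the paper's exact Neurocoder): a layer whose
# weights are composed on the fly from a small external "program memory".
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProgramMemoryLayer(nn.Module):
    def __init__(self, in_dim, out_dim, num_programs=8, key_dim=32):
        super().__init__()
        # Memory of candidate weight matrices ("programs") and their keys.
        self.programs = nn.Parameter(torch.randn(num_programs, in_dim, out_dim) * 0.02)
        self.keys = nn.Parameter(torch.randn(num_programs, key_dim) * 0.02)
        # Controller maps the input to a query over the program memory.
        self.controller = nn.Linear(in_dim, key_dim)

    def forward(self, x):                               # x: (batch, in_dim)
        q = self.controller(x)                          # (batch, key_dim)
        attn = F.softmax(q @ self.keys.t(), dim=-1)     # (batch, num_programs)
        # Input-conditioned weights: convex combination of stored programs.
        W = torch.einsum('bp,pio->bio', attn, self.programs)
        return torch.einsum('bi,bio->bo', x, W)         # (batch, out_dim)

layer = ProgramMemoryLayer(in_dim=200, out_dim=10)
out = layer(torch.randn(4, 200))                        # shape (4, 10)
```

Each input thus selects its own effective weight matrix, which is the kind of conditional, per-instance computation the quoted contribution describes at a high level.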
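Because the paper names no framework or versions (see Software Dependencies above), the following is a minimal sketch, assuming a PyTorch-style setup, of the reported preprocessing and optimisation choices: a fixed random projection z_t = x_t U with U ∈ R^{768×200}, and RMSProp with learning rate 1e-4 and batch size 64 minimising cross-entropy. The backbone dimensions and dummy data are placeholders, not values from the paper.

```python
# Minimal configuration sketch under the assumptions stated above.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Fixed random projection z_t = x_t U, U in R^{768x200}, not trained.
U = torch.randn(768, 200)
U.requires_grad_(False)

# Placeholder backbone; the paper's actual architecture differs.
model = nn.Sequential(nn.Linear(200, 256), nn.ReLU(), nn.Linear(256, 10))

# Reported settings for the quoted setup: RMSProp, lr 1e-4, batch size 64,
# cross-entropy loss (Adam with batch size 128 is reported for most experiments).
optimiser = torch.optim.RMSprop(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

x = torch.randn(64, 768)            # dummy batch of 64 raw inputs
y = torch.randint(0, 10, (64,))     # dummy labels

z = x @ U                           # projected inputs, dimension 200
optimiser.zero_grad()
loss = criterion(model(z), y)
loss.backward()
optimiser.step()
```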