Neural Status Registers
Authors: Lukas Faber, Roger Wattenhofer
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally validate the NSR in various settings. We can combine the NSR with other neural models to solve interesting problems such as piecewise-defined arithmetic, comparison of digit images, recurrent problems, or finding shortest paths in graphs. The NSR outperforms all baseline architectures, especially when it comes to extrapolating to larger numbers. (Section 4: Experiments) |
| Researcher Affiliation | Academia | Lukas Faber, Roger Wattenhofer (ETH Zurich, Switzerland). |
| Pseudocode | No | The paper describes the model mathematically and through text but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code for all experiments is available at https://github.com/lukasjf/neural_status_register |
| Open Datasets | Yes | We use a standard convolutional neural network to predict the digit in the image. The NSR takes two such predictions and learns a comparison of the numbers. The entire architecture (CNN+NSR) is trained from scratch end-to-end. We create the data from normal MNIST batches. |
| Dataset Splits | No | The paper defines training and testing sets for its experiments (e.g., Section 4.1: "We sample integers from [-10, 9]... After training, we test models in an extrapolation setting"), but it does not explicitly provide details about a distinct validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions PyTorch as its base framework in Section 4.3 and Adam as its optimizer but does not provide version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | We supervise, in this and all subsequent experiments, with the mean absolute error and use the Adam (Kingma & Ba, 2015) optimizer with default settings. All results are averaged over 10 different seeds. We train for 50000 epochs and then test around each pivot element separately. |
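The setup quoted above (mean-absolute-error supervision, Adam with default hyperparameters, results averaged over seeds) can be illustrated with a minimal sketch. This is not the authors' code: the toy linear regression task, the hand-rolled NumPy Adam, and the step count are assumptions chosen only to make the loop self-contained; the paper's actual models are trained in PyTorch.

```python
import numpy as np

def adam_mae_train(x, y, steps=5000, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """Fit y ~ w*x + b by minimizing mean absolute error with Adam
    at its default settings (lr=1e-3, betas=(0.9, 0.999), eps=1e-8).
    Toy stand-in for the paper's training loop, not the NSR itself."""
    rng = np.random.default_rng(0)          # one seed; the paper averages over 10
    params = rng.normal(size=2)             # [w, b]
    m = np.zeros(2)                         # first-moment estimate
    v = np.zeros(2)                         # second-moment estimate
    for t in range(1, steps + 1):
        w, b = params
        resid = w * x + b - y
        sign = np.sign(resid)
        # Gradient of mean(|resid|) w.r.t. [w, b]
        grad = np.array([np.mean(sign * x), np.mean(sign)])
        m = b1 * m + (1 - b1) * grad
        v = b2 * v + (1 - b2) * grad ** 2
        m_hat = m / (1 - b1 ** t)           # bias correction
        v_hat = v / (1 - b2 ** t)
        params = params - lr * m_hat / (np.sqrt(v_hat) + eps)
    return params

x = np.linspace(-10, 9, 20)                 # training range from Section 4.1
y = 2.0 * x + 1.0                           # assumed toy target
w, b = adam_mae_train(x, y)
```

On this toy task the loop recovers the slope and intercept to within a few hundredths, which is enough to show the ingredients the row names: MAE supervision, Adam with unmodified defaults, and a fixed seed per run.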