Training Neural Machines with Trace-Based Supervision
Authors: Matthew Mirman, Dimitar Dimitrov, Pavle Djordjevic, Timon Gehr, Martin Vechev
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We performed a detailed experimental evaluation with NTM and NRAM machines, showing that additional supervision on the interpretable portions of these architectures leads to better convergence and generalization capabilities of the learning phase than standard training, in both noise-free and noisy scenarios. |
| Researcher Affiliation | Academia | 1Department of Computer Science, ETH Zurich, Switzerland. |
| Pseudocode | No | The paper describes machine structures and equations but does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | All of the code, tasks and experiments are available at: https://github.com/eth-sri/ncm |
| Open Datasets | No | The paper refers to 'algorithmic tasks (mostly from the NTM and NRAM papers)' such as 'Flip3rd', 'Swap', and 'Merge', but does not provide concrete access information (link, DOI, or specific citation with authors/year for a dataset) for the data used in these tasks. |
| Dataset Splits | No | The paper mentions training on examples of size n and testing on examples of size 1.5n and 2n, and states 'A maximum of 10000 samples were used for the DNGPU and 5000 for the NRAM', but it specifies no explicit train/validation/test split percentages or sample counts and never mentions a validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper mentions 'The DNGPU was run out of the box from the code supplied by the authors' but does not provide specific version numbers for any software dependencies, libraries, or frameworks used. |
| Experiment Setup | Yes | The different supervision types are shown vertically, while the proportion of examples that receive extra subtrace supervision (density) and the extra loss term weight (λ) are shown horizontally. The best results in this case are for the read/corner type of hints on 1/2 or 1/10 of the examples, with λ ∈ {0.1, 1}; a sketch of this combined objective follows the table. |
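To make the setup above concrete, here is a minimal, hypothetical Python sketch of the objective the table describes: the standard task loss augmented with a λ-weighted loss on the machine's interpretable trace (e.g., NTM read-head positions), applied only to the fraction of examples (the density) that carry hints. `ToyMachine`, its trace head, and all names below are illustrative assumptions, not code from the authors' eth-sri/ncm repository.

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-in for a neural machine whose forward pass exposes an
# interpretable trace (e.g., a distribution over memory positions, as an
# NTM read head does). Purely illustrative; not the paper's architecture.
class ToyMachine(torch.nn.Module):
    def __init__(self, vocab=8, hidden=32, mem_slots=4):
        super().__init__()
        self.body = torch.nn.Linear(vocab, hidden)
        self.out = torch.nn.Linear(hidden, vocab)      # task prediction
        self.ptr = torch.nn.Linear(hidden, mem_slots)  # interpretable trace

    def forward(self, x):
        h = torch.relu(self.body(x))
        return self.out(h), F.softmax(self.ptr(h), dim=-1)

def loss_with_subtrace_hints(model, x, y, hint, lam=0.1):
    """Task loss plus a lam-weighted trace loss on hinted examples.

    `hint` is None for the (1 - density) fraction of examples that get no
    extra supervision, so those fall back to standard training.
    """
    pred, trace = model(x)
    loss = F.cross_entropy(pred, y)
    if hint is not None:
        loss = loss + lam * F.mse_loss(trace, hint)
    return loss

# Example usage on a randomly generated hinted batch:
machine = ToyMachine()
x = torch.randn(16, 8)
y = torch.randint(0, 8, (16,))
hint = F.one_hot(torch.randint(0, 4, (16,)), 4).float()
loss = loss_with_subtrace_hints(machine, x, y, hint, lam=0.1)
loss.backward()
```

Under this reading, the best-performing configuration reported above (read/corner hints on 1/2 or 1/10 of the examples, λ ∈ {0.1, 1}) corresponds to supplying `hint` for that fraction of batches and setting `lam` accordingly, while all remaining batches pass `hint=None`.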