Training Neural Machines with Trace-Based Supervision

Authors: Matthew Mirman, Dimitar Dimitrov, Pavle Djordjevic, Timon Gehr, Martin Vechev

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We performed a detailed experimental evaluation with NTM and NRAM machines, showing that additional supervision on the interpretable portions of these architectures leads to better convergence and generalization of the learning phase than standard training, in both noise-free and noisy scenarios.
Researcher Affiliation | Academia | Department of Computer Science, ETH Zurich, Switzerland.
Pseudocode | No | The paper describes machine structures and equations but does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | All of the code, tasks and experiments are available at: https://github.com/eth-sri/ncm
Open Datasets | No | The paper refers to 'algorithmic tasks (mostly from the NTM and NRAM papers)' such as 'Flip3rd', 'Swap', and 'Merge', but does not provide concrete access information (link, DOI, or specific citation with authors/year) for the data used in these tasks.
Dataset Splits | No | The paper mentions training on examples of size n and testing on examples of size 1.5n and 2n, and states 'A maximum of 10000 samples were used for the DNGPU and 5000 for the NRAM', but it does not specify explicit percentages or sample counts for training, validation, and test splits, nor does it mention a validation set.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper states 'The DNGPU was run out of the box from the code supplied by the authors' but does not provide version numbers for any software dependencies, libraries, or frameworks used.
Experiment Setup | Yes | The different supervision types are shown vertically, while the proportion of examples that receive extra subtrace supervision (density) and the extra loss term weight (λ) are shown horizontally. The best results in this case are for the read/corner type of hints for 1/2 or 1/10 of the examples, with λ ∈ {0.1, 1}.
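As a rough illustration of the setup described in the Experiment Setup row, the sketch below combines a standard task loss with an extra subtrace-supervision term weighted by λ and applied only to a `density` fraction of the training examples. It is a minimal sketch under assumed interfaces: the names (`combined_loss`, `trace`, `trace_targets`, the tensor shapes) are hypothetical and are not taken from the paper or the eth-sri/ncm code.

```python
import torch
import torch.nn.functional as F

def combined_loss(outputs, targets, trace, trace_targets, lam=0.1, density=0.5):
    """Hypothetical sketch: task loss plus trace-based supervision.

    outputs, targets       : task prediction and ground truth (batch-first)
    trace, trace_targets   : the machine's interpretable execution trace
                             (e.g. read/write head positions) and the hinted
                             subtrace values for it
    lam (λ)                : weight of the extra supervision term
    density                : fraction of examples that receive the extra term
    """
    # Standard end-to-end task loss.
    task_loss = F.cross_entropy(outputs, targets)

    # Per-example error against the hinted subtrace (squared distance).
    per_example_trace_loss = ((trace - trace_targets) ** 2).flatten(1).mean(dim=1)

    # Only a `density` fraction of the examples actually gets the hint loss.
    mask = (torch.rand(per_example_trace_loss.shape[0]) < density).float()
    trace_loss = (mask * per_example_trace_loss).mean()

    return task_loss + lam * trace_loss
```

In this reading, the grid described in the row corresponds to sweeping the supervision type (which part of the trace is hinted) against density ∈ {1/2, 1/10, ...} and λ ∈ {0.1, 1}.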