Neural Logic Machines

Authors: Honghua Dong, Jiayuan Mao, Tian Lin, Chong Wang, Lihong Li, Denny Zhou

ICLR 2019

Reproducibility assessment. Each entry below gives the variable, the result, and the LLM response:
Research Type: Experimental. 'In our experiments, NLMs achieve perfect generalization in a number of tasks, from relational reasoning tasks on the family tree and general graphs, to decision making tasks including sorting arrays, finding shortest paths, and playing the blocks world. Most of these tasks are hard to accomplish for neural networks or inductive logic programming alone.' 'In this section, we show that NLM can solve a broad set of tasks, ranging from relational reasoning to decision making. Furthermore, we show that NLM trained using small-sized instances can generalize to large-sized instances. In the experiments, Softmax-Cross-Entropy loss is used for supervised learning tasks, and REINFORCE (Williams, 1992) is used for reinforcement learning tasks.'
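The quoted setup names the two training objectives without spelling them out. As a minimal TensorFlow 2 sketch (the paper's released code is TF1-era; the function and tensor names here are assumptions, not the authors'), the two losses could look like this:

```python
import tensorflow as tf

def supervised_loss(logits, labels):
    # Softmax-Cross-Entropy, as quoted for the supervised tasks.
    # `logits` and one-hot `labels` share shape [batch, num_classes].
    return tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))

def reinforce_loss(log_probs, returns):
    # REINFORCE (Williams, 1992): negative log-probability of each
    # taken action, weighted by the episode return.
    return -tf.reduce_mean(log_probs * returns)
```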
Researcher Affiliation: Collaboration. Honghua Dong (1), Jiayuan Mao (1), Tian Lin (2), Chong Wang (3), Lihong Li (2), and Denny Zhou (2); (1) ITCS, IIIS, Tsinghua University, {dhh14, mjy14}@mails.tsinghua.edu.cn; (2) Google Inc., {tianlin,lihong,dennyzhou}@google.com; (3) ByteDance Inc., chong.wang@bytedance.com
Pseudocode: Yes. The paper presents Algorithm 1, 'Curriculum learning guided by exams and fails.' Appendix F of the supplementary material additionally provides Python code labeled 'IMPLEMENT NLM IN TENSORFLOW', including functions such as 'neural_logic_layer_breath3', which serves as a minimal implementation example.
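Algorithm 1 itself appears only in the paper. The following plain-Python sketch shows one plausible reading of an exams-and-fails curriculum loop, folding in the Appendix A.2 detail that exams sample from the 3 most recent lessons; all function names, the pass threshold, and the sample size are hypothetical, not taken from the paper:

```python
import random

def train_one_lesson(model, examples):
    """Hypothetical stub: one round of gradient updates on `examples`."""

def exam(model, examples):
    """Hypothetical stub: evaluate `model`; return (accuracy, failed cases)."""
    return 1.0, []

def curriculum_train(model, lessons, pass_threshold=0.95, exam_size=10):
    """Advance through lessons (e.g., increasing problem sizes), gating
    progress on exams and re-training on recorded failure cases."""
    fails = []  # failure cases accumulated across lessons
    for i, lesson in enumerate(lessons):
        while True:
            # Train on the current lesson plus previously failed cases.
            train_one_lesson(model, lesson + fails)
            # Exam: sample from the 3 most recent lessons (Appendix A.2).
            recent = [ex for l in lessons[max(0, i - 2):i + 1] for ex in l]
            exam_set = random.sample(recent, min(exam_size, len(recent)))
            accuracy, new_fails = exam(model, exam_set)
            fails.extend(new_fails)  # keep fails for future re-training
            if accuracy >= pass_threshold:
                break  # passed the exam: advance to the next lesson
    return model
```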
Open Source Code: Yes. Project page: https://sites.google.com/view/neural-logic-machines (footnote 1). Additionally, Appendix A.4 states: 'more details and specific parameters used to generate the data could be found in our open source code.'
Open Datasets: No. The paper generates its own training data for tasks such as Family Tree, General Graph, Sorting, and Blocks World: 'We use random generation to generate training and testing data.' (Appendix A.4). While it refers to classic problems, it provides no direct links, DOIs, repository names, or formal citations for the specific dataset instances used in the experiments; the data is produced by the authors' code rather than published or cited as a standalone public dataset.
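The actual generators live in the authors' released code. Purely to illustrate what 'random generation' can mean for two of these tasks, here is a hypothetical sketch; the distributions, sizes, and parameters are assumptions, not the paper's:

```python
import random

def random_sorting_instance(n):
    # A random permutation of 0..n-1; the label is its sorted order.
    arr = list(range(n))
    random.shuffle(arr)
    return arr, sorted(arr)

def random_graph_instance(n, p=0.3):
    # An undirected random graph as an n x n adjacency matrix,
    # usable for shortest-path style tasks.
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if random.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj
```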
Dataset Splits: No. The paper describes training and testing on instances of particular sizes (e.g., 'trained on instances of size 20 and tested on instances of size 20 and 100'), and its curriculum procedure specifies that 'The evaluation process (exam) randomly samples examples from 3 recent lessons' (Appendix A.2), but it defines no distinct validation split separate from the training and testing sets, which reproducibility would require.
Hardware Specification: No. The paper does not specify hardware such as GPU models, CPU types, or cloud computing instance details; it mentions training on a GPU only in passing, without naming a specific model.
Software Dependencies: No. The code snippet in Appendix F imports 'tensorflow as tf', but the paper gives no version numbers for TensorFlow or any other software dependency, which are crucial for reproducibility.
Experiment Setup: Yes. Appendix A ('Training Method and Curriculum Learning') and Appendix B ('Implementation Details and Hyper-Parameters') provide extensive details on the experimental setup, including: 'We optimize both NLM and MemNN with Adam (Kingma & Ba, 2015) and use a learning rate of α = 0.005.', 'For all supervised learning tasks... we use Softmax-Cross-Entropy as loss function and a training batch size of 4.', and 'For reinforcement learning tasks... we use REINFORCE algorithm...', as well as Tables 3 and 4, which list hyper-parameters such as Depth, Breadth, Step Limit, β_init, Ω, Epochs, and Hidden_dim.
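Those quoted settings map directly onto a few lines of configuration. A TensorFlow 2 sketch follows; only the learning rate and batch size come from the quotes above, while the model and training-step structure are assumed (the paper's own code predates TF2):

```python
import tensorflow as tf

LEARNING_RATE = 0.005  # Adam learning rate quoted from Appendix B
BATCH_SIZE = 4         # supervised-task batch size quoted from Appendix B

optimizer = tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE)

def train_step(model, inputs, labels):
    # One supervised update with Softmax-Cross-Entropy, per the setup above.
    with tf.GradientTape() as tape:
        logits = model(inputs)
        loss = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(labels=labels,
                                                    logits=logits))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```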