Neural-Symbolic Recursive Machine for Systematic Generalization
Authors: Qing Li, Yixin Zhu, Yitao Liang, Ying Nian Wu, Song-Chun Zhu, Siyuan Huang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate NSR's efficacy across four challenging benchmarks designed to probe systematic generalization capabilities: SCAN for semantic parsing, PCFG for string manipulation, HINT for arithmetic reasoning, and a compositional machine translation task. The results affirm NSR's superiority over contemporary neural and hybrid models in terms of generalization and transferability. |
| Researcher Affiliation | Collaboration | Qing Li (1), Yixin Zhu (3), Yitao Liang (1,3), Ying Nian Wu (2), Song-Chun Zhu (1,3), Siyuan Huang (1); (1) National Key Laboratory of General Artificial Intelligence, BIGAI; (2) Department of Statistics, UCLA; (3) Institute for Artificial Intelligence, Peking University |
| Pseudocode | Yes | Algorithm A1: Learning by Deduction-Abduction (a conceptual sketch of this training scheme follows the table) |
| Open Source Code | No | The paper includes a project page link (https://liqing-ustc.github.io/NSR) and mentions adapting Dream Coder (Ellis et al., 2021) with a linked GitHub repository (https://github.com/ellisk42/ec) in footnote 1, but it does not contain an unambiguous statement that the authors' own source code for the methodology described in this paper is openly released. |
| Open Datasets | Yes | Our evaluation of NSR's capabilities in systematic generalization extends across three distinct benchmarks: (i) SCAN (Lake and Baroni, 2018), (ii) PCFG (Hupkes et al., 2020), and (iii) HINT (Li et al., 2023b). Furthermore, we explore NSR's performance on a compositional machine translation task (Lake and Baroni, 2018). |
| Dataset Splits | Yes | Following established studies (Lake, 2019; Gordon et al., 2019; Chen et al., 2020), we assess NSR using four splits: (i) SIMPLE, where the dataset is randomly divided into training and test sets; (ii) LENGTH, with training on output sequences of up to 22 actions and testing on sequences of 24 to 48 actions; (iii) JUMP, with training excluding the "jump" command mixed with other primitives and testing on combinations including "jump"; and (iv) AROUND RIGHT, excluding "around right" from training but testing on combinations derived from the separately trained "around" and "right". (A split-construction sketch follows the table.) |
| Hardware Specification | Yes | All training can be done using a single NVIDIA GeForce RTX 3090 Ti in under 24 hours. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer (Kingma and Ba, 2015)' and adapting 'DreamCoder (Ellis et al., 2021)' but does not provide version numbers for software dependencies such as programming languages or libraries (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | In the dependency parser, the token embeddings have a dimension of 50, the hidden dimension of the transition classifier is 200, and we use a dropout of 0.5. For the program induction, we adopt the default setting in DreamCoder (Ellis et al., 2021). For learning NSR, both the ResNet-18 and the dependency parser are trained by the Adam optimizer (Kingma and Ba, 2015) with a learning rate of 10^-4. NSR is trained for 100 epochs for all datasets. (A configuration sketch follows the table.) |
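
The Pseudocode row above points to Algorithm A1, Learning by Deduction-Abduction. The snippet below is a minimal conceptual sketch of that style of weakly supervised training, not the authors' algorithm: the callables `deduce`, `execute`, `revisions`, and `update`, and the revision-search strategy, are assumptions introduced for illustration.

```python
from itertools import islice
from typing import Any, Callable, Iterable, Optional

def deduction_abduction_step(
    deduce: Callable[[Any], Any],               # greedy deduction: input -> candidate symbolic solution
    execute: Callable[[Any], Any],              # runs a solution to obtain its final output
    revisions: Callable[[Any], Iterable[Any]],  # proposes local edits of a solution (abduction search space)
    update: Callable[[Any, Any], None],         # supervised update on (input, pseudo-labelled solution)
    x: Any,
    y_true: Any,
    max_revisions: int = 100,
) -> Optional[Any]:
    """One weakly supervised step: deduce a latent solution for x and, if its
    execution disagrees with y_true, search nearby revisions (abduction) for a
    solution that agrees; the found solution serves as pseudo-supervision."""
    solution = deduce(x)
    if execute(solution) != y_true:
        candidates = islice(revisions(solution), max_revisions)
        solution = next((s for s in candidates if execute(s) == y_true), None)
    if solution is not None:
        update(x, solution)  # train the neural modules on the abduced solution
    return solution
```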
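
The Dataset Splits row describes the four SCAN splits used for evaluation. The sketch below illustrates how the LENGTH and JUMP splits can be constructed by filtering (command, action-sequence) pairs; the length thresholds follow the quoted protocol, but the data format and the exact filtering code used by the authors are assumptions.

```python
def split_by_length(pairs, train_max=22, test_min=24, test_max=48):
    """LENGTH split: train on action sequences of at most `train_max` actions,
    test on sequences of `test_min`..`test_max` actions."""
    train = [(cmd, act) for cmd, act in pairs if len(act.split()) <= train_max]
    test = [(cmd, act) for cmd, act in pairs
            if test_min <= len(act.split()) <= test_max]
    return train, test

def split_jump(pairs):
    """JUMP split: the bare 'jump' command stays in training; any command that
    composes 'jump' with other primitives or modifiers is held out for testing."""
    train, test = [], []
    for cmd, act in pairs:
        words = cmd.split()
        if "jump" in words and words != ["jump"]:
            test.append((cmd, act))
        else:
            train.append((cmd, act))
    return train, test

# `pairs` is assumed to be a list of (command, action_sequence) string tuples,
# e.g. [("jump twice", "JUMP JUMP"), ...], loaded from the SCAN release.
```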
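
The Experiment Setup row reports the parser hyperparameters and optimization settings. Below is a hedged PyTorch sketch of that configuration; the `TransitionClassifier` class, the vocabulary and context sizes, and the training-loop skeleton are illustrative stand-ins, not the released implementation.

```python
import torch
import torch.nn as nn

class TransitionClassifier(nn.Module):
    """Stand-in for the dependency parser's transition classifier:
    50-d token embeddings, a 200-d hidden layer, and dropout 0.5."""
    def __init__(self, vocab_size=1000, context=8, num_transitions=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, 50)
        self.mlp = nn.Sequential(
            nn.Linear(context * 50, 200), nn.ReLU(),
            nn.Dropout(0.5), nn.Linear(200, num_transitions),
        )

    def forward(self, token_ids):               # token_ids: (batch, context)
        return self.mlp(self.embed(token_ids).flatten(1))

model = TransitionClassifier()                  # vocab/context sizes are placeholders
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # Adam, lr 10^-4, as reported

# Training-loop skeleton (100 epochs for all datasets, per the row above);
# `loader` is assumed to yield (token_ids, transition_target) batches.
# for epoch in range(100):
#     for token_ids, target in loader:
#         loss = nn.functional.cross_entropy(model(token_ids), target)
#         optimizer.zero_grad(); loss.backward(); optimizer.step()
```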