Compositional Generalization via Neural-Symbolic Stack Machines
Authors: Xinyun Chen, Chen Liang, Adams Wei Yu, Dawn Song, Denny Zhou
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate NeSS on four benchmarks that require compositional generalization: (1) the SCAN benchmark discussed above; (2) the task of few-shot learning of compositional instructions [28]; (3) the compositional machine translation task [26]; and (4) the context-free grammar parsing tasks [8]. |
| Researcher Affiliation | Collaboration | Xinyun Chen UC Berkeley xinyun.chen@berkeley.edu; Chen Liang, Adams Wei Yu Google Brain {crazydonkey,adamsyuwei}@google.com; Dawn Song UC Berkeley dawnsong@cs.berkeley.edu; Denny Zhou Google Brain dennyzhou@google.com |
| Pseudocode | No | The paper describes the instruction semantics of the stack machine in Table 1 and provides an illustrative example in Figure 1, but it does not include a formal pseudocode or algorithm block for the overall NeSS system or its training procedure. A hypothetical sketch of such a machine loop appears after the table. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing open-source code for its methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | We evaluate NeSS on four benchmarks that require compositional generalization: (1) the SCAN benchmark discussed above; (2) the task of few-shot learning of compositional instructions [28]; (3) the compositional machine translation task [26]; and (4) the context-free grammar parsing tasks [8]. |
| Dataset Splits | Yes | Evaluation setup. Similar to prior work [27, 16, 38], we evaluate the following four settings. (1) Length generalization: the output sequences in the training set include at most 22 actions, while the output lengths in the test set are between 24 and 48. (4) Simple split: randomly split samples into training and test sets. A minimal sketch of the length-based split appears after the table. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or specific cloud instance types used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., Python, PyTorch, or other library versions) that were used in the experiments. |
| Experiment Setup | No | The paper states, 'We present the setup and key results below, and defer more experimental details to the supplementary material,' indicating that detailed setup information, such as hyperparameters, is deferred to the supplementary material rather than reported in the main text. |
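
Since the paper conveys the machine through Table 1 semantics and a worked example rather than an algorithm block, the following is a minimal, purely illustrative sketch of what a neural-controlled stack machine loop could look like. The operation names and the `controller.predict` interface are hypothetical stand-ins, not the paper's actual Table 1 instruction set or released code.

```python
# Hypothetical sketch of a neural-symbolic stack machine loop. The paper gives
# no pseudocode; the op names (SHIFT, REDUCE, FINAL) and the controller
# interface below are illustrative assumptions, not NeSS's actual semantics.

def run_machine(controller, tokens):
    queue = list(tokens)   # unprocessed input tokens
    stack = []             # partial output segments built so far
    while True:
        # The neural controller scores the next symbolic operation
        # given the current machine state.
        op, arg = controller.predict(queue, stack)
        if op == "SHIFT":
            # Move the next input token onto the stack as a new segment.
            stack.append([queue.pop(0)])
        elif op == "REDUCE":
            # Rewrite the stack top with a rule `arg`, e.g. a callable
            # mapping source tokens to target actions.
            stack[-1] = arg(stack[-1])
        elif op == "FINAL":
            # Terminate and emit the accumulated output sequence.
            return [tok for seg in stack for tok in seg]
        else:
            raise ValueError(f"unknown operation: {op}")
```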
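
Likewise, because no code or split scripts are released, here is a minimal sketch of the length-generalization split quoted in the Dataset Splits row, assuming the public SCAN file format of `IN: <command> OUT: <actions>` lines; the function names and file handling are illustrative assumptions. Only the 22 and 24-48 length thresholds come directly from the quoted evaluation setup.

```python
# Illustrative sketch only: NeSS's official code is not released (see the
# Open Source Code row), so the "IN: ... OUT: ..." line format is taken from
# the public SCAN dataset, not the authors' pipeline.

def load_scan(path):
    """Parse SCAN-style lines such as 'IN: jump twice OUT: I_JUMP I_JUMP'."""
    examples = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line.startswith("IN:"):
                continue
            command, actions = line[len("IN:"):].split("OUT:")
            examples.append((command.split(), actions.split()))
    return examples

def length_split(examples, max_train_len=22, min_test_len=24, max_test_len=48):
    """Length-generalization split quoted above: train on action sequences
    of at most 22 tokens, test on lengths between 24 and 48."""
    train = [ex for ex in examples if len(ex[1]) <= max_train_len]
    test = [ex for ex in examples
            if min_test_len <= len(ex[1]) <= max_test_len]
    return train, test
```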