Sequence-to-Sequence Learning with Latent Neural Grammars

Authors: Yoon Kim

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply this latent neural grammar to various domains: a diagnostic language navigation task designed to test for compositional generalization (SCAN), style transfer, and small-scale machine translation, and find that it performs respectably compared to standard baselines. Table 1: Accuracy on the SCAN dataset splits compared to previous work. Table 3: Results on the hard style transfer tasks from the StylePTB dataset [78]. Table 4: Results on English-French machine translation.
Researcher Affiliation | Collaboration | Yoon Kim, MIT CSAIL, yoonkim@mit.edu. Much of the work was completed while the author was at MIT-IBM Watson AI.
Pseudocode | No | The paper describes algorithmic steps verbally (e.g., 'inside algorithm') but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/yoonkim/neural-qcfg.
Open Datasets | Yes | 'We first experiment on SCAN [68], a diagnostic dataset...' and 'We next apply our approach on style transfer on English utilizing the StylePTB dataset from Lyu et al. [78].' and 'Our final experiment is on a small-scale English-French machine translation dataset from Lake and Baroni [68].'
Dataset Splits | Yes | As the original dataset does not provide official splits, we randomly split the dataset into 6073 examples for training (1000 of which is the "i am daxy" example), 631 examples for validation, and 583 for test.
Hardware Specification | Yes | Indeed, on realistic machine translation datasets with longer sequences we quickly ran into memory issues when running the model on just a single example, even with a multi-GPU implementation of the inside algorithm distributed over four 32GB GPUs.
Software Dependencies | No | The paper mentions 'Our implementation uses the Torch-Struct library [93].' but does not specify version numbers for Torch-Struct or any other software dependencies.
Experiment Setup | Yes | We set |N| = 10 and |P| = 1, and place two additional restrictions on the rule set. See Appendix A.3.1 for the full experimental setup and hyperparameters.
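
The random split quoted under Dataset Splits can be illustrated with a short sketch. This is not the paper's procedure: the function name `split_indices`, the fixed seed, and the assumption that all 7287 examples (6073 + 631 + 583) are shuffled together are hypothetical. The source does not state a seed or say how the 1000 repeated training examples were handled relative to the split.

```python
import random


def split_indices(n_examples, sizes, seed=0):
    """Shuffle range(n_examples) and cut it into consecutive chunks.

    `sizes` lists the number of examples per split, in order.
    The seed is an illustrative choice, not a value from the paper.
    """
    if sum(sizes) > n_examples:
        raise ValueError("split sizes exceed the number of examples")
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)  # local RNG, reproducible
    splits, start = [], 0
    for size in sizes:
        splits.append(indices[start:start + size])
        start += size
    return splits


# Split sizes as reported: train / validation / test.
train, valid, test = split_indices(7287, [6073, 631, 583])
```

Fixing the seed makes the split reproducible across runs, which matters when, as here, the dataset has no official splits.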