Sequence-to-Sequence Learning with Latent Neural Grammars
Author: Yoon Kim
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply this latent neural grammar to various domains, including a diagnostic language navigation task designed to test for compositional generalization (SCAN), style transfer, and small-scale machine translation, and find that it performs respectably compared to standard baselines. Table 1: Accuracy on the SCAN dataset splits compared to previous work. Table 3: Results on the hard style transfer tasks from the StylePTB dataset [78]. Table 4: Results on English-French machine translation. |
| Researcher Affiliation | Collaboration | Yoon Kim, MIT CSAIL, yoonkim@mit.edu. Much of the work was completed while the author was at MIT-IBM Watson AI. |
| Pseudocode | No | The paper describes algorithmic steps verbally (e.g., 'inside algorithm') but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/yoonkim/neural-qcfg. |
| Open Datasets | Yes | "We first experiment on SCAN [68], a diagnostic dataset..."; "We next apply our approach on style transfer on English utilizing the StylePTB dataset from Lyu et al. [78]."; "Our final experiment is on a small-scale English-French machine translation dataset from Lake and Baroni [68]." |
| Dataset Splits | Yes | As the original dataset does not provide official splits, we randomly split the dataset into 6073 examples for training (1000 of which are the "i am daxy" example), 631 examples for validation, and 583 for test. |
| Hardware Specification | Yes | Indeed, on realistic machine translation datasets with longer sequences we quickly ran into memory issues when running the model on just a single example, even with a multi-GPU implementation of the inside algorithm distributed over four 32GB GPUs. |
| Software Dependencies | No | The paper mentions 'Our implementation uses the Torch-Struct library [93].' but does not specify version numbers for Torch-Struct or any other software dependencies. |
| Experiment Setup | Yes | We set \|N\| = 10 and \|P\| = 1, and place two additional restrictions on the rule set. See Appendix A.3.1 for the full experimental setup and hyperparameters. |
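The "inside algorithm" cited under Pseudocode and Hardware Specification is the standard dynamic program for marginalizing over latent parse trees. As a point of reference (not the paper's quasi-synchronous variant; all names and the toy grammar below are our own), a minimal log-space CKY inside algorithm for a binarized PCFG can be sketched as:

```python
import numpy as np

def logsumexp(x, axis):
    """Numerically stable log-sum-exp that tolerates all-(-inf) slices."""
    m = np.max(x, axis=axis, keepdims=True)
    m = np.where(np.isfinite(m), m, 0.0)  # avoid -inf - -inf = nan
    with np.errstate(divide="ignore"):
        return np.log(np.sum(np.exp(x - m), axis=axis)) + np.squeeze(m, axis=axis)

def inside_logprob(tokens, term_logp, rule_logp, root=0):
    """Inside (CKY) algorithm for a binarized PCFG in log space.

    tokens       : list of int token ids
    term_logp    : (K, V) array, term_logp[a, w] = log P(a -> w)
    rule_logp    : (K, K, K) array, rule_logp[a, b, c] = log P(a -> b c)
    Returns log P(tokens) under the root nonterminal `root`.
    """
    n, K = len(tokens), rule_logp.shape[0]
    # beta[i, j, a] = inside log-probability of span (i, j) headed by a
    beta = np.full((n, n + 1, K), -np.inf)
    for i, w in enumerate(tokens):
        beta[i, i + 1] = term_logp[:, w]
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                s = (rule_logp
                     + beta[i, k][None, :, None]    # left child b over (i, k)
                     + beta[k, j][None, None, :])   # right child c over (k, j)
                beta[i, j] = np.logaddexp(
                    beta[i, j], logsumexp(s.reshape(K, -1), axis=1))
    return beta[0, n, root]
```

The chart costs O(n^2 K) memory and the recursion touches O(n^3 K^3) rule applications, which is consistent with the memory pressure on longer sequences noted in the Hardware Specification row.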