Compositional generalization through meta sequence-to-sequence learning
Authors: Brenden M. Lake
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments |
| Researcher Affiliation | Collaboration | Brenden M. Lake, New York University, Facebook AI Research |
| Pseudocode | No | The paper describes the architecture and processes with equations and textual descriptions but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | PyTorch code is available at https://github.com/brendenlake/meta_seq2seq. |
| Open Datasets | Yes | New benchmarks have been proposed to encourage progress [10, 16, 2], including the SCAN dataset for compositional learning [16]. |
| Dataset Splits | No | The paper mentions 'training and test sets' but does not provide specific details on validation splits (e.g., percentages, sample counts, or explicit validation sets). |
| Hardware Specification | Yes | With my PyTorch implementation, it takes less than 1 hour to train meta seq2seq on SCAN using one NVIDIA Titan X GPU. |
| Software Dependencies | No | The paper mentions a 'PyTorch implementation' and the 'ADAM optimizer [15]' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | The input and output sequence encoders are two-layer biLSTMs with m = 200 hidden units per layer, producing m-dimensional embeddings. The output decoder is a two-layer LSTM also with m = 200. Dropout is applied with probability 0.5 to each LSTM and symbol embedding. ... Networks are meta-trained for 10,000 episodes with the ADAM optimizer [15]. The learning rate is reduced from 0.001 to 0.0001 halfway, and gradients with an l2-norm greater than 50 are clipped. (A sketch of this configuration follows the table.) |
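
For reference, the following is a minimal PyTorch sketch of the reported hyperparameters, not the released implementation: the vocabulary sizes and training-loop skeleton are placeholder assumptions, the modules are not wired end-to-end, and the meta seq2seq attention over the support set is omitted.

```python
# Sketch of the reported setup: two-layer biLSTM encoders and a two-layer LSTM
# decoder with m = 200 hidden units, dropout 0.5, ADAM, learning rate 0.001
# reduced to 0.0001 halfway through 10,000 meta-training episodes, and gradient
# clipping at l2-norm 50. Vocabulary sizes below are placeholder assumptions.
import torch
import torch.nn as nn

M = 200            # hidden units per layer (from the paper)
VOCAB_IN = 20      # placeholder input vocabulary size (assumption)
VOCAB_OUT = 20     # placeholder output vocabulary size (assumption)
DROPOUT = 0.5
EPISODES = 10_000

# Symbol embeddings with dropout applied, as described.
embed_in = nn.Sequential(nn.Embedding(VOCAB_IN, M), nn.Dropout(DROPOUT))
embed_out = nn.Sequential(nn.Embedding(VOCAB_OUT, M), nn.Dropout(DROPOUT))

# Two-layer biLSTM encoders for input and output sequences.
enc_in = nn.LSTM(M, M, num_layers=2, bidirectional=True, dropout=DROPOUT, batch_first=True)
enc_out = nn.LSTM(M, M, num_layers=2, bidirectional=True, dropout=DROPOUT, batch_first=True)

# Two-layer LSTM decoder with m = 200 hidden units and an output readout.
decoder = nn.LSTM(M, M, num_layers=2, dropout=DROPOUT, batch_first=True)
readout = nn.Linear(M, VOCAB_OUT)

params = (list(embed_in.parameters()) + list(embed_out.parameters())
          + list(enc_in.parameters()) + list(enc_out.parameters())
          + list(decoder.parameters()) + list(readout.parameters()))
optimizer = torch.optim.Adam(params, lr=1e-3)

for episode in range(EPISODES):
    # Learning rate dropped from 0.001 to 0.0001 halfway through meta-training.
    if episode == EPISODES // 2:
        for group in optimizer.param_groups:
            group["lr"] = 1e-4
    # ... compute the episode loss on a sampled support/query episode (omitted) ...
    # loss.backward()
    torch.nn.utils.clip_grad_norm_(params, max_norm=50.0)  # clip gradients with l2-norm > 50
    # optimizer.step(); optimizer.zero_grad()
```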