Compositional generalization through meta sequence-to-sequence learning

Authors: Brenden M. Lake

NeurIPS 2019

Reproducibility Variable Result LLM Response
Research Type Experimental 4 Experiments
Researcher Affiliation Collaboration Brenden M. Lake, New York University and Facebook AI Research
Pseudocode No The paper describes the architecture and processes with equations and textual descriptions but does not include structured pseudocode or algorithm blocks.
Open Source Code Yes PyTorch code is available at https://github.com/brendenlake/meta_seq2seq.
Open Datasets Yes New benchmarks have been proposed to encourage progress [10, 16, 2], including the SCAN dataset for compositional learning [16].
Dataset Splits No The paper mentions 'training and test sets' but does not provide specific details on validation splits (e.g., percentages, sample counts, or explicit validation sets).
Hardware Specification Yes With my PyTorch implementation, it takes less than 1 hour to train meta seq2seq on SCAN using one NVIDIA Titan X GPU.
Software Dependencies No The paper mentions 'PyTorch implementation' and 'ADAM optimizer [15]' but does not provide specific version numbers for these or other software dependencies.
Experiment Setup Yes The input and output sequence encoders are two-layer biLSTMs with m = 200 hidden units per layer, producing m-dimensional embeddings. The output decoder is a two-layer LSTM also with m = 200. Dropout is applied with probability 0.5 to each LSTM and symbol embedding. ... Networks are meta-trained for 10,000 episodes with the ADAM optimizer [15]. The learning rate is reduced from 0.001 to 0.0001 halfway, and gradients with an l2-norm greater than 50 are clipped.
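The reported setup can be sketched in PyTorch as follows. This is an illustrative reconstruction, not the authors' released code: the vocabulary size and the dummy batch are hypothetical, and the meta-training episode structure is omitted. Only the component sizes (m = 200, two layers, bidirectional encoders, dropout 0.5), the ADAM optimizer, and the gradient-clipping threshold of 50 come from the paper's description.

```python
import torch
import torch.nn as nn

# Reported hyperparameters: m = 200 hidden units, two-layer biLSTM encoder,
# two-layer LSTM decoder, dropout 0.5. VOCAB is a hypothetical placeholder.
m, VOCAB = 200, 20

embed = nn.Embedding(VOCAB, m)
encoder = nn.LSTM(m, m, num_layers=2, bidirectional=True,
                  dropout=0.5, batch_first=True)
decoder = nn.LSTM(m, m, num_layers=2, dropout=0.5, batch_first=True)
out_proj = nn.Linear(m, VOCAB)

params = (list(embed.parameters()) + list(encoder.parameters())
          + list(decoder.parameters()) + list(out_proj.parameters()))
# ADAM with lr = 0.001, reduced to 0.0001 halfway through meta-training
optimizer = torch.optim.Adam(params, lr=0.001)

# One dummy training step on random token sequences (illustration only).
src = torch.randint(0, VOCAB, (4, 7))    # batch of 4 input sequences, length 7
tgt = torch.randint(0, VOCAB, (4, 5))    # batch of 4 output sequences, length 5
enc_out, _ = encoder(embed(src))         # (4, 7, 2*m): bidirectional outputs
dec_out, _ = decoder(embed(tgt))         # (4, 5, m)
logits = out_proj(dec_out)               # (4, 5, VOCAB)
loss = nn.functional.cross_entropy(logits.reshape(-1, VOCAB), tgt.reshape(-1))
loss.backward()
# Clip gradients whose l2-norm exceeds 50, as reported
torch.nn.utils.clip_grad_norm_(params, max_norm=50.0)
optimizer.step()
```

The halfway learning-rate drop would be applied in the episode loop by setting `g["lr"] = 0.0001` on each of `optimizer.param_groups` at episode 5,000.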