Grammar-Based Grounded Lexicon Learning
Authors: Jiayuan Mao, Freda Shi, Jiajun Wu, Roger Levy, Josh Tenenbaum
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Results show that G2L2 can generalize from small amounts of data to novel compositions of words. and We evaluate G2L2 on two domains: visual reasoning in CLEVR [21] and language-driven navigation in SCAN [25]. Beyond the grounding accuracy, we also evaluate the compositional generalizability and data efficiency, comparing G2L2 with end-to-end neural models and modular neural networks. |
| Researcher Affiliation | Academia | Jiayuan Mao (MIT), Haoyue Shi (TTIC), Jiajun Wu (Stanford University), Roger P. Levy (MIT), Joshua B. Tenenbaum (MIT) |
| Pseudocode | Yes | Algorithm 1 The CKY-E2 algorithm. |
| Open Source Code | Yes | Project page: http://g2l2.csail.mit.edu. and Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | We evaluate G2L2 on two domains: visual reasoning in CLEVR [21] and language-driven navigation in SCAN [25]. |
| Dataset Splits | Yes | Since CLEVR does not provide test set annotations, for all models, we held out 10% of the training data for model development and test them on the CLEVR validation split. (A hedged sketch of this holdout follows the table.) |
| Hardware Specification | No | The main paper does not contain specific hardware details for running experiments. The checklist states, “Details can be found in the supplementary material.” |
| Software Dependencies | No | The main paper does not list specific software dependencies with version numbers. The checklist indicates that more detailed information may be in the supplementary material. |
| Experiment Setup | Yes | We train different models with either 10% or 100% of the training data and evaluate them on the validation set. and Instead of using manually defined heuristics for curriculum learning or self-paced learning as in previous works [28, 26], we employ a curriculum learning setup that is simply based on sentence length: we gradually add longer sentences into the training set. and We tuned the hidden size (i.e., the dimension of intermediate token representations) within {100, 200, 400}, as well as the number of layers (for both the encoder and the decoder) from {2, 4, 8}. (A hedged sketch of the curriculum and grid follows the table.) |
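
The Dataset Splits row quotes a 10% development holdout from the CLEVR training set. The snippet below is a minimal sketch of how such a holdout could be produced; it is not the authors' code, and the file name and JSON layout follow the public CLEVR release rather than anything stated in the paper.

```python
# Hedged sketch (not the authors' code): hold out 10% of the CLEVR training
# questions for model development; final evaluation uses the validation split.
import json
import random

def split_clevr_questions(path="CLEVR_train_questions.json", dev_fraction=0.1, seed=0):
    """Return (train, dev) question lists with `dev_fraction` held out for development."""
    with open(path) as f:
        questions = json.load(f)["questions"]  # the CLEVR release stores questions under this key
    random.Random(seed).shuffle(questions)
    n_dev = int(len(questions) * dev_fraction)
    return questions[n_dev:], questions[:n_dev]
```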
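The Experiment Setup row describes a curriculum that gradually admits longer sentences into the training set, together with a tuning grid over hidden sizes {100, 200, 400} and layer counts {2, 4, 8}. The sketch below illustrates one way to express both; the length schedule (`start_len`, `step`) and the `example["question"]` field are illustrative assumptions, not details from the paper.

```python
# Hedged sketch: a sentence-length curriculum and the quoted hyperparameter grid.
# Schedule parameters and data fields are illustrative assumptions.
from itertools import product

def curriculum_pool(examples, epoch, start_len=3, step=1):
    """Keep examples whose question length fits the current budget; the budget
    grows with the epoch index, gradually admitting longer sentences."""
    max_len = start_len + epoch * step
    return [ex for ex in examples if len(ex["question"].split()) <= max_len]

# Grid quoted above: hidden size in {100, 200, 400}, number of layers in {2, 4, 8}
# (the quote applies the layer count to both the encoder and the decoder).
HYPERPARAMETER_GRID = [
    {"hidden_size": h, "num_layers": n}
    for h, n in product([100, 200, 400], [2, 4, 8])
]
```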