Mapping natural-language problems to formal-language solutions using structured neural representations
Authors: Kezhen Chen, Qiuyuan Huang, Hamid Palangi, Paul Smolensky, Ken Forbus, Jianfeng Gao
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed TP-N2F model is evaluated on two N2F tasks: generating operation sequences to solve math problems and generating Lisp programs. On both benchmarks, TP-N2F considerably outperforms LSTM-based seq2seq models and establishes new state-of-the-art results. Ablation studies attribute the improvements to the explicit use of structured TPRs in both the encoder and decoder, and analysis of the learned structures shows how TPRs enhance the interpretability of TP-N2F (see the TPR sketch after the table). |
| Researcher Affiliation | Collaboration | (1) Microsoft Research, Redmond, USA; (2) Department of Computer Science, Northwestern University, Evanston, USA; (3) Department of Cognitive Science, Johns Hopkins University, Baltimore, USA. |
| Pseudocode | No | The paper describes the model architecture using figures and mathematical equations but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing the source code for the TP-N2F model, nor does it include a link to a code repository. |
| Open Datasets | Yes | We test TP-N2F for this task on the Math QA dataset (Amini et al., 2019). We evaluate our model on the Algo Lisp dataset for this task. |
| Dataset Splits | No | The paper mentions using specific datasets (Math QA, Algo Lisp) and discusses 'full test set' and 'cleaned test set', but it does not provide explicit percentages or sample counts for the training, validation, and test splits within the main text, deferring details to an appendix. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, processor types, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimizer but does not provide specific version numbers for software dependencies or libraries used in the implementation. |
| Experiment Setup | No | The paper states that TP-N2F is trained with back-propagation using the Adam optimizer and teacher forcing, but it does not provide specific hyperparameter values or system-level training settings such as learning rates, batch sizes, or number of epochs in the main text (a generic training-step sketch follows the table). |
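
For context on the "structured TPRs" credited in the Research Type row, the following is a minimal sketch of Tensor Product Representation binding and unbinding in NumPy. The dimensions, the random fillers and roles, and the use of orthonormal role vectors are illustrative assumptions; this is not the TP-N2F encoder/decoder implementation.

```python
import numpy as np

# Tensor Product Representation (TPR) binding/unbinding.
# Dimensions and contents below are illustrative assumptions, not the paper's settings.
rng = np.random.default_rng(0)
d_filler, d_role, n_slots = 8, 4, 3

fillers = rng.standard_normal((n_slots, d_filler))                 # "what" vectors
roles = np.linalg.qr(rng.standard_normal((d_role, n_slots)))[0].T  # orthonormal "where" vectors

# Binding: sum of outer products, T = sum_i f_i (outer) r_i
T = sum(np.outer(fillers[i], roles[i]) for i in range(n_slots))    # shape (d_filler, d_role)

# Unbinding: with orthonormal roles, T @ r_j recovers filler j
recovered = T @ roles[1]
print(np.allclose(recovered, fillers[1]))  # True, up to numerical precision
```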
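Likewise, the training description in the Experiment Setup row (back-propagation with Adam and teacher forcing) can be illustrated with a generic PyTorch training step. The tiny decoder, learning rate, batch size, and sequence length are placeholders chosen for the sketch, since the paper does not report these values.

```python
import torch
import torch.nn as nn

# A hypothetical seq2seq decoder trained with teacher forcing and Adam.
# Module sizes and hyperparameters are placeholders, not TP-N2F's settings.
class TinyDecoder(nn.Module):
    def __init__(self, vocab_size=100, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.out(h)

decoder = TinyDecoder()
optimizer = torch.optim.Adam(decoder.parameters(), lr=1e-3)  # learning rate is a guess
loss_fn = nn.CrossEntropyLoss()

# Teacher forcing: feed the gold sequence (shifted right) as decoder input
# and predict the next gold token at every position.
targets = torch.randint(0, 100, (8, 12))   # (batch, seq) gold token ids
decoder_in, decoder_gold = targets[:, :-1], targets[:, 1:]

logits = decoder(decoder_in)
loss = loss_fn(logits.reshape(-1, 100), decoder_gold.reshape(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```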