Mapping natural-language problems to formal-language solutions using structured neural representations

Authors: Kezhen Chen, Qiuyuan Huang, Hamid Palangi, Paul Smolensky, Ken Forbus, Jianfeng Gao

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The proposed TP-N2F model is evaluated on two N2F tasks: generating operation sequences to solve math problems and generating Lisp programs. In both tasks, TP-N2F considerably outperforms LSTM-based seq2seq models and achieves new state-of-the-art results. Ablation studies show that the improvements can be attributed to the explicit use of structured TPRs in both the encoder and decoder, and analysis of the learned structures shows how TPRs enhance the interpretability of TP-N2F. (A hedged TPR sketch follows this table.)
Researcher Affiliation | Collaboration | (1) Microsoft Research, Redmond, USA; (2) Department of Computer Science, Northwestern University, Evanston, USA; (3) Department of Cognitive Science, Johns Hopkins University, Baltimore, USA.
Pseudocode | No | The paper describes the model architecture using figures and mathematical equations but does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing the source code for the TP-N2F model, nor does it include a link to a code repository.
Open Datasets | Yes | "We test TP-N2F for this task on the MathQA dataset (Amini et al., 2019)." "We evaluate our model on the AlgoLisp dataset for this task."
Dataset Splits | No | The paper mentions using specific datasets (MathQA, AlgoLisp) and discusses a 'full test set' and a 'cleaned test set', but it does not provide explicit percentages or sample counts for the training, validation, and test splits in the main text, deferring details to an appendix.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, processor types, or memory used for running the experiments.
Software Dependencies | No | The paper mentions using the Adam optimizer but does not provide specific version numbers for the software dependencies or libraries used in the implementation.
Experiment Setup | No | The paper states that TP-N2F is trained using back-propagation with the Adam optimizer and teacher forcing, but it does not provide specific hyperparameter values or detailed system-level training settings such as learning rates, batch sizes, or number of epochs in the main text. (A hedged training-step sketch follows this table.)
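
The Research Type row refers to structured tensor product representations (TPRs) in TP-N2F's encoder and decoder. As a minimal illustration of the general TPR idea only, not the paper's actual architecture, the sketch below binds filler vectors to role vectors with outer products and recovers a filler by unbinding with its role; the dimensions and the two example symbols are assumptions made for illustration.

```python
import numpy as np

# Minimal sketch of tensor product representation (TPR) binding/unbinding.
# The dimensions and the two example symbols are illustrative assumptions,
# not values taken from the TP-N2F paper.

d_filler, d_role = 4, 3                       # assumed embedding sizes
rng = np.random.default_rng(0)

fillers = rng.standard_normal((2, d_filler))  # e.g. fillers for "add" and "n1"
# Orthonormal role vectors (rows of a random orthogonal matrix).
roles = np.linalg.qr(rng.standard_normal((d_role, d_role)))[0][:2]

# Binding: outer product of each filler with its role, summed into one tensor.
T = sum(np.outer(f, r) for f, r in zip(fillers, roles))  # shape (d_filler, d_role)

# Unbinding: multiplying by a role vector recovers the filler bound to it
# (exactly here, because the roles are orthonormal).
recovered = T @ roles[0]
print(np.allclose(recovered, fillers[0]))     # True
```

In TP-N2F the analogous bindings are produced by learned networks over relations, arguments, and positions; the toy example above only shows the bind/unbind mechanics that make the representations structured and interpretable.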
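The Experiment Setup row notes training with back-propagation, the Adam optimizer, and teacher forcing, without reported hyperparameters. The sketch below shows one teacher-forced training step for a generic LSTM encoder-decoder stand-in, not the TP-N2F architecture; the vocabulary sizes, hidden size, and learning rate are assumptions.

```python
import torch
import torch.nn as nn

# Hedged sketch of the training regime described above: back-propagation with
# the Adam optimizer and teacher forcing. The modules are a generic LSTM
# seq2seq stand-in, NOT the TP-N2F architecture; all sizes and the learning
# rate are assumptions, since the paper's main text does not report them.

vocab_src, vocab_tgt, hidden = 100, 80, 64          # assumed sizes
encoder = nn.LSTM(input_size=hidden, hidden_size=hidden, batch_first=True)
decoder = nn.LSTM(input_size=hidden, hidden_size=hidden, batch_first=True)
src_emb, tgt_emb = nn.Embedding(vocab_src, hidden), nn.Embedding(vocab_tgt, hidden)
out_proj = nn.Linear(hidden, vocab_tgt)

params = (list(encoder.parameters()) + list(decoder.parameters()) +
          list(src_emb.parameters()) + list(tgt_emb.parameters()) +
          list(out_proj.parameters()))
optimizer = torch.optim.Adam(params, lr=1e-3)       # assumed learning rate
loss_fn = nn.CrossEntropyLoss()

def train_step(src_ids, tgt_ids):
    """One teacher-forced step: the decoder receives the gold tokens
    tgt_ids[:, :-1] as input and is trained to predict tgt_ids[:, 1:]."""
    optimizer.zero_grad()
    _, state = encoder(src_emb(src_ids))                  # encode the NL problem
    logits, _ = decoder(tgt_emb(tgt_ids[:, :-1]), state)  # teacher forcing
    logits = out_proj(logits)
    loss = loss_fn(logits.reshape(-1, vocab_tgt), tgt_ids[:, 1:].reshape(-1))
    loss.backward()                                       # back-propagation
    optimizer.step()
    return loss.item()

# Example call with random token ids (batch of 2, short sequences).
loss = train_step(torch.randint(0, vocab_src, (2, 7)),
                  torch.randint(0, vocab_tgt, (2, 5)))
```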