Coupling Distributed and Symbolic Execution for Natural Language Queries
Authors: Lili Mou, Zhengdong Lu, Hang Li, Zhi Jin
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our approach significantly outperforms both distributed and symbolic executors, exhibiting high accuracy, high learning efficiency, high execution efficiency, and high interpretability. |
| Researcher Affiliation | Collaboration | 1Key Laboratory of High Confidence Software Technologies (Peking University), MoE; Software Institute, Peking University, China; 2DeeplyCurious.ai; 3Noah's Ark Lab, Huawei Technologies. |
| Pseudocode | No | The paper describes the primitive operators and the symbolic executor's process in narrative text and tables but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states: 'The data are available at our project website; the code for data generation can also be downloaded to facilitate further development of the dataset.' This refers to data generation code, not the source code for the proposed methodology. |
| Open Datasets | Yes | The data are available at our project website; the code for data generation can also be downloaded to facilitate further development of the dataset. https://sites.google.com/site/coupleneuralsymbolic/ |
| Dataset Splits | Yes | The dataset comprises 25k different tables and queries for training; the validation and test sets each contain 10k samples and do not overlap with the training data. |
| Hardware Specification | Yes | All neural networks are implemented in Theano with a TITAN Black GPU and a Xeon E7-4820v2 (8-core) CPU; symbolic execution is assessed with a C++ implementation. |
| Software Dependencies | No | The paper mentions 'Theano' and 'C++ implementation' as software used, but it does not specify any version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The dimensions of all layers were in the range of 20–50; the learning algorithm was AdaDelta with default hyperparameters. For the pretraining of the symbolic executor, we applied maximum likelihood estimation for 40 epochs to column selection with labels predicted by the distributed executor. We then used the REINFORCE algorithm to improve the policy, where we generated 10 action samples for each data point with the exploration probability ϵ being 0.1. When feeding back to the distributed model, we chose λ from {0.1, 0.5, 1} by validation to balance denotation error and field attention error. (A toy sketch of the REINFORCE stage appears below the table.) |
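
The experiment-setup row describes a two-stage schedule: maximum-likelihood pretraining of the symbolic executor's column selection, followed by REINFORCE fine-tuning with exploration probability ϵ = 0.1 and 10 action samples per data point. The sketch below is a minimal, self-contained illustration of that REINFORCE stage on a toy softmax policy over table columns; the policy parameterization, reward function, learning rate, and column count are hypothetical stand-ins for illustration only, not the paper's Theano implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

n_cols = 5          # hypothetical number of table columns
n_steps = 2000
theta = np.zeros(n_cols)   # toy softmax-policy parameters
eps = 0.1           # exploration probability, as in the paper
n_samples = 10      # action samples per data point, as in the paper
lr = 0.05           # hypothetical learning rate

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def reward(a):
    # Hypothetical reward: 1 if the sampled column is the "correct"
    # one (column 3 plays that role here), else 0. In the paper the
    # signal is, roughly, whether execution yields the right answer.
    return 1.0 if a == 3 else 0.0

for step in range(n_steps):
    probs = softmax(theta)
    grads, rewards = [], []
    for _ in range(n_samples):
        # Epsilon-exploration: with probability eps sample a column
        # uniformly, otherwise sample from the current policy.
        if rng.random() < eps:
            a = int(rng.integers(n_cols))
        else:
            a = int(rng.choice(n_cols, p=probs))
        # REINFORCE gradient of log pi(a) for a softmax policy:
        # grad_theta log pi(a) = one_hot(a) - probs
        g = -probs
        g[a] += 1.0
        grads.append(g)
        rewards.append(reward(a))
    # Simple per-batch mean baseline for variance reduction
    # (a common choice; the paper does not specify one).
    baseline = float(np.mean(rewards))
    for g, r in zip(grads, rewards):
        theta += lr * (r - baseline) * g

print("learned column distribution:", np.round(softmax(theta), 3))
```

With these toy defaults the learned distribution concentrates on the rewarded column within a few hundred steps. The λ mentioned in the setup row enters elsewhere in the pipeline, weighting field attention error against denotation error when feeding the symbolic executor's signal back to the distributed model.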