Merging Weak and Active Supervision for Semantic Parsing
Authors: Ansong Ni, Pengcheng Yin, Graham Neubig
AAAI 2020, pp. 8536-8543
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the effectiveness of our method on two different datasets. Experiments on WikiSQL show that by annotating only 1.8% of examples, we improve over a state-of-the-art weakly-supervised baseline by 6.4%, achieving an accuracy of 79.0%, which is only 1.3% away from the model trained with full supervision. Experiments on WikiTableQuestions with human annotators show that our method can improve performance with only 100 active queries, especially for weakly-supervised parsers learned from a cold start. |
| Researcher Affiliation | Academia | Ansong Ni, Pengcheng Yin, Graham Neubig; Carnegie Mellon University; {ansongn, pcyin, gneubig}@cs.cmu.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code available at https://github.com/niansong1996/wassp |
| Open Datasets | Yes | Dataset: We evaluate the performance of WASSP on two different datasets: WikiSQL (Zhong, Xiong, and Socher 2017) and WikiTableQuestions (Pasupat and Liang 2015). |
| Dataset Splits | Yes | a single model (i.e. without ensemble) can reach an execution accuracy of 72.4% and 72.6% on the dev and test set, respectively. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used (e.g., GPU/CPU models, memory, or cloud instance types) for running its experiments. |
| Software Dependencies | No | The paper mentions using 'neural symbolic machines (NSM)' and 'memory-augmented policy optimization (MAPO)' but does not provide specific version numbers for these or any other software libraries or frameworks. |
| Experiment Setup | Yes | Training Procedure: First we follow the procedure of (Liang et al. 2018) to train NSM with MAPO on both the WikiSQL and WikiTableQuestions datasets with the same set of hyperparameters as used in the original paper. Then, for WikiSQL, we run WASSP for 3 iterations when the query budget is 1,000 or more, and for only one iteration with a smaller budget. In each iteration, the model queries for extra supervision and is then trained for another 5K steps. The query budget is evenly distributed across these 3 iterations and limited by the total amount. For WikiTableQuestions, we run only one such iteration (due to the limited number of annotations obtained) but train for 50K steps with the human-annotated MRs. |
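
For readers mapping the quoted training procedure onto code, the following is a minimal sketch of the overall query-then-train loop, not the authors' released implementation. The callables `train_weak`, `select_queries`, `annotate`, and `train_mixed` are hypothetical placeholders for (1) MAPO training of NSM, (2) the active-learning selection heuristic, (3) obtaining gold meaning representations, and (4) continued training on mixed weak and full supervision; only the iteration counts, budgets, and step counts follow the setup quoted above.

```python
def wassp_loop(parser, train_set, total_budget, num_iterations, steps_per_iteration,
               train_weak, select_queries, annotate, train_mixed):
    """Sketch of a WASSP-style active learning loop on a weakly-supervised parser.

    All four trailing arguments are caller-supplied callables standing in for the
    paper's components; their names and signatures are assumptions for illustration.
    """
    # Stage 1: weakly-supervised pre-training (NSM trained with MAPO in the paper).
    parser = train_weak(parser, train_set)

    # Stage 2: the query budget is split evenly across iterations
    # (e.g. 3 iterations on WikiSQL when the budget is at least 1,000).
    budget_per_iteration = total_budget // num_iterations
    annotated = []
    for _ in range(num_iterations):
        # Choose which examples to request extra supervision for.
        queries = select_queries(parser, train_set, budget_per_iteration)
        # Gold MRs come from the released SQL on WikiSQL and from human
        # annotators on WikiTableQuestions.
        annotated.extend(annotate(queries))
        # Continue training with both weak and full supervision
        # (5K extra steps per iteration on WikiSQL, 50K on WikiTableQuestions).
        parser = train_mixed(parser, train_set, annotated, steps=steps_per_iteration)
    return parser
```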