Weak Supervision Enhanced Generative Network for Question Generation
Authors: Yutong Wang, Jiyuan Zheng, Qijiong Liu, Zhou Zhao, Jun Xiao, Yueting Zhuang
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct detailed experiments to demonstrate the comparative performance of our approach. Section 4 (Experiments) covers: 4.1 Datasets, 4.2 Implementation Details, 4.3 Baselines, 4.4 Automatic Evaluation, 4.5 Human Evaluation. |
| Researcher Affiliation | Academia | Yutong Wang1, Jiyuan Zheng1, Qijiong Liu1, Zhou Zhao1, Jun Xiao1 and Yueting Zhuang1; 1College of Computer Science and Technology, Zhejiang University, China; {ytwang, jiyuanz, lqj, zhaozhou, yzhuang}@zju.edu.cn, junx@cs.zju.edu.cn |
| Pseudocode | No | The paper describes the model architecture and processes in text and with diagrams, but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | Yes | The MS MARCO dataset is a large-scale dataset collected from Bing. This dataset contains 1,010,916 questions and 8,841,823 related passages extracted from 3,563,535 web documents. ... The SQuAD dataset [Rajpurkar et al., 2016] is one of the most influential reading comprehension datasets; it contains over 100k questions on 536 Wikipedia articles created by crowd-workers, and the answers are continuous spans in the passages. |
| Dataset Splits | Yes | The whole dataset (train set and dev set) is randomly divided into a training set (80%), a development set (10%) and a test set (10%) at the article level (a split sketch follows the table). |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments. |
| Software Dependencies | No | The paper mentions using GloVe pretrained embeddings and refers to standard components like Adam optimizer and Multi-head Attention, but does not provide specific version numbers for software dependencies such as deep learning frameworks or libraries. |
| Experiment Setup | Yes | For model hyperparameters, we have 4 convolutional filters with kernel size 7 for all convolutional blocks, and all the self-attention blocks are Multi-head Attention [Vaswani et al., 2017] with 8 attention heads. We adopt the Adam optimizer with a learning rate of 0.001, β1 = 0.9 and β2 = 0.999, and the batch size is set to 16. We train the model for a maximum of 20 epochs and use early stopping with the patience set to 5 epochs according to the BLEU-4 score on the validation set. At validation and test time, we use beam search with beam size set to 5 (architecture and training configuration sketches follow the table). |
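
The article-level 80/10/10 split reported in the Dataset Splits row can be reproduced with a short script. The sketch below is an assumption about the mechanics (random shuffling of article-level records with a fixed seed); the paper states only that the split is random and performed at the article level, and does not specify a seed or tooling.

```python
import random

def split_by_article(articles, seed=0):
    """Randomly split a list of article-level records into train/dev/test (80%/10%/10%).

    The seed and the representation of `articles` are assumptions; the paper
    only states that the split is random and done at the article level.
    """
    rng = random.Random(seed)
    shuffled = list(articles)
    rng.shuffle(shuffled)
    n_train = int(0.8 * len(shuffled))
    n_dev = int(0.1 * len(shuffled))
    return (shuffled[:n_train],                    # training articles
            shuffled[n_train:n_train + n_dev],     # development articles
            shuffled[n_train + n_dev:])            # test articles
```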
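The architectural settings in the Experiment Setup row (convolutional blocks with kernel size 7, self-attention blocks with 8 heads) can be illustrated with standard PyTorch modules. This is a minimal sketch, not the authors' encoder: the embedding dimension, padding, and block composition are assumptions, and only the kernel size and number of heads come from the paper.

```python
import torch
import torch.nn as nn

EMBED_DIM = 256     # assumption: the paper does not state the block width
NUM_HEADS = 8       # from the paper: Multi-head Attention with 8 heads
KERNEL_SIZE = 7     # from the paper: convolutional blocks with kernel size 7

# One convolutional block and one self-attention block in the spirit of the
# described encoder; the exact stacking and residual structure are omitted.
conv_block = nn.Conv1d(EMBED_DIM, EMBED_DIM,
                       kernel_size=KERNEL_SIZE, padding=KERNEL_SIZE // 2)
self_attention = nn.MultiheadAttention(embed_dim=EMBED_DIM, num_heads=NUM_HEADS)

x = torch.randn(20, 16, EMBED_DIM)          # (seq_len, batch=16, embed_dim)
attn_out, _ = self_attention(x, x, x)       # self-attention over the sequence
conv_out = conv_block(x.permute(1, 2, 0))   # Conv1d expects (batch, channels, seq_len)
```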
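The optimization settings translate directly into a training configuration. The sketch below is a hedged illustration: the placeholder model and the BLEU-4 stub are not the authors' architecture or metric code; only the hyperparameter values (Adam with lr 0.001, β1 = 0.9, β2 = 0.999, batch size 16, at most 20 epochs, early-stopping patience 5 on validation BLEU-4, beam size 5 at decode time) come from the paper.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 10)  # placeholder model, not the paper's generator
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999))

BATCH_SIZE = 16   # mini-batch size reported in the paper
MAX_EPOCHS = 20   # maximum number of training epochs
PATIENCE = 5      # early stopping on validation BLEU-4
BEAM_SIZE = 5     # beam width used at validation and test time

best_bleu4 = float("-inf")
stale_epochs = 0
for epoch in range(MAX_EPOCHS):
    # ... training over mini-batches of size BATCH_SIZE would go here ...
    bleu4 = 0.0  # placeholder: would be BLEU-4 on the dev set, decoded with BEAM_SIZE
    if bleu4 > best_bleu4:
        best_bleu4, stale_epochs = bleu4, 0
    else:
        stale_epochs += 1
        if stale_epochs >= PATIENCE:
            break  # early stopping once BLEU-4 has not improved for PATIENCE epochs
```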