Conditional Generative Adversarial Networks for Commonsense Machine Comprehension
Authors: Bingning Wang, Kang Liu, Jun Zhao
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show the advantage of CGANs in discriminating sentences and achieve state-of-the-art results on the commonsense story reading comprehension task compared with previous feature-engineering and deep-learning methods. |
| Researcher Affiliation | Academia | National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China; University of Chinese Academy of Sciences, Beijing, 100049, China; {bingning.wang, kliu, jzhao}@nlpr.ia.ac.cn |
| Pseudocode | Yes | Algorithm 1 Conditional Generative Adversarial Networks |
| Open Source Code | No | The paper does not provide any links or explicit statements about the availability of source code for the described methodology; it only links to third-party datasets/resources. |
| Open Datasets | Yes | The recently proposed Story Cloze Test [Mostafazadeh et al., 2016] is a commonsense machine comprehension application... We pre-train our CGANs on the New York Times (NYT) news article corpus (https://catalog.ldc.upenn.edu/LDC2008T19). |
| Dataset Splits | No | The paper mentions "Validation Set" in Table 1 and discusses training/testing periods, but it does not explicitly provide the specific percentages or counts for training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. It only mentions general training parameters. |
| Software Dependencies | No | The paper mentions using "word2vec" but does not specify version numbers for any software libraries, frameworks, or dependencies used in their implementation. |
| Experiment Setup | Yes | All weight and attention matrices are initialized by fixing their largest singular values to 1.0. We use Adadelta with ρ = 0.999 to update parameters, and an L1 criterion with weight 1e-5 to regularize them. All training is performed with a batch size of 32. For the discriminator: the vocabulary size is set to 25000... The sentence GRU hidden state size is set to 128 and the document hidden state size to 150. For the generator: the decoder size is set to 256... ε is set to 1.0E-20. The THRESHOLD was set to 0.2. For kd and kg, their max value is truncated to 20... (A hedged configuration sketch based on these values follows the table.) |
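
For readers attempting a reproduction, below is a minimal sketch of how Algorithm 1's alternating adversarial updates could be wired together with the hyperparameters quoted in the "Experiment Setup" row, assuming a PyTorch implementation. The module interfaces (`generator`, `discriminator`, `loader`), the BCE loss wiring, and all helper names are illustrative assumptions, not the authors' released code; only the numeric values come from the paper, and the exact points where ε and THRESHOLD enter the computation are not specified in the quoted setup.

```python
# Hedged reproduction sketch: hyperparameter values are from the paper's
# "Experiment Setup" row; everything else (module interfaces, loss wiring,
# helper names) is an assumption made for illustration.
import torch

BATCH_SIZE = 32        # "batch size of 32"
VOCAB_SIZE = 25_000    # discriminator vocabulary size
SENT_HIDDEN = 128      # sentence GRU hidden state size
DOC_HIDDEN = 150       # document GRU hidden state size
DEC_HIDDEN = 256       # generator decoder size
L1_WEIGHT = 1e-5       # L1 regularization weight
RHO = 0.999            # Adadelta decay rate
EPS = 1e-20            # reported ε; its exact insertion point is unspecified
THRESHOLD = 0.2        # reported THRESHOLD; its exact role is unspecified
K_MAX = 20             # k_d and k_g are truncated to a max value of 20


def l1_penalty(module: torch.nn.Module) -> torch.Tensor:
    """L1 penalty over all parameters, per the quoted setup."""
    return L1_WEIGHT * sum(p.abs().sum() for p in module.parameters())


def train(generator, discriminator, loader, k_d=1, k_g=1, epochs=10):
    """Alternate k_d discriminator steps and k_g generator steps per batch
    (both capped at K_MAX), in the usual GAN fashion of Algorithm 1."""
    k_d, k_g = min(k_d, K_MAX), min(k_g, K_MAX)
    opt_d = torch.optim.Adadelta(discriminator.parameters(), rho=RHO)
    opt_g = torch.optim.Adadelta(generator.parameters(), rho=RHO)
    bce = torch.nn.BCELoss()
    for _ in range(epochs):
        for real in loader:  # batches of BATCH_SIZE documents
            ones = torch.ones(real.size(0), 1)
            zeros = torch.zeros(real.size(0), 1)
            # Discriminator updates: score real endings high, generated low.
            for _ in range(k_d):
                fake = generator(real).detach()
                d_loss = (bce(discriminator(real), ones)
                          + bce(discriminator(fake), zeros)
                          + l1_penalty(discriminator))
                opt_d.zero_grad()
                d_loss.backward()
                opt_d.step()
            # Generator updates: push generated endings to fool the scorer.
            for _ in range(k_g):
                fake = generator(real)
                g_loss = (bce(discriminator(fake), ones)
                          + l1_penalty(generator))
                opt_g.zero_grad()
                g_loss.backward()
                opt_g.step()
```

A reproduction would call `train(Generator(), Discriminator(), loader)` with model classes matching the paper's GRU encoder/decoder description; those classes are not reconstructed here because the quoted setup gives only their hidden sizes, not their full architecture.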