Conditional Generative Adversarial Networks for Commonsense Machine Comprehension

Authors: Bingning Wang, Kang Liu, Jun Zhao

IJCAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments show the advantage of the CGANs in discriminating sentence and achieve state-of-the-art results in commonsense story reading comprehension task compared with previous feature engineering and deep learning methods."
Researcher Affiliation | Academia | "(1) National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China; (2) University of Chinese Academy of Sciences, Beijing, 100049, China; {bingning.wang, kliu, jzhao}@nlpr.ia.ac.cn"
Pseudocode | Yes | "Algorithm 1: Conditional Generative Adversarial Networks"
Open Source Code | No | The paper provides no links or explicit statements about the availability of source code for the described method; it links only to third-party datasets and resources.
Open Datasets | Yes | "Recently proposed Story Cloze Test [Mostafazadeh et al., 2016] is a commonsense machine comprehension application... We pre-train our CGANs in the New York Times (NYT) news article corpus" (footnote: https://catalog.ldc.upenn.edu/LDC2008T19)
Dataset Splits | No | The paper mentions a "Validation Set" in Table 1 and discusses training and testing, but it does not give explicit percentages or counts for the training, validation, or test splits.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. It only mentions general training parameters.
Software Dependencies | No | The paper mentions using "word2vec" but does not specify version numbers for any software libraries, frameworks, or dependencies used in the implementation.
Experiment Setup | Yes | "All weight and attention matrices are initiated by fixing their largest singular values to 1.0. We use Adadelta with ρ = 0.999 to update parameter. We use L1 criteria with weight 1e-5 to regulate the parameter. All training process is implemented with batch size equals to 32. For the discriminator: we set the vocabulary size to 25000... The sentence GRU hidden state size is set to 128 and the document hidden state size is set to 150. For the generator: the decoder size is set to 256... ε is set to 1.0E-20. The THRESHOLD was set to 0.2. For kd and kg, we truncate their max value to 20..."
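
The experiment-setup row pins down enough hyperparameters to sketch the reported configuration. Below is a minimal PyTorch sketch assuming those values; every name here (Discriminator, spectral_init, l1_penalty, emb_dim) is a hypothetical illustration, not the authors' unreleased code, and the hierarchical sentence-to-document GRU reader is one plausible reading of the discriminator the paper describes.

    import torch
    import torch.nn as nn

    class Discriminator(nn.Module):
        """Hierarchical reader: a sentence-level GRU feeding a document-level GRU."""
        def __init__(self, vocab_size=25_000,   # "vocabulary size to 25000"
                     emb_dim=128,               # assumed; not stated in the excerpt
                     sent_hidden=128,           # "sentence GRU hidden state size ... 128"
                     doc_hidden=150):           # "document hidden state size ... 150"
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.sent_gru = nn.GRU(emb_dim, sent_hidden, batch_first=True)
            self.doc_gru = nn.GRU(sent_hidden, doc_hidden, batch_first=True)
            self.score = nn.Linear(doc_hidden, 1)

        def forward(self, sentences):
            # sentences: (batch, n_sents, sent_len) integer token ids
            b, n, t = sentences.shape
            emb = self.embed(sentences.view(b * n, t))   # (b*n, t, emb_dim)
            _, s_h = self.sent_gru(emb)                  # (1, b*n, sent_hidden)
            sent_vecs = s_h.squeeze(0).view(b, n, -1)    # (b, n, sent_hidden)
            _, d_h = self.doc_gru(sent_vecs)             # (1, b, doc_hidden)
            return self.score(d_h.squeeze(0))            # (b, 1) real/fake score

    def spectral_init(module):
        # "weight and attention matrices are initiated by fixing their largest
        # singular values to 1.0": rescale each 2-D weight accordingly.
        with torch.no_grad():
            for p in module.parameters():
                if p.dim() == 2:
                    p.div_(torch.linalg.svdvals(p)[0])

    def l1_penalty(module, weight=1e-5):
        # "L1 criteria with weight 1e-5 to regulate the parameter"
        return weight * sum(p.abs().sum() for p in module.parameters())

    disc = Discriminator()
    spectral_init(disc)
    # Adadelta with rho = 0.999 (batch size 32), as reported.
    optimizer = torch.optim.Adadelta(disc.parameters(), rho=0.999)

In an adversarial step under this sketch, the discriminator would score real story endings against generator samples, with l1_penalty(disc) added to its loss before optimizer.step(); the generator decoder (hidden size 256) and the ε, THRESHOLD, kd, and kg values from the row above belong to the generator side, which the excerpt only partially specifies.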