Guided Generation of Cause and Effect
Authors: Zhongyang Li, Xiao Ding, Ting Liu, J. Edward Hu, Benjamin Van Durme
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate through carefully designed human evaluation by comparing outputs from various baselines and our proposed model, finding that our model's outputs are preferred. We further demonstrate the usefulness of our new resource by taking a recent state-of-the-art causal reasoning system and boosting its results on the COPA test set by 3 points |
| Researcher Affiliation | Academia | 1Harbin Institute of Technology, China 2Johns Hopkins University, USA |
| Pseudocode | Yes | Algorithm 1 Decoding with Disjunctive Positive Constraints. We consider the generation of one sentence with a beam size of 1 for simplicity. Note that while a beam size of 1 reduces the constrained beam search to a greedy search, the handling of DPC is not affected. (A hedged sketch of DPC-constrained greedy decoding is given below the table.) |
| Open Source Code | Yes | Our models and resources are made publicly available.1 1http://nlp.jhu.edu/causalbank |
| Open Datasets | Yes | Thus we harvest a large causal dataset from the preprocessed large-scale English Common Crawl corpus (5.14 TB) [Buck et al., 2014]. ... COPA [Roemmele et al., 2011] dataset |
| Dataset Splits | Yes | The training is stopped when the validation loss stagnates for 20,000 batches. ... The bottom of Table 2 shows the large Transformer model constantly achieves the best performance on development set, which contains 5,000 CE pairs. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | We use Sockeye [Hieber et al., 2017] to train Transformer-based [Vaswani et al., 2017] conditional generation models... - The paper names its software (e.g., Sockeye) but does not provide version numbers for it or for the programming languages and libraries used. |
| Experiment Setup | Yes | The small model's encoder and decoder both have 6 layers, with a hidden size and embedding size of 512. The big model's encoder and decoder have 12 layers and 4 layers, with a hidden size and embedding size of 768, leading to 134M parameters in total. The vocabulary size is 15,000. The training is stopped when the validation loss stagnates for 20,000 batches. ... m > 0 is the margin loss function parameter, which is set to 0.3. ... λ is the parameter for L2 regularization, which is set to 0.00001. (A hedged sketch of this margin objective is given below the table.) |
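
The pseudocode row above describes Algorithm 1 only at a high level. As a rough illustration, the sketch below performs greedy (beam size 1) decoding while tracking disjunctive positive constraint (DPC) groups: each group is a set of alternative token sequences, emitting any one alternative satisfies the whole group, and the group's remaining alternatives are thereafter ignored. The function names, the `EOS` id, and the simple rule of forcing a needed constraint token instead of ending the sentence are illustrative assumptions, not the paper's actual implementation.

```python
from typing import Callable, List, Sequence

EOS = 0  # assumed end-of-sequence token id (placeholder)


def _next_needed_token(output: List[int], alt: List[int]) -> int:
    """Next token of alternative `alt`, given how much of it already ends `output`."""
    for k in range(len(alt) - 1, 0, -1):
        if output[-k:] == alt[:k]:
            return alt[k]
    return alt[0]


def dpc_greedy_decode(
    next_token_logprobs: Callable[[List[int]], Sequence[float]],
    constraint_groups: List[List[List[int]]],
    max_len: int = 30,
) -> List[int]:
    """Greedy decoding with disjunctive positive constraints (DPC).

    `constraint_groups` is a list of groups; each group is a list of
    alternative token-id sequences, and emitting any ONE alternative
    satisfies the whole group.
    """
    output: List[int] = []
    satisfied = [False] * len(constraint_groups)

    for _ in range(max_len):
        scores = next_token_logprobs(output)
        token = max(range(len(scores)), key=lambda t: scores[t])

        # If the model wants to stop while some group is still unmet,
        # force the highest-scoring "next needed" token of any unmet
        # alternative instead of EOS.
        if token == EOS and not all(satisfied):
            needed = [
                _next_needed_token(output, alt)
                for gi, group in enumerate(constraint_groups)
                if not satisfied[gi]
                for alt in group
            ]
            token = max(needed, key=lambda t: scores[t])

        output.append(token)

        # A disjunctive group is satisfied as soon as ANY of its
        # alternatives appears in full; its other alternatives are dropped.
        for gi, group in enumerate(constraint_groups):
            if not satisfied[gi]:
                satisfied[gi] = any(output[-len(alt):] == alt for alt in group)

        if token == EOS and all(satisfied):
            break

    return output
```

The paper's full algorithm operates within constrained beam search; this sketch only illustrates how disjunctive groups are tracked and how satisfying one alternative discharges the whole group.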
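
The experiment-setup row quotes a margin parameter m = 0.3 and an L2 weight λ = 0.00001 but not the objective itself. Below is a minimal sketch assuming a standard hinge-style margin ranking loss between the correct and incorrect alternatives plus an explicit L2 penalty; the `margin_loss` function, its scorer inputs, and the exact functional form are assumptions for illustration, not taken from the paper.

```python
import torch

MARGIN = 0.3      # m, the margin parameter quoted in the setup
L2_LAMBDA = 1e-5  # lambda, the L2 regularization weight quoted in the setup


def margin_loss(score_correct: torch.Tensor,
                score_wrong: torch.Tensor,
                params) -> torch.Tensor:
    """Hinge margin between the two alternatives' scores, plus an L2 penalty."""
    hinge = torch.clamp(MARGIN - score_correct + score_wrong, min=0.0).mean()
    l2 = sum((p ** 2).sum() for p in params)
    return hinge + L2_LAMBDA * l2


# Toy usage: scores for a batch of two COPA-style items and a dummy scorer.
scorer = torch.nn.Linear(8, 1)
loss = margin_loss(torch.tensor([1.2, 0.4]),
                   torch.tensor([0.9, 0.7]),
                   scorer.parameters())
```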