Boosting Causal Embeddings via Potential Verb-Mediated Causal Patterns

Authors: Zhipeng Xie, Feiteng Mu

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results have shown that the boosted causal embeddings outperform several state-of-the-arts significantly on both English and Chinese."
Researcher Affiliation | Academia | Zhipeng Xie and Feiteng Mu; Shanghai Key Laboratory of Data Science, Fudan University; School of Computer Science, Fudan University; {xiezp, 17210240011}@fudan.edu.cn
Pseudocode | No | The paper describes its methods in prose and mathematical equations but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper links to datasets and pre-trained embeddings, but it gives no explicit statement of, or link to, open-source code implementing the described methodology.
Open Datasets | Yes | "To make an evaluation on English, we make use of a high-precision corpus E_h of 815,233 cause-effect phrase pairs which was extracted with a set of 13 hand-crafted rules from Gigaword and Simple English Wikipedia. Both the rules and the corpus are taken from [Sharp et al., 2016]." (Footnote 3: http://clulab.cs.arizona.edu/data/emnlp2016-causal/) and "To make evaluation on Chinese, we use the Sogou set of 517,746 high-precision causal phrase pairs which were extracted by Xie and Mu [2019] from the Sogou CS news corpus [Wang et al., 2008]." (Footnote 7: http://www.ke.fudan.edu.cn/data/causal/sg_hp_extractions.txt)
Dataset Splits | No | The paper describes the datasets used for training the initial embeddings and for testing, but it does not specify explicit training/validation/test splits for its own boosted causal embedding experiments.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) for its experiments.
Software Dependencies | No | The paper mentions software such as the SpaCy dependency parser and the LTP toolkit but does not give version numbers for them or list any other versioned dependencies.
Experiment Setup | Yes | "A word pair (w1, w2) is a candidate causal word pair if it serves as the dominant word pair of at least λ1 phrase pairs in E_h and its causal interaction score is not less than λ2 (λ1 and λ2 are set to 30 and 0.55 by default)." and "where wp_i is the i-th word pair in M_pwp, N = |M_pwp| is the total number of word pairs in M_pwp, and λ is the regularization coefficient that is set to 10^-5 by default." and "vEmbed: The vanilla word embeddings trained on raw text corpus by the skip-gram algorithm [Mikolov et al., 2013] with a sliding window of 5"
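
The candidate-pair rule quoted in the Experiment Setup entry can be illustrated with a minimal Python sketch. It is not the authors' implementation (no code was released): the function name, the two input dictionaries, and the toy values are hypothetical, and the causal interaction score is taken as a precomputed input because its definition is not reproduced here. Only the two thresholds (30 and 0.55) come from the quoted text.

```python
from typing import Dict, List, Tuple

WordPair = Tuple[str, str]


def select_candidate_pairs(
    dominant_counts: Dict[WordPair, int],
    interaction_scores: Dict[WordPair, float],
    lambda1: int = 30,      # minimum number of phrase pairs dominated (default quoted above)
    lambda2: float = 0.55,  # minimum causal interaction score (default quoted above)
) -> List[WordPair]:
    """Return word pairs passing both thresholds, sorted for determinism.

    Assumption: a pair with no recorded score is treated as scoring 0.0
    and is therefore rejected.
    """
    return sorted(
        pair
        for pair, count in dominant_counts.items()
        if count >= lambda1 and interaction_scores.get(pair, 0.0) >= lambda2
    )


# Toy inputs (invented for illustration only, not taken from the paper).
dominant_counts = {
    ("smoking", "cancer"): 42,
    ("rain", "flood"): 30,
    ("sun", "tax"): 12,
    ("virus", "fever"): 55,
}
interaction_scores = {
    ("smoking", "cancer"): 0.81,
    ("rain", "flood"): 0.55,
    ("sun", "tax"): 0.90,
    ("virus", "fever"): 0.40,
}

candidates = select_candidate_pairs(dominant_counts, interaction_scores)
print(candidates)
```

Both comparisons are inclusive, matching "at least λ1" and "not less than λ2", so a pair sitting exactly on both thresholds is kept while a pair that fails either one is dropped.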