Distributed Representation of Words in Cause and Effect Spaces
Authors: Zhipeng Xie, Feiteng Mu (pp. 7330-7337)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results have shown that Max-Matching and Attentive-Matching models significantly outperform several state-of-the-art competitors by a large margin on both English and Chinese corpora. |
| Researcher Affiliation | Academia | Zhipeng Xie, Feiteng Mu, School of Computer Science, Shanghai Key Laboratory of Data Science, Fudan University, Shanghai, China |
| Pseudocode | No | The paper describes the models (Pairwise-Matching, Max-Matching, Attentive-Matching) using mathematical equations and descriptive text, but no explicit pseudocode or algorithm blocks are provided. (A hedged illustrative sketch of the matching idea appears after this table.) |
| Open Source Code | No | The paper does not contain any statement about releasing the source code for the proposed models or a link to a code repository. |
| Open Datasets | Yes | To make an evaluation on English, we build our models on a corpus of 815,233 cause-effect phrase pairs which was extracted with a set of 13 rules from Gigaword and Simple English Wikipedia. Both the rules and the corpus are taken from (Sharp et al. 2016) (http://clulab.cs.arizona.edu/data/emnlp2016-causal/). ... We apply the above causal patterns on two raw Chinese corpora, the Baike corpus and the Sogou CS corpus, where Baike is 10GB of data crawled from a Chinese encyclopedia website and Sogou CS (Wang et al. 2008) is the news data on the web. |
| Dataset Splits | No | The paper mentions "five-fold cross validation" for the Causal QA Task, but does not provide specific train/validation/test dataset splits (e.g., percentages or counts) for the main causal embedding model training. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions "PyLTP" as a Chinese dependency parser, but does not specify its version. It also refers to "word2vec" and an "SVM ranker" without version numbers. |
| Experiment Setup | Yes | We use a simple gradient descent algorithm to train our models, with a learning rate of 0.005. Other related hyperparameters are listed as follows. The number of training epochs is set to 30, and the batch size is 256. Words whose frequencies are less than 8 are pruned. The cause embeddings and the effect embeddings have the same dimensionality of 200. The negative sampling rate is 10, which means that we sample 10 negative phrase pairs for each positive phrase pair. ... where α and γ are two hyperparameters, which are set to 0.8 and 2.0 by default. (These values are collected into a configuration sketch after the table.) |
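
The Experiment Setup row quotes the training hyperparameters explicitly, so they can be collected into a single configuration block. The values below are the ones reported above; the dictionary layout and key names are just one way to organize them and are not taken from the paper.

```python
# Hyperparameters quoted in the Experiment Setup row; naming/structure is illustrative.
TRAINING_CONFIG = {
    "optimizer": "sgd",            # "simple gradient descent algorithm"
    "learning_rate": 0.005,
    "epochs": 30,
    "batch_size": 256,
    "min_word_frequency": 8,       # words with frequency < 8 are pruned
    "embedding_dim": 200,          # same dimensionality for cause and effect spaces
    "negative_sampling_rate": 10,  # 10 negative phrase pairs per positive pair
    "alpha": 0.8,                  # the paper's α hyperparameter (role not restated here)
    "gamma": 2.0,                  # the paper's γ hyperparameter (role not restated here)
}
```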
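The Pseudocode row notes that the matching models are defined in the paper only through equations. As a rough illustration of the general idea (word-level scores between cause-space and effect-space embeddings, aggregated into a phrase-level score), here is a minimal NumPy sketch. The sigmoid dot-product word-pair score, the per-cause-word max aggregation, and the softmax attention are assumptions chosen for illustration; they are not taken verbatim from the paper, and the paper's training loss with its α and γ hyperparameters is not reproduced here.

```python
import numpy as np

def word_pair_scores(cause_vecs, effect_vecs):
    """Word-level causality scores between a cause phrase and an effect phrase.
    cause_vecs: (m, d) cause-space embeddings; effect_vecs: (n, d) effect-space embeddings.
    Assumption: score = sigmoid of the dot product (not stated verbatim in the paper)."""
    logits = cause_vecs @ effect_vecs.T            # (m, n) word-pair logits
    return 1.0 / (1.0 + np.exp(-logits))           # sigmoid

def max_matching_score(cause_vecs, effect_vecs):
    """Max-Matching (assumed aggregation): for each cause word take its best-matching
    effect word, then average over cause words."""
    s = word_pair_scores(cause_vecs, effect_vecs)
    return s.max(axis=1).mean()

def attentive_matching_score(cause_vecs, effect_vecs):
    """Attentive-Matching (assumed form): weight word pairs by a softmax over their
    logits instead of taking a hard max."""
    logits = cause_vecs @ effect_vecs.T
    weights = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    s = 1.0 / (1.0 + np.exp(-logits))
    return (weights * s).sum(axis=1).mean()

# Toy usage with the paper's reported embedding dimensionality of 200
rng = np.random.default_rng(0)
cause_phrase = rng.normal(size=(3, 200))   # e.g. a 3-word cause phrase
effect_phrase = rng.normal(size=(4, 200))  # e.g. a 4-word effect phrase
print(max_matching_score(cause_phrase, effect_phrase))
print(attentive_matching_score(cause_phrase, effect_phrase))
```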