Causally Denoise Word Embeddings Using Half-Sibling Regression
Authors: Zekun Yang, Tianlin Liu | pp. 9426-9433
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluated on a battery of standard lexical-level evaluation tasks and downstream sentiment analysis tasks, our method reaches state-of-the-art performance. |
| Researcher Affiliation | Academia | Department of Information Systems, College of Business, City University of Hong Kong, Hong Kong SAR, China (zekunyang3-c@my.cityu.edu.hk); Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, 4058 Basel, Switzerland (tianlin.liu@fmi.ch) |
| Pseudocode | Yes | Algorithm 1: HSR algorithm for word vector postprocessing |
| Open Source Code | Yes | Our codes are available at https://github.com/KunkunYang/denoiseHSR-AAAI |
| Open Datasets | Yes | We test it on three different pre-trained English word embeddings including Word2Vec (Mikolov et al. 2013), GloVe (Pennington, Socher, and Manning 2014), and Paragram (Wieting et al. 2015). The datasets we adopt include Amazon reviews (AR), customer reviews (CR) (Hu and Liu 2004), IMDB movie reviews (IMDB) (Maas et al. 2011), and SST binary sentiment classification (SST-B) (Socher et al. 2013) |
| Dataset Splits | Yes | We report the five-fold cross-validation accuracy of the sentiment classification results in Table 3. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models or memory amounts used for running experiments. |
| Software Dependencies | No | The paper mentions using the Natural Language Toolkit (NLTK) package and a logistic regression model, but does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | For HSR, we fix the regularization constants α1 = α2 = 50 for HSR-RR. |
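The HSR postprocessing summarized above (Algorithm 1, with the ridge-regression variant HSR-RR and regularization constant 50) can be sketched as follows. This is a minimal illustration, not the authors' released implementation: it assumes content-word and function-word vectors are stored column-wise, that the noise shared between them is estimated by ridge-regressing the former on the latter, and that the function names and matrix layout are our own.

```python
import numpy as np

def hsr_rr_denoise(content_vecs, function_vecs, alpha=50.0):
    """Sketch of half-sibling regression with ridge regression (HSR-RR).

    content_vecs:  (d, n_c) array, one content-word vector per column
    function_vecs: (d, n_f) array, one function-word vector per column
    alpha:         ridge regularization constant (the paper fixes 50)

    Predict content-word vectors from function-word vectors; the
    predictable component is attributed to shared corpus noise and
    subtracted out, leaving the denoised embeddings.
    """
    Y = function_vecs
    X = content_vecs
    # Ridge solution: W = (Y^T Y + alpha * I)^{-1} Y^T X
    gram = Y.T @ Y + alpha * np.eye(Y.shape[1])
    W = np.linalg.solve(gram, Y.T @ X)
    noise_estimate = Y @ W       # linear estimate of E[X | Y]
    return X - noise_estimate    # denoised content-word vectors
```

Because ridge regression shrinks each fitted component by a factor in (0, 1), the residual matrix never has larger Frobenius norm than the input, so the postprocessed vectors stay on a comparable scale to the originals.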