Unsupervised Learning for Lexicon-Based Classification
Authors: Jacob Eisenstein
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | An empirical evaluation is performed on four datasets in two languages. All datasets involve binary classification problems, and performance is quantified by the area-under-the-curve (AUC), a measure of classification performance that is robust to unbalanced class distributions. |
| Researcher Affiliation | Academia | Jacob Eisenstein Georgia Institute of Technology |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper. |
| Open Source Code | Yes | Source code: https://github.com/jacobeisenstein/probabilistic-lexicon-classification |
| Open Datasets | Yes | Amazon: English-language product reviews across four domains; of these reviews, 8000 are labeled and another 19677 are unlabeled (Blitzer, Dredze, and Pereira 2007). Cornell: 2000 English-language film reviews (version 2.0), labeled as positive or negative (Pang and Lee 2004). Corpus Cine: 3800 Spanish-language movie reviews, rated on a scale of one to five (Vilares, Alonso, and Gómez Rodríguez 2015). IMDB: 50,000 English-language film reviews (Maas et al. 2011). |
| Dataset Splits | Yes | This classifier is trained using fivefold cross validation. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory) used for running experiments were mentioned in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python, specific libraries or frameworks with versions) were mentioned in the paper. |
| Experiment Setup | Yes | For the PROBLEX-MULT and PROBLEX-DCM methods, lexicon words which co-occur with the opposite lexicon at greater than chance frequency are eliminated from the lexicon in a preprocessing step. The penalty parameter ρ is initialized at 1, and then dynamically updated based on the primal and dual residuals. |
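
The Experiment Setup row says that the penalty parameter ρ starts at 1 and is updated from the primal and dual residuals, but it does not give the update rule. The sketch below shows one common choice, the residual-balancing heuristic often used with ADMM-style solvers. Treat it as an assumption rather than the paper's exact procedure: the function name `update_penalty` and the constants `mu` and `tau` are hypothetical and do not come from the source.

```python
def update_penalty(rho, primal_residual, dual_residual, mu=10.0, tau=2.0):
    """Residual-balancing update for an ADMM-style penalty parameter.

    Assumed rule (not confirmed by the paper):
      - grow rho when the primal residual dominates the dual residual,
      - shrink rho when the dual residual dominates the primal residual,
      - otherwise leave rho unchanged.
    `mu` and `tau` are illustrative constants, not values from the source.
    """
    if primal_residual > mu * dual_residual:
        return rho * tau
    if dual_residual > mu * primal_residual:
        return rho / tau
    return rho


# rho is initialized at 1, as stated in the Experiment Setup row.
rho = 1.0

# Residual norms below are made-up numbers used only to exercise each branch.
rho_primal_dominates = update_penalty(rho, primal_residual=5.0, dual_residual=0.1)
rho_dual_dominates = update_penalty(rho, primal_residual=0.1, dual_residual=5.0)
rho_balanced = update_penalty(rho, primal_residual=1.0, dual_residual=1.0)
```

Balancing the two residuals this way keeps either one from stalling convergence; the linked source code is the authority on what rule the experiments actually used.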