Distinguish Polarity in Bag-of-Words Visualization
Authors: Yusheng Xie, Zhengzhang Chen, Ankit Agrawal, Alok Choudhary
AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On a real Facebook corpus, our experiments show significant improvement in t-SNE visualization as a result of the proposed modification. We conduct experiments on two text corpora (MNIST is also used, but only to help the reader understand the problem in the familiar MNIST setting). Table 3 summarizes the quantitative assessments. |
| Researcher Affiliation | Collaboration | (1) Baidu Research, Sunnyvale, CA, USA; (2) Northwestern University, Evanston, IL, USA |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper links to 'Mikolov’s word2vec software' (https://code.google.com/p/word2vec/) but does not provide access to source code for the methodology described in the paper itself. |
| Open Datasets | Yes | We conduct experiments on two text corpora (MNIST is also used, but only to help the reader understand the problem in the familiar MNIST setting). The first corpus is the first 100 million bytes of the English Wikipedia dump (enwik100). |
| Dataset Splits | No | The paper does not explicitly provide details about training, validation, or test dataset splits for its experiments, only mentioning the training of word vectors on the full corpora. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper mentions 'Mikolov’s word2vec software' without a version number and refers to the L-BFGS algorithm without naming a specific software implementation or version. |
| Experiment Setup | Yes | We train word vectors on both enwik100 and enesfb260 using Mikolov’s word2vec software with modest settings: 100-dimensional vectors trained with negative sampling (25 negative samples per positive sample). Figure 4 shows the different sparsity and concentration levels in the neuron activation distribution achieved by different k values in the β term introduced in Equation 5, with a fixed sparsity level c = 0.4, after just 10 iterations of L-BFGS (Liu and Nocedal 1989); the authors note that k = 2 converges better than k = 1. (See the sketches after the table.) |
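
The word-vector settings quoted above translate directly into a modern training call. Below is a minimal sketch using gensim's `Word2Vec` as a stand-in for Mikolov's original C tool; the corpus path, tokenizer, and the skip-gram choice are assumptions, since the paper only fixes the dimensionality (100) and the negative-sampling rate (25).

```python
# Minimal sketch of the quoted word2vec settings, using gensim as a stand-in
# for Mikolov's original C tool (https://code.google.com/p/word2vec/).
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

corpus = LineSentence("enwik100.txt")  # hypothetical path to the Wikipedia dump
model = Word2Vec(
    corpus,
    vector_size=100,  # "100 dimensional vectors"
    negative=25,      # "25 negative samples per positive sample"
    sg=1,             # assumption: skip-gram (the paper does not say CBOW vs. skip-gram)
    workers=4,
)
model.wv.save_word2vec_format("enwik100_vectors.txt")
```

The convergence detail (10 iterations of L-BFGS) can likewise be checked with an off-the-shelf optimizer. The sketch below caps SciPy's `L-BFGS-B` at 10 iterations; the objective is a hypothetical placeholder, because the paper's Equation 5 (the β-penalized objective with sparsity level c = 0.4) is not reproduced in this report.

```python
# Sketch of capping L-BFGS at 10 iterations with SciPy. The objective is a
# placeholder; the paper's actual Equation 5 (the beta sparsity term, c = 0.4)
# is not reproduced here.
import numpy as np
from scipy.optimize import minimize

def objective(x):
    # Hypothetical stand-in for the paper's Equation 5.
    return np.sum((x - 1.0) ** 2)

x0 = np.zeros(100)  # assumption: one variable per embedding dimension
result = minimize(objective, x0, method="L-BFGS-B", options={"maxiter": 10})
print(result.nit, result.fun)  # iterations used and final objective value
```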