Distinguish Polarity in Bag-of-Words Visualization

Authors: Yusheng Xie, Zhengzhang Chen, Ankit Agrawal, Alok Choudhary

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | On a real Facebook corpus, our experiments show significant improvement in t-SNE visualization as a result of the proposed modification. We conduct experiments on two text corpora (MNIST is also used, but only to help the reader understand the problem in the familiar MNIST setting). Table 3 summarizes the quantitative assessments. |
| Researcher Affiliation | Collaboration | Baidu Research, Sunnyvale, CA, USA; Northwestern University, Evanston, IL, USA |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper links to Mikolov’s word2vec software (https://code.google.com/p/word2vec/) but does not provide access to source code for the methodology described in the paper. |
| Open Datasets | Yes | We conduct experiments on two text corpora (MNIST is also used, but only to help the reader understand the problem in the familiar MNIST setting). The first corpus is the first 100 million bytes of the English Wikipedia dump (enwik100). See the preparation sketch after the table. |
| Dataset Splits | No | The paper does not explicitly describe training, validation, or test splits for its experiments; it mentions only that word vectors are trained on the full corpora. |
| Hardware Specification | No | The paper does not specify the hardware used to run the experiments. |
| Software Dependencies | No | The paper mentions Mikolov’s word2vec software without a version number and refers to the L-BFGS algorithm without naming a specific implementation or version. |
| Experiment Setup | Yes | We train word vectors on both enwik100 and enesfb260 using Mikolov’s word2vec software with modest settings: 100-dimensional vectors trained with negative sampling (25 negative samples per positive sample). Figure 4 shows different sparsity and concentration levels in the neuron activation distribution achieved by different k values in the β term introduced in Equation 5, with a fixed sparsity level c = 0.4, after just 10 iterations of L-BFGS (Liu and Nocedal 1989), where k = 2 converges better than k = 1. See the training sketch after the table. |
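
For the Open Datasets row: the paper describes enwik100 as the first 100 million bytes of the English Wikipedia dump. Below is a minimal sketch of how such a file might be prepared from a locally downloaded dump; the file names are illustrative assumptions, not from the paper.

```python
# Minimal sketch: keep only the first 100 million bytes of a locally
# downloaded English Wikipedia dump, matching the paper's description
# of enwik100. File names are assumptions for illustration.
N_BYTES = 100_000_000  # first 100 million bytes

with open("enwik.dump", "rb") as src, open("enwik100", "wb") as dst:
    remaining = N_BYTES
    while remaining > 0:
        chunk = src.read(min(1 << 20, remaining))  # copy in 1 MiB pieces
        if not chunk:
            break  # dump shorter than 100 MB
        dst.write(chunk)
        remaining -= len(chunk)
```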
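For the Experiment Setup row: the paper trains word vectors with Mikolov’s word2vec tool using 100-dimensional vectors and negative sampling (25 negative samples per positive sample). Below is a minimal sketch of an equivalent configuration in gensim, used here as a stand-in for the original C tool; hyperparameters the paper does not state (skip-gram vs. CBOW, window size, worker count, file names) are assumptions.

```python
# Minimal sketch of the paper's reported word2vec settings, using gensim
# as a stand-in for Mikolov's original C tool. Only vector_size and
# negative come from the paper; everything else is an assumption.
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

sentences = LineSentence("enwik100")  # assumed: whitespace-tokenized text, one sentence per line

model = Word2Vec(
    sentences,
    vector_size=100,  # "100 dimensional vectors" (from the paper)
    negative=25,      # "25 negative samples per positive sample" (from the paper)
    hs=0,             # negative sampling instead of hierarchical softmax
    sg=1,             # assumed: skip-gram; the paper does not say
    window=5,         # assumed: gensim default
    workers=4,        # assumed
)
model.wv.save_word2vec_format("vectors.txt")
```

The saved vectors would then serve as the high-dimensional input to the t-SNE visualization that the paper's proposed modification targets.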