Distinguish Polarity in Bag-of-Words Visualization

Authors: Yusheng Xie, Zhengzhang Chen, Ankit Agrawal, Alok Choudhary

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | On a real Facebook corpus, our experiments show significant improvement in t-SNE visualization as a result of the proposed modification. We conduct experiments on two text corpora (MNIST is also used, but only to help the reader understand the problem in the familiar MNIST setting). Table 3 summarizes the quantitative assessments. |
| Researcher Affiliation | Collaboration | Baidu Research, Sunnyvale, CA, USA; Northwestern University, Evanston, IL, USA |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper links to Mikolov’s word2vec software (https://code.google.com/p/word2vec/) but does not provide access to source code for the methodology described in the paper. |
| Open Datasets | Yes | We conduct experiments on two text corpora (MNIST is also used, but only to help the reader understand the problem in the familiar MNIST setting). The first corpus is the first 100 million bytes of the English Wikipedia dump (enwik100). See the preparation sketch after the table. |
| Dataset Splits | No | The paper does not explicitly describe training, validation, or test splits for its experiments; it mentions only that word vectors are trained on the full corpora. |
| Hardware Specification | No | The paper does not specify the hardware used to run the experiments. |
| Software Dependencies | No | The paper mentions Mikolov’s word2vec software without a version number and refers to the L-BFGS algorithm without naming a specific implementation or version. |
| Experiment Setup | Yes | We train word vectors on both enwik100 and enesfb260 using Mikolov’s word2vec software with modest settings: 100-dimensional vectors trained with negative sampling (25 negative samples per positive sample). Figure 4 shows different sparsity and concentration levels in the neuron activation distribution achieved by different k values in the β term introduced in Equation 5, with a fixed sparsity level c = 0.4, after just 10 iterations of L-BFGS (Liu and Nocedal 1989), where k = 2 converges better than k = 1. See the training sketch after the table. |
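
For the Open Datasets row: the paper describes enwik100 as the first 100 million bytes of the English Wikipedia dump. Below is a minimal sketch of how such a file might be prepared from a locally downloaded dump; the file names are illustrative assumptions, not from the paper.

```python
# Minimal sketch: keep only the first 100 million bytes of a locally
# downloaded English Wikipedia dump, matching the paper's description
# of enwik100. File names are assumptions for illustration.
N_BYTES = 100_000_000  # first 100 million bytes

with open("enwik.dump", "rb") as src, open("enwik100", "wb") as dst:
    remaining = N_BYTES
    while remaining > 0:
        chunk = src.read(min(1 << 20, remaining))  # copy in 1 MiB pieces
        if not chunk:
            break  # dump shorter than 100 MB
        dst.write(chunk)
        remaining -= len(chunk)
```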
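For the Experiment Setup row: the paper trains word vectors with Mikolov’s word2vec tool using 100-dimensional vectors and negative sampling (25 negative samples per positive sample). Below is a minimal sketch of an equivalent configuration in gensim, used here as a stand-in for the original C tool; hyperparameters the paper does not state (skip-gram vs. CBOW, window size, worker count, file names) are assumptions.

```python
# Minimal sketch of the paper's reported word2vec settings, using gensim
# as a stand-in for Mikolov's original C tool. Only vector_size and
# negative come from the paper; everything else is an assumption.
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

sentences = LineSentence("enwik100")  # assumed: whitespace-tokenized text, one sentence per line

model = Word2Vec(
    sentences,
    vector_size=100,  # "100 dimensional vectors" (from the paper)
    negative=25,      # "25 negative samples per positive sample" (from the paper)
    hs=0,             # negative sampling instead of hierarchical softmax
    sg=1,             # assumed: skip-gram; the paper does not say
    window=5,         # assumed: gensim default
    workers=4,        # assumed
)
model.wv.save_word2vec_format("vectors.txt")
```

The saved vectors would then serve as the high-dimensional input to the t-SNE visualization that the paper's proposed modification targets.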