Semisupervised Autoencoder for Sentiment Analysis

Authors: Shuangfei Zhai, Zhongfei Zhang

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the effectiveness of our model on six sentiment analysis datasets, and show that our model significantly outperforms all the competing methods with respect to classification accuracy. We also show that our model is able to take advantage of unlabeled data and get improved performance.
Researcher Affiliation | Academia | Shuangfei Zhai, Zhongfei (Mark) Zhang, Computer Science Department, Binghamton University, 4400 Vestal Pkwy E, Binghamton, NY 13902; szhai2@binghamton.edu, zhongfei@cs.binghamton.edu
Pseudocode | No | The paper includes mathematical equations and descriptions of methods but does not provide any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the described methodology, nor does it include any links to a code repository.
Open Datasets | Yes | We evaluate our model on six sentiment analysis benchmarks. The first is the IMDB dataset (Maas et al. 2011), available at http://ai.stanford.edu/~amaas/data/sentiment/. The remaining five datasets are all collected from Amazon (Blitzer, Dredze, and Pereira 2007), available at http://www.cs.jhu.edu/~mdredze/datasets/sentiment/.
Dataset Splits | Yes | Table 1 reports the dataset statistics: IMDB has 25,000 training, 25,000 test, and 50,000 unlabeled examples; books has 10,000 training and 3,105 test examples.
Hardware Specification | No | The paper does not specify any hardware used for the experiments, such as GPU or CPU models or cloud computing resources.
Software Dependencies | No | The paper mentions software components such as SVM, Logistic Regression, ReLU, Sigmoid, and Stochastic Gradient Descent with momentum, but it does not specify version numbers for these or any other software libraries or frameworks.
Experiment Setup | Yes | We use ReLU, max(0, x), as the activation function and Sigmoid as the decoding function. For SBDAE and NN, a small hidden size is sufficient, so we use 200. For DAE, we observe that it benefits from very large hidden sizes; however, due to computational constraints, we take 2000. All models are trained with mini-batch Stochastic Gradient Descent with momentum 0.9. We cross-validate β over the set {10^4, 10^5, 10^6, 10^7, 10^8}. (A minimal configuration sketch follows this table.)
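
Since no source code accompanies the paper, the sketch below only illustrates the configuration quoted in the Experiment Setup row: a ReLU encoder, a Sigmoid decoding function, hidden size 200 (2000 for the plain DAE baseline), mini-batch SGD with momentum 0.9, and the β grid. It is a minimal PyTorch sketch under our own assumptions; the class name, corruption rate, learning rate, epoch count, and placeholder reconstruction loss are ours, and the paper's SBDAE Bregman-divergence objective is not reproduced here.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DAE(nn.Module):
    """Denoising autoencoder: ReLU encoder, Sigmoid decoding, per the quoted setup."""
    def __init__(self, input_dim, hidden_dim=200):  # 200 for SBDAE/NN; the DAE baseline uses 2000
        super().__init__()
        self.enc = nn.Linear(input_dim, hidden_dim)
        self.dec = nn.Linear(hidden_dim, input_dim)

    def forward(self, x, corrupt_p=0.3):  # corruption rate is an assumption
        # Dropout stands in for the denoising corruption of the input.
        x_tilde = F.dropout(x, p=corrupt_p, training=self.training)
        h = F.relu(self.enc(x_tilde))        # ReLU activation, max(0, x)
        return torch.sigmoid(self.dec(h))    # Sigmoid decoding function

def train(model, loader, epochs=10, lr=0.01):  # epochs and lr are assumptions
    # Mini-batch SGD with momentum 0.9, as stated in the paper.
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):
        for (x,) in loader:
            # Placeholder reconstruction loss; the paper instead minimizes a
            # Bregman divergence derived from a pretrained classifier (SBDAE).
            loss = F.binary_cross_entropy(model(x), x.clamp(0, 1))
            opt.zero_grad()
            loss.backward()
            opt.step()

# β grid quoted in the paper (superscripts reconstructed from the flattened text).
beta_grid = [1e4, 1e5, 1e6, 1e7, 1e8]

For example, the 2000-unit DAE baseline from the quoted setup would be DAE(input_dim, hidden_dim=2000), where input_dim is the (assumed) bag-of-words vocabulary size.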