Variational Autoencoder for Semi-Supervised Text Classification

Authors: Weidi Xu, Haoze Sun, Chao Deng, Ying Tan

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on the Large Movie Review Dataset (IMDB) and the AG's News corpus show that the proposed approach significantly improves classification accuracy compared with purely supervised classifiers, and achieves competitive performance against previous advanced methods.
Researcher Affiliation | Academia | Weidi Xu, Haoze Sun, Chao Deng, Ying Tan; Key Laboratory of Machine Perception (Ministry of Education), School of Electronics Engineering and Computer Science, Peking University, Beijing, 100871, China. wead_hsu@pku.edu.cn, pkucissun@foxmail.com, cdspace678@pku.edu.cn, ytan@pku.edu.cn
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It provides mathematical equations for specific components such as CLSTM-II, but no full algorithm.
Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the methodology is publicly available.
Open Datasets | Yes | Large Movie Review Dataset (IMDB) (Maas et al. 2011) and AG's News corpus (Zhang, Zhao, and LeCun 2015).
Dataset Splits | Yes | In both datasets, 20% of the samples in the training set are split off as a validation set (a minimal split sketch appears below the table).
Hardware Specification | Yes | Table 5 reports the time cost of training one epoch with different optimization methods on an Nvidia GTX Titan-X GPU.
Software Dependencies | No | The system was implemented using Theano (Bastien et al. 2012; Bergstra et al. 2010) and Lasagne (Dieleman et al. 2015), but specific version numbers for these components are not provided.
Experiment Setup | Yes | The models were trained end-to-end with the ADAM optimizer (Kingma and Ba 2015) at a learning rate of 4e-3. The cost-annealing trick (Bowman et al. 2016; Kaae Sønderby et al. 2016) was adopted to smooth training by gradually increasing the weight of the KL cost from zero to one. The word-dropout technique (Bowman et al. 2016) was also used, with the rate scaled from 0.25 to 0.5, and the hyper-parameter α was scaled from 1 to 2. Dropout (Srivastava et al. 2014) and batch normalization (Ioffe and Szegedy 2015) were applied to the output of the word-embedding projection layer and to the feature vectors serving as inputs and outputs of the MLP preceding the final layer. All experiments used 512 units for the memory cells, 300 units for the input embedding projection layer, and 50 units for the latent variable z (a hedged configuration sketch appears below the table).
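The 20% validation split noted in the Dataset Splits row is straightforward to reproduce. The snippet below is a minimal sketch using scikit-learn's train_test_split; the random seed and the stratification by label are assumptions for reproducibility, not details taken from the paper.

```python
from sklearn.model_selection import train_test_split


def make_validation_split(texts, labels, valid_fraction=0.2, seed=0):
    """Split 20% of the labelled training data off as a validation set.

    Only the 20% ratio comes from the paper; the seed and the
    stratification by label are illustrative assumptions.
    """
    train_texts, valid_texts, train_labels, valid_labels = train_test_split(
        texts,
        labels,
        test_size=valid_fraction,
        random_state=seed,
        stratify=labels,
    )
    return train_texts, valid_texts, train_labels, valid_labels
```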
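For orientation, the sketch below collects the reported experiment-setup values into a small Python configuration with scheduling helpers. Only the numeric settings (Adam at 4e-3, KL weight annealed from 0 to 1, word dropout scaled from 0.25 to 0.5, α scaled from 1 to 2, 512 memory-cell units, a 300-unit embedding projection, and a 50-dimensional latent variable) come from the paper; the linear shape of the schedules and all names such as CONFIG and linear_ramp are assumptions made here for illustration.

```python
# Hyper-parameters reported in the paper's experiment setup. The schedule
# shapes (linear ramps) are assumptions; the paper only gives the endpoints.
CONFIG = {
    "optimizer": "adam",
    "learning_rate": 4e-3,
    "lstm_hidden_units": 512,        # memory cells
    "embedding_units": 300,          # input embedding projection layer
    "latent_dim": 50,                # latent variable z
    "kl_weight_range": (0.0, 1.0),   # cost annealing of the KL term
    "word_dropout_range": (0.25, 0.5),
    "alpha_range": (1.0, 2.0),       # hyper-parameter alpha
}


def linear_ramp(step, total_steps, start, end):
    """Linearly interpolate a scheduled value from start to end.

    A linear ramp is an assumption; the paper states only that these
    quantities are gradually scaled between the given endpoints.
    """
    fraction = min(max(step / float(total_steps), 0.0), 1.0)
    return start + fraction * (end - start)


def scheduled_values(step, total_steps, config=CONFIG):
    """Return the annealed KL weight, word-dropout rate, and alpha at a step."""
    return {
        "kl_weight": linear_ramp(step, total_steps, *config["kl_weight_range"]),
        "word_dropout": linear_ramp(step, total_steps, *config["word_dropout_range"]),
        "alpha": linear_ramp(step, total_steps, *config["alpha_range"]),
    }
```

In a training loop built on this sketch, the KL term of the variational objective would be multiplied by kl_weight, decoder input tokens would be dropped (replaced by an unknown token) with probability word_dropout, and alpha would weight the supervised classification term, which is how these three scheduled quantities are typically used in semi-supervised VAEs of this kind.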