Improved Variational Autoencoders for Text Modeling using Dilated Convolutions
Authors: Zichao Yang, Zhiting Hu, Ruslan Salakhutdinov, Taylor Berg-Kirkpatrick
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, we find that there is a trade-off between contextual capacity of the decoder and effective use of encoding information. We demonstrate perplexity gains on two datasets, representing the first positive language modeling result with VAE. Further, we conduct an in-depth investigation of the use of VAE (with our new decoding architecture) for semi-supervised and unsupervised labeling tasks, demonstrating gains over several strong baselines. (A hedged sketch of such a dilated decoder follows this table.) |
| Researcher Affiliation | Academia | Carnegie Mellon University. Correspondence to: Zichao Yang <zichaoy@cs.cmu.edu>. |
| Pseudocode | No | The paper describes the model architecture and training procedures in detail but does not provide pseudocode or a clearly labeled algorithm block. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We use two large scale document classification data sets: Yahoo Answer and Yelp15 review, representing topic classification and sentiment classification data sets respectively (Tang et al., 2015; Yang et al., 2016; Zhang et al., 2015). |
| Dataset Splits | Yes | The original data sets contain millions of samples, of which we sample 100k as training and 10k as validation and test from the respective partitions. |
| Hardware Specification | No | The paper discusses model configurations and training details but does not provide any specific hardware details such as GPU or CPU models, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper mentions using Adam for optimization but does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | We use a vocabulary size of 20k for both data sets and set the word embedding dimension to be 512. The LSTM dimension is 1024. ... We use Adam (Kingma & Ba, 2014) to optimize all models and the learning rate is selected from [2e-3, 1e-3, 7.5e-4] and β1 is selected from [0.5, 0.9]. ... We select the dropout ratio of the LSTMs (both encoder and decoder) from [0.3, 0.5]. ... We use a batch size of 32 and all models are trained for 40 epochs. (A hedged configuration sketch capturing these settings also follows this table.) |
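
The decoder trade-off quoted under Research Type hinges on the dilated causal convolutions named in the paper's title: stacking convolutions with growing dilation widens the decoder's receptive field (its "contextual capacity"), which the authors vary against how much the latent code is used. The sketch below is a minimal, hypothetical PyTorch rendering of such a decoder stack; the channel width, kernel size, and dilation schedule are illustrative assumptions, not the authors' released configuration.

```python
# Hedged sketch (not the authors' code): a minimal stack of dilated causal
# Conv1d layers, illustrating how the dilation schedule controls the
# decoder's contextual capacity (receptive field).
import torch
import torch.nn as nn


class DilatedCausalDecoder(nn.Module):
    def __init__(self, channels=512, kernel_size=3, dilations=(1, 2, 4, 8)):
        super().__init__()
        layers = []
        for d in dilations:
            # Left-pad so each position only attends to past tokens (causal).
            layers.append(nn.ConstantPad1d(((kernel_size - 1) * d, 0), 0.0))
            layers.append(nn.Conv1d(channels, channels, kernel_size, dilation=d))
            layers.append(nn.ReLU())
        self.net = nn.Sequential(*layers)
        # Receptive field of the stacked dilated convolutions:
        # 1 + (kernel_size - 1) * sum(dilations) time steps.
        self.receptive_field = 1 + (kernel_size - 1) * sum(dilations)

    def forward(self, x):  # x: (batch, channels, seq_len)
        return self.net(x)


# A shallower dilation schedule gives a weaker decoder, which (per the paper's
# finding) pushes more information into the latent code; deeper schedules
# approach LSTM-like contextual capacity.
decoder = DilatedCausalDecoder(dilations=(1, 2, 4, 8))
print(decoder.receptive_field)  # 31 time steps for this schedule
```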
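
For the Experiment Setup row, the reported hyperparameters and search ranges can be summarized as a small configuration grid. The snippet below is a hedged sketch under the assumption of a plain dictionary-based config; the key names and the `configurations` helper are hypothetical, and the VAE training loop itself is not specified in the quoted excerpt.

```python
# Hedged sketch of the reported experiment setup: vocab 20k, 512-d embeddings,
# 1024-d LSTM, Adam with lr in {2e-3, 1e-3, 7.5e-4} and beta1 in {0.5, 0.9},
# LSTM dropout in {0.3, 0.5}, batch size 32, 40 training epochs.
from itertools import product

BASE_CONFIG = {
    "vocab_size": 20_000,
    "embedding_dim": 512,
    "lstm_dim": 1024,
    "batch_size": 32,
    "epochs": 40,
}

SEARCH_SPACE = {
    "learning_rate": [2e-3, 1e-3, 7.5e-4],
    "adam_beta1": [0.5, 0.9],
    "lstm_dropout": [0.3, 0.5],
}


def configurations():
    """Yield one full config per point in the reported search grid."""
    keys = list(SEARCH_SPACE)
    for values in product(*(SEARCH_SPACE[k] for k in keys)):
        yield {**BASE_CONFIG, **dict(zip(keys, values))}


for cfg in configurations():
    # Placeholder: the actual VAE training loop is not described in the quote.
    print(cfg)
```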