Educating Text Autoencoders: Latent Representation Guidance via Denoising

Authors: Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola

ICML 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Our systematic evaluations demonstrate that the DAAE maintains the best trade-off between producing high-quality text and informative sentence representations. We further investigate the extent to which text can be manipulated via simple transformations of latent representations. |
| Researcher Affiliation | Collaboration | MIT CSAIL; Amazon Web Services. Correspondence to: Tianxiao Shen <tianxiao@mit.edu>. |
| Pseudocode | No | The paper describes its algorithms in prose but does not include pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code and data are available at https://github.com/shentianxiao/text-autoencoders |
| Open Datasets | Yes | The Yelp dataset is from Shen et al. (2017), with 444K/63K/127K sentences of fewer than 16 words as train/dev/test sets and a vocabulary of 10K. The second dataset, Yahoo answers, is from Yang et al. (2017). It was originally document-level; the authors perform sentence segmentation and keep sentences of 2 to 50 words, yielding 495K/49K/50K sentences for train/dev/test sets with a vocabulary of 20K. |
| Dataset Splits | Yes | The Yelp dataset is from Shen et al. (2017), with 444K/63K/127K sentences of fewer than 16 words as train/dev/test sets and a vocabulary of 10K. The second dataset, Yahoo answers, is from Yang et al. (2017). It was originally document-level; the authors perform sentence segmentation and keep sentences of 2 to 50 words, yielding 495K/49K/50K sentences for train/dev/test sets with a vocabulary of 20K. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used to run its experiments. |
| Software Dependencies | No | The paper mentions RNN and Transformer architectures but does not provide version numbers for the software dependencies or libraries used. |
| Experiment Setup | Yes | Hyperparameters are set to the values that produce the best overall generative models (see Section 5.2). Based on these results, β = 0.15 for β-VAE, λ1 = 0.05 for LAAE, and p = 0.3 for DAAE are used in the neighborhood preservation and text manipulation experiments, to ensure the models have strong reconstruction ability and encode enough information about the data. |
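The word-drop rate p = 0.3 quoted above controls the DAAE's input corruption: each token of a sentence is independently deleted before encoding, and the model is trained to reconstruct the clean sentence. As a minimal illustrative sketch (not the authors' code; the function name and the survivor guard are assumptions for this example):

```python
import random

def drop_words(tokens, p=0.3, seed=None):
    """DAAE-style word-drop noise: delete each token independently
    with probability p. Keeps at least one token so the encoder
    always receives a non-empty input (an assumption of this sketch)."""
    rng = random.Random(seed)
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]

# The denoising objective then trains the autoencoder to map the
# corrupted sentence back to the original:
#   decoder(encoder(drop_words(x))) should reconstruct x.
sentence = "the food was great and the service was friendly".split()
noisy = drop_words(sentence, p=0.3, seed=0)
```

With p = 0 the corruption is the identity and the model reduces to a plain autoencoder, which is why the paper tunes p alongside the β-VAE and LAAE regularization weights.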