Educating Text Autoencoders: Latent Representation Guidance via Denoising
Authors: Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our systematic evaluations demonstrate that DAAE maintains the best trade-off between producing high-quality text vs. informative sentence representations. We further investigate the extent to which text can be manipulated via simple transformations of latent representations. |
| Researcher Affiliation | Collaboration | ¹MIT CSAIL, ²Amazon Web Services. Correspondence to: Tianxiao Shen <tianxiao@mit.edu>. |
| Pseudocode | No | The paper describes algorithms but does not include pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code and data are available at https://github.com/shentianxiao/text-autoencoders |
| Open Datasets | Yes | The Yelp dataset is from Shen et al. (2017), which has 444K/63K/127K sentences of less than 16 words in length as train/dev/test sets, with a vocabulary of 10K. Our second dataset of Yahoo answers is from Yang et al. (2017). It was originally document-level. We perform sentence segmentation and keep sentences with length from 2 to 50 words. The resulting dataset has 495K/49K/50K sentences for train/dev/test sets, with vocabulary size 20K. |
| Dataset Splits | Yes | The Yelp dataset is from Shen et al. (2017), which has 444K/63K/127K sentences of less than 16 words in length as train/dev/test sets, with a vocabulary of 10K. Our second dataset of Yahoo answers is from Yang et al. (2017). It was originally document-level. We perform sentence segmentation and keep sentences with length from 2 to 50 words. The resulting dataset has 495K/49K/50K sentences for train/dev/test sets, with vocabulary size 20K. (A sketch of this preprocessing follows the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions RNNs and Transformer models as architectures, but does not provide specific version numbers for software dependencies or libraries used. |
| Experiment Setup | Yes | Hyperparameters are set to values that produce the best overall generative models (see Section 5.2). Based on these results, we set β = 0.15 for β-VAE, λ1 = 0.05 for LAAE, and p = 0.3 for DAAE in the neighborhood preservation and text manipulation experiments, to ensure they have strong reconstruction abilities and encode enough information about data. (A sketch of the DAAE noising with p = 0.3 also follows the table.) |
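To make the Dataset Splits row concrete, here is a minimal sketch of the Yahoo preprocessing the paper describes: segment documents into sentences, then keep sentences of 2 to 50 words. The function name and the NLTK tokenizer are illustrative assumptions; the paper does not name its segmentation tooling.

```python
import nltk
# nltk.download("punkt")  # one-time download of the sentence tokenizer (assumption: NLTK is used)

def preprocess_documents(documents, min_len=2, max_len=50):
    """Segment documents into sentences and keep those with min_len-max_len words.

    Hypothetical helper mirroring the preprocessing described in the paper.
    """
    kept = []
    for doc in documents:
        for sent in nltk.sent_tokenize(doc):
            n_words = len(sent.split())
            if min_len <= n_words <= max_len:
                kept.append(sent)
    return kept
```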
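The Experiment Setup row fixes p = 0.3 for DAAE. As we understand the paper's setup, DAAE perturbs each input sentence by randomly deleting words before reconstruction; the sketch below assumes independent per-token deletion with probability p over a tokenized sentence. The function name and the non-empty-output safeguard are illustrative additions, not specified by the paper.

```python
import random

def perturb(tokens, p=0.3, rng=random):
    """Delete each token independently with probability p (DAAE-style noise)."""
    kept = [tok for tok in tokens if rng.random() >= p]
    # Safeguard (ours, not the paper's): never feed the model an empty sequence.
    return kept if kept else [rng.choice(tokens)]

# Example: one noisy view of a training sentence at the paper's p = 0.3.
noisy = perturb("the food was great and the staff were friendly".split(), p=0.3)
```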