A Probabilistic Formulation of Unsupervised Text Style Transfer
Authors: Junxian He, Xinyi Wang, Graham Neubig, Taylor Berg-Kirkpatrick
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our model on five style transfer tasks: sentiment transfer, word substitution decipherment, formality transfer, author imitation, and related language translation. For completeness, we also evaluate on the task of general unsupervised machine translation using standard benchmarks. ... Across all style transfer tasks, our approach yields substantial gains over state-of-the-art non-generative baselines... Further, we conduct experiments on a standard unsupervised machine translation task and find that our unified approach matches the current state-of-the-art. |
| Researcher Affiliation | Academia | Junxian He, Xinyi Wang, Graham Neubig (Carnegie Mellon University) {junxianh,xinyiw1,gneubig}@cs.cmu.edu; Taylor Berg-Kirkpatrick (University of California San Diego) tberg@eng.ucsd.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and data are available at https://github.com/cindyxinyiwang/deep-latent-sequence-model. |
| Open Datasets | Yes | We use the Yelp reviews dataset collected by Shen et al. (2017) which contains 250K negative sentences and 380K positive sentences. We use the GYAFC dataset (Rao & Tetreault, 2018), which contains formal and informal sentences from two different domains. ... The dataset we use is a collection of Shakespeare's plays translated line by line into modern English. It was collected by Xu et al. (2012)... We use the data in (Yang et al., 2018) which provides 200K sentences from each domain. ... we also evaluate on the WMT 16 German English translation task. |
| Dataset Splits | Yes | For our method we select the model with the best validation ELBO, and for UNMT or BT+NLL we select the model with the best back-translation loss. ... GYAFC dataset (Rao & Tetreault, 2018), which contains formal and informal sentences from two different domains. In this paper, we use the Entertainment and Music domain, which has about 52K training sentences, 5K development sentences, and 2.5K test sentences. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using architectures like LSTM and the XLM codebase, but it does not specify version numbers for any software dependencies or libraries (e.g., Python, PyTorch, TensorFlow, etc.). |
| Experiment Setup | Yes | Complete model configurations and hyperparameters can be found in Appendix A.1. ... We use word embeddings of size 128. We use 1 layer LSTM with hidden size of 512 as both the encoder and decoder. We apply dropout to the readout states before softmax with a rate of 0.3. ... We vary the pooling window size as {1, 5}, the decaying patience hyperparameter k for the self-reconstruction loss (Eq. 4) as {1, 2, 3}. ... We vary the weight λ for the NLL term (BT+NLL) or the KL term (our method) as {0.001, 0.01, 0.03, 0.05, 0.1}. A configuration sketch based on these reported values follows the table. |
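
The Experiment Setup row quotes concrete hyperparameters from Appendix A.1 of the paper. Below is a minimal PyTorch sketch of an LSTM encoder-decoder wired up with those reported values (embedding size 128, 1-layer LSTM with hidden size 512, dropout 0.3 on the readout states before the softmax). The class name, variable names, and overall wiring are illustrative assumptions for clarity, not the authors' implementation; the actual code is in the linked repository.

```python
# Minimal sketch of the reported configuration (values from the paper's Appendix A.1).
# The model structure here is an assumption for illustration; see the authors' repo
# at https://github.com/cindyxinyiwang/deep-latent-sequence-model for the real code.
import torch
import torch.nn as nn


class StyleTransferLSTM(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden_dim=512, dropout=0.3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # 1-layer LSTM with hidden size 512 used as both encoder and decoder.
        self.encoder = nn.LSTM(emb_dim, hidden_dim, num_layers=1, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hidden_dim, num_layers=1, batch_first=True)
        # Dropout applied to the readout states before the softmax projection.
        self.readout_dropout = nn.Dropout(dropout)
        self.output_proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src_ids, tgt_ids):
        # Encode the source sentence and keep the final hidden/cell state.
        _, (h, c) = self.encoder(self.embedding(src_ids))
        # Decode the target sentence conditioned on the encoder's final state.
        dec_out, _ = self.decoder(self.embedding(tgt_ids), (h, c))
        # Dropout on readout states, then project to vocabulary logits.
        return self.output_proj(self.readout_dropout(dec_out))


# Hyperparameter grids reported in the paper:
POOL_WINDOW_SIZES = [1, 5]                    # pooling window size
PATIENCE_K = [1, 2, 3]                        # decay patience for self-reconstruction loss
LAMBDA_WEIGHTS = [0.001, 0.01, 0.03, 0.05, 0.1]  # weight for the KL (or NLL) term
```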