Controllable Unsupervised Text Attribute Transfer via Editing Entangled Latent Representation
Authors: Ke Wang, Hang Hua, Xiaojun Wan
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results demonstrate that our model achieves very competitive performance on three public data sets. Furthermore, we also show that our model can not only control the degree of transfer freely but also allow transferring over multiple aspects at the same time. |
| Researcher Affiliation | Academia | Ke Wang Hang Hua Xiaojun Wan Wangxuan Institute of Computer Technology, Peking University The MOE Key Laboratory of Computational Linguistics, Peking University {wangke17, huahang, wanxiaojun}@pku.edu.cn |
| Pseudocode | Yes | Algorithm 1 Fast Gradient Iterative Modification Algorithm. (A hedged sketch of this procedure is given after the table.) |
| Open Source Code | Yes | Our codes are available at https://github.com/Nrgeup/controllable-text-attribute-transfer |
| Open Datasets | Yes | We use datasets provided in Li et al. [17] for sentiment and style transfer experiments, where the test sets contain human-written references. Yelp: This dataset consists of Yelp reviews for flipping sentiment... Amazon: This dataset consists of product reviews from Amazon [9]... Captions: This dataset consists of image captions [6]... Beer Advocate dataset, which was scraped from Beer Advocate [19]. |
| Dataset Splits | Yes | The statistics of the above three datasets are shown in Table 1 (columns: Dataset, Styles, #Train, #Dev, #Test, #Vocab, Max-Length, Mean-Length). Yelp: Negative 180,000 train / 2,000 dev / 500 test; Positive 270,000 train / 2,000 dev / 500 test. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running experiments. |
| Software Dependencies | Yes | We implement our model based on Pytorch 0.4. |
| Experiment Setup | Yes | In our Transformer-based autoencoder, the embedding size, the latent size and the dimension size of self-attention are all set to 256. The hidden size of GRU and batch-size are set to 128. The inner dimension of Feed-Forward Networks (FFN) in Transformer is set to 1024. Besides, each of the encoder and decoder is stacked by two layers of Transformer. The smoothing parameter ε is set to 0.1. For the classifier, the dimensions of the two linear layers are 100 and 50. For our FGIM, the weight set w, the threshold t and the decay coefficient λ are set to {1.0, 2.0, 3.0, 4.0, 5.0, 6.0}, 0.001 and 0.9, respectively. The optimizer we use is Adam [15] and the initial learning rate is 0.001. (These values are also collected into a configuration sketch after the table.) |
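
The Fast Gradient Iterative Modification (FGIM) procedure referenced in the Pseudocode row can be summarized in a short PyTorch sketch. This is a minimal reconstruction assuming a latent vector `z`, an attribute classifier that outputs a probability in [0, 1], and the hyperparameters quoted above (weight set {1.0, ..., 6.0}, threshold 0.001, decay 0.9); the function name `fgim`, the `max_steps` cap, and all argument names are illustrative and are not taken from the authors' released code (available at the linked repository).

```python
# Hedged sketch of Fast Gradient Iterative Modification (FGIM), assuming a
# classifier that maps a latent vector to an attribute probability in [0, 1].
import torch
import torch.nn.functional as F

def fgim(z, classifier, target_label, weights=(1.0, 2.0, 3.0, 4.0, 5.0, 6.0),
         threshold=0.001, decay=0.9, max_steps=30):
    """Iteratively edit latent z until the classifier predicts target_label.

    z:            latent representation from the encoder, shape (batch, latent_size)
    classifier:   attribute classifier returning probabilities of shape (batch, 1)
    target_label: desired attribute value (0.0 or 1.0)
    """
    target = torch.full((z.size(0), 1), float(target_label), device=z.device)
    z_prime = z.detach()
    for w in weights:                          # try increasingly large step weights
        z_prime = z.clone().detach().requires_grad_(True)
        step = w
        for _ in range(max_steps):
            prob = classifier(z_prime)                      # current attribute probability
            loss = F.binary_cross_entropy(prob, target)     # distance to the target attribute
            grad, = torch.autograd.grad(loss, z_prime)
            z_prime = (z_prime - step * grad).detach().requires_grad_(True)
            if torch.abs(classifier(z_prime) - target).max() < threshold:
                return z_prime.detach()        # classifier is confident enough; stop editing
            step *= decay                      # decay the modification weight each iteration
    return z_prime.detach()                    # fall back to the last edit if never converged
```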
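
For convenience, the hyperparameters quoted in the Experiment Setup row can be gathered into a single configuration object. This is a reference sketch only; the field names are chosen for illustration and do not come from the released code.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class ExperimentConfig:
    # Transformer-based autoencoder (values quoted from the paper's setup)
    embedding_size: int = 256        # token embedding size
    latent_size: int = 256           # latent representation size
    attention_size: int = 256        # dimension of self-attention
    ffn_inner_size: int = 1024       # inner dimension of the Feed-Forward Networks
    num_layers: int = 2              # encoder and decoder each stack two Transformer layers
    gru_hidden_size: int = 128
    batch_size: int = 128
    label_smoothing: float = 0.1     # smoothing parameter epsilon

    # Attribute classifier
    classifier_dims: Tuple[int, int] = (100, 50)   # two linear layers

    # Fast Gradient Iterative Modification (FGIM)
    fgim_weights: Tuple[float, ...] = (1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
    fgim_threshold: float = 0.001
    fgim_decay: float = 0.9

    # Optimization
    optimizer: str = "Adam"
    learning_rate: float = 0.001
```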