Mask and Infill: Applying Masked Language Model for Sentiment Transfer

Authors: Xing Wu, Tao Zhang, Liangjun Zang, Jizhong Han, Songlin Hu

Venue: IJCAI 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We evaluate our model on two review datasets with quantitative, qualitative, and human evaluations. Experimental results demonstrate that our models improve state-of-the-art performance." |
| Researcher Affiliation | Academia | ¹Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; ²School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China |
| Pseudocode | Yes | Algorithm 1 gives the implementation of the Mask and Infill approach (see the first sketch after this table). |
| Open Source Code | No | The paper neither links to its own source code nor states that the code is released; the links it does provide point to baseline models and evaluation tools. |
| Open Datasets | Yes | "We experiment our methods on two review datasets from [Li et al., 2018]: Yelp and Amazon [He and McAuley, 2016]." |
| Dataset Splits | Yes | Each dataset "is randomly split into training, validation and testing sets" (see the second sketch after this table). |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU or CPU models) used for the experiments. |
| Software Dependencies | No | The paper mentions a pre-trained BERT-base model and a CNN-based classifier, but lists no versioned software dependencies (e.g., PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | The input size is kept compatible with the original BERT, and the relevant hyperparameters can be found in [Devlin et al., 2018]. The pre-trained discriminator is a CNN-based classifier [Kim, 2014] with convolutional filters of sizes 3, 4, and 5 over WordPiece embeddings. The hyperparameters in Equations 10 and 11 are selected by grid search on the validation set. BERT is fine-tuned into an AC-MLM for 10 epochs, then trained for 6 further epochs under the discriminator constraint (see the third sketch after this table). |
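
The Pseudocode row refers to the paper's Algorithm 1, a two-step mask-and-infill procedure: sentiment-bearing words are first masked out, then an attribute-conditional masked language model (AC-MLM) fills the slots with words matching the target sentiment. Below is a minimal Python sketch of that pipeline. It is not the paper's implementation: the lexicon-based masker stands in for the paper's frequency- and attention-based sentiment-word detection, a plain `BertForMaskedLM` stands in for the AC-MLM (the target-attribute conditioning is omitted), and the function names and lexicon are illustrative.

```python
# Minimal sketch of the two-step mask-and-infill pipeline (cf. Algorithm 1).
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Hypothetical sentiment lexicon; the paper instead detects sentiment words
# with frequency-ratio and attention-score heuristics.
SENTIMENT_WORDS = {"terrible", "awful", "bad", "great", "amazing", "good"}

def mask_sentiment_words(sentence: str) -> str:
    """Mask stage: replace candidate sentiment words with [MASK]."""
    return " ".join(
        "[MASK]" if w.lower().strip(".,!?") in SENTIMENT_WORDS else w
        for w in sentence.split()
    )

def infill(masked_sentence: str) -> str:
    """Infill stage: a masked LM predicts tokens for the [MASK] slots.
    The paper's AC-MLM additionally conditions on the *target* sentiment
    label (via learned attribute embeddings); that is omitted here."""
    inputs = tokenizer(masked_sentence, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    ids = inputs["input_ids"][0].clone()
    mask_pos = (ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    ids[mask_pos] = logits[0, mask_pos].argmax(dim=-1)  # greedy infill
    return tokenizer.decode(ids, skip_special_tokens=True)

print(infill(mask_sentiment_words("The food was terrible and the service was awful.")))
```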
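
For the Dataset Splits row: the paper adopts the ready-made splits of [Li et al., 2018] rather than re-splitting the raw reviews, but a generic random training/validation/testing split looks like the following. The split fractions here are hypothetical, not values from the paper.

```python
import random

def split_dataset(examples, train_frac=0.8, valid_frac=0.1, seed=0):
    """Randomly partition examples into training/validation/testing sets."""
    data = list(examples)
    random.Random(seed).shuffle(data)  # fixed seed for a reproducible split
    n_train = int(len(data) * train_frac)
    n_valid = int(len(data) * valid_frac)
    return (data[:n_train],
            data[n_train:n_train + n_valid],
            data[n_train + n_valid:])

train, valid, test = split_dataset(range(1000))
print(len(train), len(valid), len(test))  # 800 100 100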
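
For the Experiment Setup row: the discriminator is a Kim-style text CNN with convolutional filters of sizes 3, 4, and 5 over WordPiece token embeddings. The sketch below follows that description; the embedding width and filter count are illustrative guesses (the paper does not report them), and the vocabulary size shown is BERT-base's 30,522 WordPieces.

```python
# Sketch of the CNN-based sentiment discriminator [Kim, 2014] from the setup.
import torch
import torch.nn as nn

class TextCNNDiscriminator(nn.Module):
    def __init__(self, vocab_size=30522, embed_dim=128,
                 num_filters=100, filter_sizes=(3, 4, 5), num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)  # WordPiece ids
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, kernel_size=k) for k in filter_sizes
        )
        self.classifier = nn.Linear(num_filters * len(filter_sizes), num_classes)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)  # (batch, embed, seq)
        # One max-pooled feature vector per filter size, then concatenate.
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.classifier(torch.cat(pooled, dim=1))  # sentiment logits

logits = TextCNNDiscriminator()(torch.randint(0, 30522, (4, 32)))
print(logits.shape)  # torch.Size([4, 2])
```

During the 6 additional training epochs described in the setup, this pre-trained classifier supplies the "discriminator constraint": its sentiment loss on the infilled output pushes the AC-MLM toward the target attribute.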