Writing Polishment with Simile: Task, Dataset and A Neural Approach
Authors: Jiayi Zhang, Zhi Cui, Xiaoqiang Xia, Yalong Guo, Yanran Li, Chen Wei, Jianwei Cui (pp. 14383-14392)
AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results demonstrate the feasibility of WPS task and shed light on the future research directions towards better automatic text polishment. |
| Researcher Affiliation | Industry | Jiayi Zhang, Zhi Cui, Xiaoqiang Xia, Yalong Guo, Yanran Li, Chen Wei and Jianwei Cui Xiaomi AI Lab, Beijing, China {zhangjiayi3, cuizhi, liyanran, weichen, cuijianwei}@xiaomi.com |
| Pseudocode | No | The paper describes the model architecture using text and equations but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We develop1 a large-scale Chinese Simile (CS) dataset for public research, which contains millions of similes with contexts extracted from Chinese online fictions. 1https://github.com/mrzjy/writing-polishment-with-simile.git |
| Open Datasets | Yes | We develop1 a large-scale Chinese Simile (CS) dataset for public research, which contains millions of similes with contexts extracted from Chinese online fictions. 1https://github.com/mrzjy/writing-polishment-with-simile.git |
| Dataset Splits | Yes | Data split: Train 5,485,721 / Dev 2,500 / Test 2,500; Avg. length: Context 52.7 / Simile 8.3; # Uniq. similes: 3M |
| Hardware Specification | Yes | All models are implemented in Tensorflow3 and trained on Nvidia Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions 'TensorFlow' and 'BERT tokenizer' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | We use dropout of 0.1, and Adam optimizer (Kingma and Ba 2015) with a mini-batch of 128. The max context and target length are set to 128 and 16 respectively. For Match BERT, we train with random negative sampling of size 5. Without hyper-parameter tuning, we set the learning rate to 5e-5 and train for maximum of 15 epochs with early stopping on Dev set. |
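To make the Dataset Splits row easier to interpret, the sketch below shows how such split statistics (split size, average context/simile length, unique-simile count) could be recomputed once the released CS data is loaded. It is not taken from the authors' repository; the record fields `context` and `simile` and the use of character lengths are assumptions made purely for illustration.

```python
# Hedged helper (not from the paper's repository) for recomputing the kind of
# dataset statistics quoted in the Dataset Splits row. The record structure
# {"context": ..., "simile": ...} and character-level lengths are assumptions.
from typing import Dict, Iterable, List


def split_statistics(records: Iterable[Dict[str, str]]) -> Dict[str, float]:
    """Return split size, average context/simile length, and unique-simile count."""
    contexts: List[str] = []
    similes: List[str] = []
    for rec in records:
        contexts.append(rec["context"])
        similes.append(rec["simile"])
    n = len(similes)
    return {
        "size": n,
        "avg_context_len": sum(len(c) for c in contexts) / max(n, 1),
        "avg_simile_len": sum(len(s) for s in similes) / max(n, 1),
        "unique_similes": len(set(similes)),
    }


# Toy usage with made-up records (lengths measured in characters here,
# which is an assumption, not something the table above specifies):
toy = [
    {"context": "他愣在原地", "simile": "像一尊石像"},
    {"context": "她笑了起来", "simile": "像春天的阳光"},
]
print(split_statistics(toy))
```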
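The Experiment Setup row can likewise be read as a training configuration. Below is a minimal TensorFlow sketch that wires together only the quoted hyperparameters: dropout 0.1, Adam optimizer, batch size 128, learning rate 5e-5, max context/target lengths of 128/16, and up to 15 epochs with early stopping on the Dev set. The `build_model` function is a hypothetical placeholder, not the paper's architecture; the real implementation is in the linked repository.

```python
# Minimal sketch of the reported training setup. Only the constants below are
# taken from the Experiment Setup row; the model itself is a stand-in.
import tensorflow as tf

MAX_CONTEXT_LEN = 128   # max context length (tokens)
MAX_TARGET_LEN = 16     # max simile length (tokens)
BATCH_SIZE = 128
LEARNING_RATE = 5e-5
DROPOUT_RATE = 0.1
MAX_EPOCHS = 15


def build_model(vocab_size: int = 21128) -> tf.keras.Model:
    """Hypothetical stand-in for the paper's Transformer-based generator."""
    context = tf.keras.Input(shape=(MAX_CONTEXT_LEN,), dtype=tf.int32)
    x = tf.keras.layers.Embedding(vocab_size, 256)(context)
    x = tf.keras.layers.Dropout(DROPOUT_RATE)(x)
    x = tf.keras.layers.GlobalAveragePooling1D()(x)
    logits = tf.keras.layers.Dense(vocab_size)(x)
    return tf.keras.Model(context, logits)


model = build_model()
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# Early stopping on the Dev set, mirroring the reported setup.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", restore_best_weights=True
)

# Training call (train_ds / dev_ds omitted here):
# model.fit(train_ds.batch(BATCH_SIZE),
#           validation_data=dev_ds.batch(BATCH_SIZE),
#           epochs=MAX_EPOCHS,
#           callbacks=[early_stopping])
```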