Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Writing Polishment with Simile: Task, Dataset and A Neural Approach
Authors: Jiayi Zhang, Zhi Cui, Xiaoqiang Xia, Yalong Guo, Yanran Li, Chen Wei, Jianwei Cui
Pages: 14383-14392
AAAI 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results demonstrate the feasibility of WPS task and shed light on the future research directions towards better automatic text polishment. |
| Researcher Affiliation | Industry | Jiayi Zhang, Zhi Cui, Xiaoqiang Xia, Yalong Guo, Yanran Li, Chen Wei and Jianwei Cui Xiaomi AI Lab, Beijing, China EMAIL |
| Pseudocode | No | The paper describes the model architecture using text and equations but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We develop a large-scale Chinese Simile (CS) dataset for public research, which contains millions of similes with contexts extracted from Chinese online fictions. (Footnote 1: https://github.com/mrzjy/writing-polishment-with-simile.git) |
| Open Datasets | Yes | We develop a large-scale Chinese Simile (CS) dataset for public research, which contains millions of similes with contexts extracted from Chinese online fictions. (Footnote 1: https://github.com/mrzjy/writing-polishment-with-simile.git) |
| Dataset Splits | Yes | Data split: Train 5,485,721; Dev 2,500; Test 2,500. Avg. length: Context 52.7; Simile 8.3. # Uniq. Simile: 3M. |
| Hardware Specification | Yes | All models are implemented in Tensorflow and trained on Nvidia Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions 'TensorFlow' and 'BERT tokenizer' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | We use dropout of 0.1, and Adam optimizer (Kingma and Ba 2015) with a mini-batch of 128. The max context and target length are set to 128 and 16 respectively. For Match BERT, we train with random negative sampling of size 5. Without hyper-parameter tuning, we set the learning rate to 5e-5 and train for maximum of 15 epochs with early stopping on Dev set. |
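
The hyperparameters quoted in the Experiment Setup row can be collected into a single configuration object, which may help when trying to reproduce the reported runs. The following is a minimal sketch, not the authors' code: the class name `ExperimentSetup`, its field names, and the helper `sample_negatives` are hypothetical, and the sampling behaviour (uniform, without replacement, excluding the positive simile) is an assumption, since the quoted text only says "random negative sampling of size 5".

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    dropout: float = 0.1
    batch_size: int = 128          # Adam mini-batch size
    max_context_len: int = 128     # max context length
    max_target_len: int = 16       # max target (simile) length
    num_negatives: int = 5         # Match BERT only
    learning_rate: float = 5e-5    # set without hyper-parameter tuning
    max_epochs: int = 15           # upper bound; early stopping on the Dev set


def sample_negatives(similes, positive_idx, k, rng):
    """Draw k distinct negatives uniformly, excluding the positive (assumed behaviour)."""
    candidates = [i for i in range(len(similes)) if i != positive_idx]
    return [similes[i] for i in rng.sample(candidates, k)]


if __name__ == "__main__":
    cfg = ExperimentSetup()
    pool = [f"simile_{i}" for i in range(10)]  # placeholder data, not from the CS dataset
    print(cfg)
    print(sample_negatives(pool, 3, cfg.num_negatives, random.Random(0)))
```

Software versions are not part of this sketch because, as the Software Dependencies row notes, the paper does not report them.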