Idiomatic Expression Paraphrasing without Strong Supervision

Authors: Jianing Zhou, Ziheng Zeng, Hongyu Gong, Suma Bhat (pp. 11774-11782)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The effectiveness of the proposed solutions compared to competitive baselines is seen in the relative gains of over 5.16 points in BLEU, over 8.75 points in METEOR, and over 19.57 points in SARI when the generated sentences are empirically validated on a parallel dataset using automatic and manual evaluations."
Researcher Affiliation | Collaboration | 1 University of Illinois at Urbana-Champaign; 2 Facebook AI
Pseudocode | Yes | Algorithm 1: Weakly Supervised Model
Open Source Code | Yes | "The code and dataset are available at https://github.com/zhjjn/ISP.git."
Open Datasets | Yes | "Accordingly, we choose two large news datasets AG News (Zhang, Zhao, and LeCun 2015) and CNN-DailyMail (See, Liu, and Manning 2017) and the GLUE datasets MRPC and CoLA (Wang et al. 2018)"; "we used the parallel dataset constructed by Zhou, Gong, and Bhat (2021a) (henceforth termed PIL)"; "The idiomatic sentences (without literal counterparts) used for BART-IBT training are from the MAGPIE corpus (Haagsma, Bos, and Nissim 2020) collected from the BNC."
Dataset Splits | No | No explicit train/validation/test splits, split percentages, or sample counts were reported for the datasets, nor was a splitting methodology described in enough detail for reproduction.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used to run the experiments were provided.
Software Dependencies | No | "We use the pretrained BART-large model, the BERT-based POS tagger and their respective checkpoints as implemented and hosted by Huggingface's Transformers library. The RoBERTa-based sentence embedding generator and its checkpoint are implemented and hosted by (Reimers and Gurevych 2020)." This mentions the libraries used but gives no specific versions.
Experiment Setup | Yes | "The maximum length for a sentence, the learning rate and the number of iterations were 128, 5e-5, and 5 respectively. The other hyper-parameters were their default values." and "The model was trained for 5 epochs. During inference, we used a beam search with 5 beams with top-k set to 100 and top-p set to 0.5. The other hyper-parameters were set to their default values."
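To make the reported decoding settings concrete, the sketch below shows how a top-k = 100 / top-p = 0.5 cutoff restricts a next-token distribution. This is an illustrative, self-contained implementation of standard top-k/nucleus filtering, not the authors' code; the function name and structure are assumptions.

```python
def top_k_top_p_filter(probs, top_k=100, top_p=0.5):
    """Restrict a next-token distribution to the top-k most likely
    tokens, then to the smallest prefix whose cumulative probability
    reaches top_p, and renormalize. The defaults mirror the paper's
    reported decoding settings (top-k = 100, top-p = 0.5); the
    function itself is only an illustration of those cutoffs.
    """
    # Rank token probabilities in descending order, keeping indices.
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    ranked = ranked[:top_k]  # top-k cutoff

    # Nucleus (top-p) cutoff: keep tokens until their cumulative
    # probability mass first reaches top_p.
    kept, cumulative = [], 0.0
    for idx, p in ranked:
        kept.append((idx, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize the surviving tokens into a proper distribution.
    total = sum(p for _, p in kept)
    return {idx: p / total for idx, p in kept}
```

With top-p as low as 0.5, a single dominant token can absorb the whole nucleus; e.g. filtering `[0.5, 0.3, 0.2]` with the defaults keeps only the first token.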