Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

CAR-Transformer: Cross-Attention Reinforcement Transformer for Cross-Lingual Summarization

Authors: Yuang Cai, Yuyu Yuan

AAAI 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our approach demonstrates more consistent improvement across CLS tasks compared to traditional multi-task training methods and outperforms the fine-tuned vanilla mBART by 3.67 and the best-performing multi-task training approach by 1.48 in ROUGE-L F1 score on the WikiLingua Korean-to-English CLS task.
Researcher Affiliation | Academia | Yuang Cai, Yuyu Yuan*, Key Laboratory of Trustworthy Distributed Computing and Service (BUPT), Ministry of Education, Beijing University of Posts and Telecommunications
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. It provides mathematical equations and a model architecture diagram, but no step-by-step pseudocode.
Open Source Code | Yes | The training and evaluation codes are implemented based on Hugging Face Transformers. For the detailed experiment setting and implementation, please refer to the source code in the supplementary files.
Open Datasets | Yes | We use WikiLingua (Ladhak et al. 2020), Global Voices (Nguyen and Daumé III 2019), and CrossSum (Bhattacharjee et al. 2021) for training and evaluation.
Dataset Splits | No | The paper mentions evaluating models 'using the validation set' and plotting 'Validation ROUGE-L scores', but it does not report the split ratio or the number of samples in the validation set, which is required for reproducibility.
Hardware Specification | Yes | The training and evaluation procedures for each task are performed on a single NVIDIA A40 GPU.
Software Dependencies | No | The paper states: 'The training and evaluation codes are implemented based on Hugging Face Transformers.' While it names a specific library, it does not provide a version number for Hugging Face Transformers or any other software dependency, which is necessary for reproducibility.
Experiment Setup | Yes | We truncate the source document to 512 tokens as input for the encoder, while the ground-truth summary in the target language, serving as input for the decoder, is truncated to 128 tokens. Similarly, the supervision signal for the CAR module, which comprises the ground-truth summary in the source language, is also truncated to 128 tokens. We fine-tune our approach and all baseline approaches on each CLS task for a total of 30 epochs utilizing the training set. With a training batch size of 8, we employ a gradient accumulation step of 2.
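The gains quoted under Research Type are ROUGE-L F1 scores, which compare candidate and reference summaries via their longest common subsequence. A minimal sketch of the idea (illustrative only: full ROUGE implementations add stemming, tokenization rules, and a recall-weighted F-beta; plain F1 over whitespace tokens is shown here):

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of token lists a and b."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            # Extend the LCS on a match, otherwise carry the best prefix score.
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]

def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 over whitespace tokens (simplified sketch)."""
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_len(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge_l_f1("the cat sat on mat", "the cat on the mat")` matches the subsequence "the cat on mat" (4 of 5 tokens on each side), giving precision and recall of 0.8 and hence F1 of 0.8.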
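The Experiment Setup row can be condensed into a short configuration sketch. The constants below come directly from the quoted setup; the helper and variable names are illustrative placeholders, not taken from the authors' released code:

```python
# Fine-tuning configuration reported for CAR-Transformer (values quoted above).
MAX_SOURCE_TOKENS = 512   # encoder input: source-language document
MAX_TARGET_TOKENS = 128   # decoder input: target-language summary
MAX_CAR_TOKENS    = 128   # CAR-module supervision: source-language summary

EPOCHS           = 30     # fine-tuning epochs per CLS task
BATCH_SIZE       = 8      # training batch size
GRAD_ACCUM_STEPS = 2      # gradient accumulation steps

def truncate(token_ids, limit):
    """Keep at most `limit` tokens, as the paper does for each input stream."""
    return token_ids[:limit]

# With accumulation, each optimizer update aggregates gradients over
# BATCH_SIZE * GRAD_ACCUM_STEPS examples, i.e. an effective batch of 16.
effective_batch = BATCH_SIZE * GRAD_ACCUM_STEPS
```

This framing makes explicit that the three input streams (document, target-language summary, source-language summary) are truncated independently, and that the effective batch per update is 16 despite the per-step batch size of 8.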