CAR-Transformer: Cross-Attention Reinforcement Transformer for Cross-Lingual Summarization
Authors: Yuang Cai, Yuyu Yuan
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our approach demonstrates more consistent improvement across CLS tasks compared to traditional multi-task training methods and outperforms the fine-tuned vanilla mBART by 3.67 and the best-performing multi-task training approach by 1.48 in ROUGE-L F1 score on the WikiLingua Korean-to-English CLS task. (ROUGE-L F1 scoring is illustrated in a sketch after this table.) |
| Researcher Affiliation | Academia | Yuang Cai, Yuyu Yuan*, Beijing University of Posts and Telecommunications, Key Laboratory of Trustworthy Distributed Computing and Service (BUPT), Ministry of Education, {cyang,yuanyuyu}@bupt.edu.cn |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. It provides mathematical equations and a model architecture diagram, but no step-by-step pseudo-code. |
| Open Source Code | Yes | The training and evaluation codes are implemented based on Hugging Face Transformers. For the detailed experiment setting and implementation, please refer to the source code in supplementary files. |
| Open Datasets | Yes | We use WikiLingua (Ladhak et al. 2020), Global Voices (Nguyen and Daumé III 2019), and CrossSum (Bhattacharjee et al. 2021) for training and evaluation. |
| Dataset Splits | No | The paper mentions evaluating models 'using the validation set' and plotting 'Validation ROUGE-L scores', but it does not provide specific details on the dataset split ratio or number of samples allocated for the validation set, which is required for reproducibility. |
| Hardware Specification | Yes | The training and evaluation procedures for each task are performed on a single NVIDIA A40 GPU. |
| Software Dependencies | No | The paper states: 'The training and evaluation codes are implemented based on Hugging Face Transformers'. While it mentions a specific library, it does not provide a version number for Hugging Face Transformers or any other software dependencies, which is necessary for reproducibility. |
| Experiment Setup | Yes | We truncate the source document to 512 tokens as input for the encoder, while the ground-truth summary in the target language, serving as input for the decoder, is truncated to 128 tokens. Similarly, the supervision signal for the CAR module, which comprises the ground-truth summary in the source language, is also truncated to 128 tokens. We fine-tune our approach and all baseline approaches on each CLS task for a total of 30 epochs utilizing the training set. With a training batch size of 8, we employ a gradient accumulation step of 2. (These hyperparameters are mirrored in the configuration sketch after this table.) |
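
The reported setup (512-token source inputs, 128-token target summaries, batch size 8, gradient accumulation of 2, 30 epochs) maps directly onto the public Hugging Face Transformers API. Below is a minimal sketch of that configuration with vanilla mBART; the checkpoint name, the `document`/`summary` field names, and the Korean-to-English language codes are assumptions for illustration, and the paper's CAR module is not reproduced here.

```python
from transformers import (
    MBart50TokenizerFast,
    MBartForConditionalGeneration,
    Seq2SeqTrainingArguments,
)

# Assumed checkpoint: the paper only says "vanilla mBART".
model_name = "facebook/mbart-large-50"

# Language codes chosen for the WikiLingua Korean-to-English task as an example.
tokenizer = MBart50TokenizerFast.from_pretrained(
    model_name, src_lang="ko_KR", tgt_lang="en_XX"
)
model = MBartForConditionalGeneration.from_pretrained(model_name)

def preprocess(example):
    # Source document truncated to 512 tokens for the encoder, as reported.
    model_inputs = tokenizer(example["document"], max_length=512, truncation=True)
    # Target-language ground-truth summary truncated to 128 tokens for the decoder.
    labels = tokenizer(text_target=example["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

# Hyperparameters taken from the experiment-setup row above.
training_args = Seq2SeqTrainingArguments(
    output_dir="car-transformer-repro",
    num_train_epochs=30,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    predict_with_generate=True,
)
```

The paper does not state the mBART variant or preprocessing field names, so those choices above should be checked against the authors' supplementary code.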
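
The ROUGE-L F1 numbers cited in the results row can be reproduced in spirit with the widely used `rouge_score` package; the paper does not specify its exact scoring script or any language-specific tokenization, so the snippet below is only a hedged sketch with placeholder strings.

```python
from rouge_score import rouge_scorer

# ROUGE-L reported as F1 (fmeasure), as in the results row above.
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

reference = "the ground-truth english summary"  # placeholder, not from the paper
prediction = "the generated english summary"    # placeholder, not from the paper

score = scorer.score(reference, prediction)["rougeL"]
print(f"ROUGE-L F1: {score.fmeasure:.4f}")
```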