LAMPAT: Low-Rank Adaption for Multilingual Paraphrasing Using Adversarial Training

Authors: Khoi M. Le, Trinh Pham, Tho Quan, Anh Tuan Luu

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Throughout the experiments, we found out that our method not only works well for English but can generalize on unseen languages as well.
Researcher Affiliation | Collaboration | Khoi M. Le (1,2*), Trinh Pham (2*), Tho Quan (2), Anh Tuan Luu (3); (1) VinAI Research, Vietnam; (2) Ho Chi Minh City University of Technology (HCMUT), VNU-HCM, Ho Chi Minh City, Vietnam; (3) Nanyang Technological University, Singapore
Pseudocode | Yes | Algorithm 1: Low-rank Adaptation Multilingual Paraphrasing using Adversarial Training.
Open Source Code | Yes | Data and code are available at https://github.com/phkhanhtrinh23/LAMPAT.
Open Datasets | Yes | To assess the fine-tuning, we choose to use the latest version of WMT19 (Foundation 2019) to train the model. This dataset covers a wide range of 15 languages including Arabic, Czech, German, English, Spanish, French, Hindi, Indonesian, Italian, Japanese, Kazakh, Dutch, Portuguese, Russian, and Chinese. The WMT19 dataset we use is in its latest version, which was released in 2023. To balance language resources, we employ a uniform distribution to sample sentences, creating a training set of nearly 600k sentences and a validation set of around 100k sentences.
Dataset Splits | Yes | To balance language resources, we employ a uniform distribution to sample sentences, creating a training set of nearly 600k sentences and a validation set of around 100k sentences.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for experiments.
Software Dependencies | No | The paper mentions the 'mGPT model' and the 'LoRA implementation from PEFT' but does not provide specific version numbers for software dependencies or libraries such as Python, PyTorch, or TensorFlow. (A hedged setup sketch follows this table.)
Experiment Setup | No | The paper describes high-level experimental procedures, such as corrupting the input by removing stop words and shuffling words 33% of the time, sampling with a uniform distribution, and employing LoRA and VAT, but it lacks specific numerical hyperparameters (e.g., learning rate, batch size, epochs) and detailed system-level training configurations. (A hedged sketch of the corruption step follows this table.)
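
On the Software Dependencies row: the paper names the mGPT model and the LoRA implementation from PEFT without version numbers. The snippet below is a minimal sketch of how such a setup is typically wired together, assuming the Hugging Face transformers and peft libraries; the checkpoint name and the LoRA hyperparameters are illustrative assumptions, not values reported in the paper.

```python
# Minimal sketch: wrapping a multilingual causal LM with LoRA adapters via PEFT.
# The checkpoint name and LoRA hyperparameters below are assumptions for
# illustration only; the paper does not report them.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

model_name = "ai-forever/mGPT"  # assumed mGPT checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                # rank of the low-rank update (assumed value)
    lora_alpha=16,      # scaling factor (assumed value)
    lora_dropout=0.05,  # dropout on the LoRA layers (assumed value)
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA weights remain trainable
```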
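On the Experiment Setup row: the input corruption the paper describes (removing stop words and shuffling the remaining words 33% of the time) can be summarized as below. This is a hedged illustration under stated assumptions; the stop-word list, probability handling, and function name are placeholders, since the paper does not give an implementation.

```python
# Sketch of the described corruption step: drop stop words, then shuffle the
# remaining words with probability 0.33. Stop-word list is an assumed stand-in.
import random

STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of", "and", "in"}  # assumed list

def corrupt(sentence: str, shuffle_prob: float = 0.33) -> str:
    words = [w for w in sentence.split() if w.lower() not in STOP_WORDS]
    if random.random() < shuffle_prob:
        random.shuffle(words)
    return " ".join(words)

print(corrupt("The quick brown fox jumps over the lazy dog"))
```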