Towards Green AI in Fine-tuning Large Language Models via Adaptive Backpropagation
Authors: Kai Huang, Hanyun Yin, Heng Huang, Wei Gao
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results show that GreenTrainer can save up to 64% training FLOPs compared to full fine-tuning, without any noticeable accuracy loss. |
| Researcher Affiliation | Academia | University of Pittsburgh; University of Maryland, College Park; University of Science and Technology of China |
| Pseudocode | No | The paper contains diagrams and explanations but does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement about releasing source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | Our experiments are mainly conducted using the following two datasets of abstractive summarization: SciTLDR (Cachola et al., 2020) and DialogSum (Chen et al., 2021). We also perform generative QA tasks on WebQuestions (Berant et al., 2013) and PIQA (Bisk et al., 2020) datasets in Appendix A.4. |
| Dataset Splits | No | The paper mentions using 'test data' but does not provide specific percentages or counts for train/validation/test splits, nor does it refer to standard predefined splits with sufficient detail for reproduction. |
| Hardware Specification | No | The paper mentions 'A100-80GB GPUs' in an example scenario in the introduction, and refers to 'GPUs we use' in the appendix, but it does not specify the exact GPU models, CPUs, or detailed hardware configurations used for *their* experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al., 2019)' but does not provide specific version numbers for PyTorch or any other software dependencies crucial for replication. |
| Experiment Setup | Yes | In all experiments, we use a batch size of 4 and fine-tune the model for 5 epochs. We use the AdamW optimizer (Loshchilov and Hutter, 2017) at a learning rate of 2×10⁻⁵ with a linear schedule and weight decay of 10⁻². |
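
For reference, the summarization and QA datasets named in the Open Datasets row are publicly hosted on the Hugging Face Hub. The sketch below shows one way to load the two main datasets; the Hub identifiers and configuration names are assumptions on our part, since the paper does not describe how the data was obtained or preprocessed.

```python
# Hypothetical data-loading sketch using the Hugging Face `datasets` library.
# Hub identifiers ("allenai/scitldr", "knkarthick/dialogsum") and the
# "Abstract" config name are assumptions, not taken from the paper.
from datasets import load_dataset

scitldr = load_dataset("allenai/scitldr", "Abstract")   # SciTLDR (Cachola et al., 2020)
dialogsum = load_dataset("knkarthick/dialogsum")        # DialogSum (Chen et al., 2021)

print(scitldr)    # DatasetDict with train/validation/test splits
print(dialogsum)
```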
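
The hyperparameters reported in the Experiment Setup row map onto a standard fine-tuning configuration. Below is a minimal sketch using the Hugging Face Transformers `Trainer` API; it reproduces only the stated hyperparameters, not the authors' GreenTrainer method, and the backbone model and tokenized training set are placeholders not specified here by the paper.

```python
# Minimal fine-tuning sketch matching the reported hyperparameters
# (batch size 4, 5 epochs, AdamW, lr 2e-5, linear schedule, weight decay 1e-2).
# Assumes the Hugging Face Transformers Trainer API; `train_dataset` is a
# placeholder tokenized dataset and the backbone is illustrative only.
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "google/flan-t5-base"   # placeholder backbone; the paper evaluates several LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

args = TrainingArguments(
    output_dir="ft-out",
    per_device_train_batch_size=4,   # batch size of 4 (as reported)
    num_train_epochs=5,              # 5 epochs (as reported)
    learning_rate=2e-5,              # AdamW at 2e-5 (as reported)
    weight_decay=1e-2,               # weight decay of 1e-2 (as reported)
    lr_scheduler_type="linear",      # linear schedule (as reported)
    optim="adamw_torch",             # AdamW optimizer
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,     # placeholder: a tokenized training split
)
trainer.train()
```

Note that this sketch performs full backpropagation; the paper's adaptive-backpropagation FLOPs savings come from GreenTrainer's tensor-selection logic, for which no open-source code is reported.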