Parameter-Efficient Fine-Tuning with Discrete Fourier Transform
Authors: Ziqi Gao, Qichao Wang, Aochuan Chen, Zijing Liu, Bingzhe Wu, Liang Chen, Jia Li
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, our FourierFT method shows comparable or better performance with fewer parameters than LoRA on various tasks, including natural language understanding, natural language generation, instruction tuning, and image classification. For example, when performing instruction tuning on the LLaMA2-7B model, FourierFT surpasses LoRA with only 0.064M trainable parameters, compared to LoRA's 33.5M. Our code is released at https://github.com/Chaos96/fourierft. |
| Researcher Affiliation | Collaboration | 1. Hong Kong University of Science and Technology (Guangzhou); 2. Hong Kong University of Science and Technology; 3. Sun Yat-sen University; 4. International Digital Economy Academy; 5. AI Lab, Tencent. |
| Pseudocode | Yes | Algorithm 1: PyTorch-style pseudocode for FourierFT (a hedged reimplementation sketch follows this table). |
| Open Source Code | Yes | Our code is released at https://github.com/Chaos96/fourierft. |
| Open Datasets | Yes | GLUE benchmark (General Language Understanding Evaluation; Wang et al., 2018), E2E natural language generation (NLG) task (Novikova et al., 2017), Alpaca dataset (Taori et al., 2023), ImageNet-21K (Ridnik et al., 2021), Oxford Pets (Parkhi et al., 2012), CIFAR10 (Krizhevsky, 2009), DTD (Cimpoi et al., 2014), EuroSAT (Helber et al., 2019), RESISC45 (Cheng et al., 2017), Stanford Cars (Krause et al., 2013), FGVC (Maji et al., 2013), CIFAR100 (Krizhevsky, 2009). |
| Dataset Splits | Yes | Table 7. Task descriptions and dataset statistics of the GLUE benchmark. ... # Train / # Val / # Test ... |
| Hardware Specification | No | The paper mentions 'training on a single GPU' in Section 4.3 but does not specify any particular GPU model (e.g., NVIDIA A100, RTX 2080 Ti), CPU model, or other specific hardware details used for the experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch-style pseudocode' in Algorithm 1, but it does not list any specific software dependencies with their version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1). |
| Experiment Setup | Yes | Table 9. Hyperparameter setup of FourierFT for the GLUE benchmark: Optimizer (AdamW), LR Schedule (Linear), Learning Rate (FourierFT), Learning Rate (Head), Max Seq. Len, Scaling Value, Batch Size. Table 10. Hyperparameter setup of FourierFT on the E2E benchmark: Optimizer (AdamW), Learning Rate (FourierFT), Learning Rate (Head), Batch Size, Weight Decay, n, Scaling Value α, Epochs, Label Smooth, LR Schedule (Linear). Table 11. Hyperparameter setup for instruction tuning of LoRA and FourierFT: Optimizer (AdamW), Warmup Ratio, Batch Size, Accumulation Steps, Epochs, n, Scaling Value α, LR Schedule (Linear), Learning Rate. Table 12. Hyperparameter setup for image classification of FourierFT: Epochs, Optimizer (AdamW), LR Schedule (Linear), n, α, Learning Rate (FourierFT), Learning Rate (Head), Weight Decay. A hedged optimizer-configuration sketch follows this table. |
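
For context on the Pseudocode row above: the following is a minimal, hedged PyTorch sketch of the FourierFT idea as the paper describes it (train only n spectral coefficients at fixed, randomly chosen frequency locations, recover the dense weight update via a 2D inverse DFT, and scale it by α). The class name, argument names, defaults, and initialization choices are our assumptions, not the authors' code; the authoritative implementation is at https://github.com/Chaos96/fourierft.

```python
import torch
import torch.nn as nn


class FourierFTLinear(nn.Module):
    """Hedged sketch of a FourierFT adapter around a frozen linear layer.

    Assumed details (not taken from the released code): only n spectral
    coefficients are trained; their locations in the (d_out, d_in) spectral
    matrix are fixed at init; the weight update is the real part of a 2D
    inverse DFT, scaled by alpha.
    """

    def __init__(self, base: nn.Linear, n: int = 1000, alpha: float = 300.0, seed: int = 0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # the pretrained weight stays frozen
        d_out, d_in = base.weight.shape
        g = torch.Generator().manual_seed(seed)
        idx = torch.randperm(d_out * d_in, generator=g)[:n]  # fixed random spectral entries
        self.register_buffer("rows", idx // d_in)
        self.register_buffer("cols", idx % d_in)
        self.coeffs = nn.Parameter(torch.zeros(n))  # the only trainable parameters
        self.alpha = alpha

    def delta_weight(self) -> torch.Tensor:
        d_out, d_in = self.base.weight.shape
        spectrum = torch.zeros(d_out, d_in, dtype=torch.cfloat, device=self.coeffs.device)
        spectrum[self.rows, self.cols] = self.coeffs.to(torch.cfloat)
        # 2D inverse DFT turns the sparse spectrum into a dense weight update
        return torch.fft.ifft2(spectrum).real * self.alpha

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + x @ self.delta_weight().T
```

Under these assumptions, the trainable-parameter budget depends on n rather than on the layer dimensions; for instance, 1,000 coefficients on two projection matrices in each of LLaMA2-7B's 32 layers would total 64 × 1,000 = 0.064M parameters, consistent with the count quoted in the Research Type row.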
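
The hyperparameter tables consistently pair AdamW with a linear LR schedule and separate learning rates for the spectral coefficients and the task head. Below is a hedged wiring sketch assuming the hypothetical `FourierFTLinear` class above, a hypothetical `classifier` attribute for the head, and placeholder values throughout; the actual per-task settings are those elided in Tables 9-12.

```python
import torch.nn as nn
from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup

# Toy stand-in model: one adapted layer plus a task head. Both the
# FourierFTLinear class and the "classifier" name are assumptions.
model = nn.ModuleDict({
    "adapted": FourierFTLinear(nn.Linear(768, 768), n=1000, alpha=300.0),
    "classifier": nn.Linear(768, 2),
})

# Two parameter groups, mirroring the tables' separate
# Learning Rate (FourierFT) and Learning Rate (Head) entries.
adapter_params = [p for name, p in model.named_parameters() if name.endswith("coeffs")]
head_params = [p for name, p in model.named_parameters() if name.startswith("classifier")]

optimizer = AdamW(
    [
        {"params": adapter_params, "lr": 5e-2},  # Learning Rate (FourierFT), placeholder
        {"params": head_params, "lr": 5e-3},     # Learning Rate (Head), placeholder
    ],
    weight_decay=0.0,  # Weight Decay, placeholder
)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=0,         # Warmup Ratio appears only in Table 11; placeholder
    num_training_steps=10_000,  # placeholder
)
```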