Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Understanding Linear Probing then Fine-tuning Language Models from NTK Perspective

Authors: Akiyoshi Tomihari, Issei Sato

NeurIPS 2024 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental "Our experiments using a Transformer-based model on multiple natural language processing datasets confirm our theoretical analysis." and "5 Numerical evaluation with transformer models: In this section, we numerically justify the following aspects obtained from our analysis:"
Researcher Affiliation Academia Akiyoshi Tomihari The University of Tokyo EMAIL Issei Sato The University of Tokyo EMAIL
Pseudocode No No pseudocode or algorithm blocks are explicitly presented or labeled in the paper.
Open Source Code Yes Code is available at https://github.com/tom4649/lp-ft_ntk.
Open Datasets Yes "Datasets and models: We used a total of 13 classification datasets from various benchmarks: SuperGLUE [Wang et al., 2019], GLUE [Wang et al., 2018], BOSS [Yuan et al., 2023], and PubMed 20k RCT [Dernoncourt and Lee, 2017]."
Dataset Splits Yes "For the datasets from the GLUE, SuperGLUE, and BOSS benchmarks, we divided the original training set using a 9:1 training-to-validation ratio, using the original validation set as the test set, in accordance with Chen et al. [2022]."
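The 9:1 training-to-validation split described in the quote can be sketched as follows. This is a minimal illustration, not the authors' code; the shuffling, seed, and `split_train_validation` helper are assumptions for the example.

```python
import random

def split_train_validation(examples, val_ratio=0.1, seed=0):
    """Split a list of examples 9:1 into train/validation sets.

    The 9:1 ratio follows the paper's description; the shuffle and
    seed are illustrative assumptions, not the authors' exact setup.
    """
    indices = list(range(len(examples)))
    random.Random(seed).shuffle(indices)
    n_val = int(len(examples) * val_ratio)
    validation = [examples[i] for i in indices[:n_val]]
    train = [examples[i] for i in indices[n_val:]]
    return train, validation

train, validation = split_train_validation(list(range(100)))
print(len(train), len(validation))  # 90 10
```

The original validation set would then serve as the held-out test set, as stated in the quote.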
Hardware Specification Yes All experiments were run on a single NVIDIA A100 GPU.
Software Dependencies No "Our code is built on PyTorch [Paszke et al., 2019], using the Hugging Face Transformers library [Wolf et al., 2020] and AdapterHub [Pfeiffer et al., 2020]."
Experiment Setup Yes "Hyperparameter tuning, especially for learning rates during the FT stage of LP-FT, was conducted through a grid search based on the validation set performance." and "Details on the hyperparameters for our experiments can be found in Table 6."
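The learning-rate grid search over validation performance that the quote describes can be sketched as below. This is a hedged illustration: the candidate grid and the `evaluate` callback (learning rate in, validation metric out) are hypothetical stand-ins, not the paper's actual values or training loop.

```python
def grid_search_lr(candidate_lrs, evaluate):
    """Return the learning rate with the best validation score.

    `evaluate` is a hypothetical callback mapping a learning rate to a
    validation metric; in the paper it would run the FT stage of LP-FT
    and report validation performance.
    """
    best_lr, best_score = None, float("-inf")
    for lr in candidate_lrs:
        score = evaluate(lr)
        if score > best_score:
            best_lr, best_score = lr, score
    return best_lr, best_score

# Toy stand-in for validation accuracy, peaked at 2e-5 for illustration.
best_lr, best_score = grid_search_lr(
    [1e-5, 2e-5, 5e-5], lambda lr: -abs(lr - 2e-5)
)
print(best_lr)  # 2e-05
```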