Improved Fine-Tuning by Better Leveraging Pre-Training Data
Authors: Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Xiangyang Ji, Antoni Chan, Rong Jin
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results for image classification tasks on 8 benchmark data sets verify the effectiveness of the proposed data selection based fine-tuning pipeline. |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science, City University of Hong Kong; (2) School of Artificial Intelligence, Dalian University of Technology; (3) DAMO Academy, Alibaba Group; (4) Department of Automation, Tsinghua University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks, only descriptions of the proposed methods in text. |
| Open Source Code | Yes | Our code is available at https://github.com/ziquanliu/NeurIPS2022_UOT_fine_tuning. |
| Open Datasets | Yes | The pre-trained model is tested on 8 target image classification data sets, i.e. Stanford dogs (Dogs) [34], Stanford cars (Cars) [35], Caltech-UCSD birds (CUB) [7], Oxford-IIIT Pet (Pets) [36], SUN [37], FGVC-Aircraft (Aircraft) [38], Describable Textures data set (DTD) [39] and Caltech101 (Caltech) [40]. |
| Dataset Splits | Yes | we search the initial learning rate from {1e-4, 3e-4, 1e-3, 3e-3, 1e-2} on a validation set and report the test accuracy trained on the original training or train+val set. |
| Hardware Specification | No | The paper states 'See the supplemental' for the total amount of compute and type of resources used, but these details are not provided within the main body of the paper. |
| Software Dependencies | No | The paper mentions software components such as ResNet18, MoCo-v2, and K-means, but does not provide specific version numbers for these or for other core software dependencies used in the experiments. |
| Experiment Setup | Yes | The training epochs are fixed to be 100 in our experiment for sufficient training and the learning rate is divided by 10 at epoch 60 and 80. Other hyperparameters like initial learning rate, weight decay and λ are determined by grid search... The batch size for fine-tuning data is 256... In UOT, we set ε = 1.0, τ1 = 1.0 and τ2 = 100.0. The distance cost is based on the cosine similarity, C_ij = (cos(a_i, b_j) + 1)/ε_c with ε_c = 0.01. |
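
As a rough illustration of the UOT selection step quoted in the Experiment Setup row, the sketch below pairs the stated hyperparameters (ε = 1.0, τ1 = 1.0, τ2 = 100.0, ε_c = 0.01) with a generic entropic unbalanced-OT solver. It is not the authors' released implementation (linked in the Open Source Code row): the fraction reading of the cost, the random placeholder features, and the generalized Sinkhorn scaling loop are all assumptions made for the sketch.

```python
import numpy as np

# Minimal sketch of a UOT-based pre-training data selection step, NOT the
# authors' released pipeline. Assumptions: the quoted cost is read as
# C_ij = (cos(a_i, b_j) + 1) / eps_c, features are stand-in random vectors,
# and a generic entropic UOT solver (generalized Sinkhorn scaling with
# KL-relaxed marginals) is used.

def cosine_cost(A, B, eps_c=0.01):
    """Pairwise cost C_ij = (cos(a_i, b_j) + 1) / eps_c between feature rows."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return (A @ B.T + 1.0) / eps_c

def uot_plan(a, b, C, eps=1.0, tau1=1.0, tau2=100.0, n_iter=1000):
    """Entropic unbalanced OT plan via alternating scaling updates.
    A log-domain implementation would be preferable for numerical stability."""
    K = np.exp(-C / eps)                                 # Gibbs kernel
    u, v = np.ones_like(a), np.ones_like(b)
    p1, p2 = tau1 / (tau1 + eps), tau2 / (tau2 + eps)    # KL-relaxation exponents
    for _ in range(n_iter):
        u = (a / (K @ v)) ** p1
        v = (b / (K.T @ u)) ** p2
    return u[:, None] * K * v[None, :]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pretrain_feats = rng.normal(size=(500, 128))   # placeholder pre-training features
    target_feats = rng.normal(size=(100, 128))     # placeholder target-task features
    C = cosine_cost(pretrain_feats, target_feats)
    plan = uot_plan(np.full(500, 1 / 500), np.full(100, 1 / 100), C)
    # Mass transported from each pre-training sample; a selection rule could
    # keep the samples with the largest (or smallest) totals, depending on how
    # the cost is oriented in the original method.
    mass_per_source = plan.sum(axis=1)
    print(mass_per_source[:5])
```

The scaling loop follows the standard generalized Sinkhorn iteration for KL-relaxed unbalanced OT; the released repository should be treated as authoritative for the exact cost orientation and selection rule.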