Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning
Authors: Kai Gan, Tong Wei
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we demonstrate that FINESSL sets a new state of the art for SSL on multiple benchmark datasets, reduces the training cost by over six times, and can seamlessly integrate various fine-tuning and modern SSL algorithms. We conduct extensive experiments on five publicly available datasets to evaluate the performance of FINESSL. |
| Researcher Affiliation | Academia | Kai Gan¹, Tong Wei¹ (¹School of Computer Science and Engineering, Southeast University, Nanjing 210096, China). Correspondence to: Tong Wei <weit@seu.edu.cn>. |
| Pseudocode | Yes | Algorithm 1 The Proposed FINESSL |
| Open Source Code | Yes | The source code is available at https://github.com/Gank0078/FineSSL. |
| Open Datasets | Yes | The datasets are CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), FOOD-101 (Bossard et al., 2014), Semi-Aves (Su et al., 2021), and ImageNet (Deng et al., 2009). (A loading sketch follows the table.) |
| Dataset Splits | No | No explicit mention of a separate validation split or its size/proportion for hyperparameter tuning or early stopping. |
| Hardware Specification | Yes | All experiments are conducted in PyTorch with a single NVIDIA RTX 3090 24GB GPU. |
| Software Dependencies | No | All experiments are conducted in PyTorch with a single NVIDIA RTX 3090 24GB GPU. (Only PyTorch is named; no version numbers or dependency list are provided.) |
| Experiment Setup | Yes | We employ the Stochastic Gradient Descent (SGD) optimizer with a learning rate of 0.03, utilizing a batch size of 32 alongside a weight decay set at 5×10⁻⁴, and a momentum factor of 0.9. We fine-tune the model for 30 epochs, with each epoch comprising 500 steps. Table 8 (default parameter configs): Optimizer: SGD; Learning rate: 0.03; LR scheduler: cosine decay; Weight decay: 5×10⁻⁴; Momentum factor: 0.9; Batch size: 32; Model: CLIP-ViT; Epochs: 30; Steps: 500; µ: 1; PEFT strategy: VPT-deep; VPT length: 50; λ: 0.5; α: 8.0; γ: 3.0. (A PyTorch sketch of this schedule follows the table.) |
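To make the reported schedule concrete, here is a minimal PyTorch sketch of the optimizer and cosine-decay setup from Table 8. The `model` placeholder, the dummy batch, and the placeholder loss are assumptions added for runnability; they stand in for the CLIP-ViT backbone with VPT-deep prompts and the FINESSL objective, which are defined in the paper and its repository.

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR

# Default configuration reported in Table 8 of the paper.
EPOCHS, STEPS_PER_EPOCH = 30, 500
LR, WEIGHT_DECAY, MOMENTUM = 0.03, 5e-4, 0.9
BATCH_SIZE = 32

# Placeholder model: stands in for the CLIP-ViT backbone fine-tuned
# with VPT-deep (prompt length 50), which is not reproduced here.
model = torch.nn.Linear(512, 100)

optimizer = SGD(model.parameters(), lr=LR,
                momentum=MOMENTUM, weight_decay=WEIGHT_DECAY)

# Cosine decay over the full schedule, assuming per-step annealing.
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS * STEPS_PER_EPOCH)

for epoch in range(EPOCHS):
    for step in range(STEPS_PER_EPOCH):
        x = torch.randn(BATCH_SIZE, 512)         # placeholder batch
        loss = model(x).logsumexp(dim=1).mean()  # placeholder loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        scheduler.step()
```

Setting `T_max` to the total step count (30 × 500) assumes the cosine schedule anneals once per optimization step; if the paper instead decays per epoch, `T_max` would be 30 and `scheduler.step()` would move to the outer loop.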
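The benchmark datasets listed above are all publicly available; the sketch below shows one plausible way to obtain the torchvision-hosted ones. The use of torchvision and the 224-pixel resize (for the CLIP-ViT input) are assumptions rather than details taken from the paper.

```python
from torchvision import datasets, transforms

# Resizing to 224 pixels is an assumption based on the CLIP-ViT
# backbone; the paper does not state its preprocessing pipeline.
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

cifar10  = datasets.CIFAR10("data", train=True, download=True, transform=transform)
cifar100 = datasets.CIFAR100("data", train=True, download=True, transform=transform)
food101  = datasets.Food101("data", split="train", download=True, transform=transform)
# ImageNet requires a manual download: datasets.ImageNet("data", split="train")
# Semi-Aves (Su et al., 2021) is distributed via its own benchmark release.
```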