Boundary Matters: A Bi-Level Active Finetuning Method
Authors: Han Lu, Yichen Xie, Xiaokang Yang, Junchi Yan
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments provide qualitative and quantitative evidence of our method's superior efficacy, consistently outperforming the existing baselines. |
| Researcher Affiliation | Academia | Han Lu¹, Yichen Xie², Xiaokang Yang¹, Junchi Yan¹. ¹Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University; ²University of California, Berkeley |
| Pseudocode | Yes | Algorithm 1: Pseudo-code for BiLAF |
| Open Source Code | Yes | https://github.com/Thinklab-SJTU/BiLAF |
| Open Datasets | Yes | Firstly, we evaluate our method using three widely recognized classification datasets: CIFAR10, CIFAR100 [22], and ImageNet-1k [32]. |
| Dataset Splits | Yes | Both CIFAR10 and CIFAR100 contain 60,000 images with resolutions of 32x32... Each comprises 50,000 images for training and 10,000 for testing. The large-scale dataset ImageNet-1k includes 1,000 categories and a total of 1,281,167 training images along with 50,000 validation images. |
| Hardware Specification | Yes | All experiments were conducted using GeForce RTX 3090 (24G) GPUs and Intel(R) Core(TM) i9-10920X CPUs. |
| Software Dependencies | No | In the core samples selection stage, we utilize ActiveFT and optimize the parameters θ_S using the Adam [21] optimizer (learning rate 1e-3) until convergence. Our experiments are implemented using the mmclassification, mmdetection, and mmsegmentation frameworks. However, specific version numbers for these software dependencies are not provided. |
| Experiment Setup | Yes | In the core samples selection stage, we utilize ActiveFT and optimize the parameters θ_S using the Adam [21] optimizer (learning rate 1e-3) until convergence. We set the core number K as 50 (0.1%), 250 (0.5%), and 6405 (0.5%) for CIFAR10, CIFAR100, and ImageNet, respectively. In the boundary samples selection stage, we consistently set the nearest-neighbor number k as 10, both the removal ratio P_rm and the clustering fraction P_in as 10%, and the opponent penalty coefficient δ as 1.1. In the supervised finetuning phase, we finetune the models using the SGD optimizer with learning rate 3e-3, weight decay 1e-4, and momentum 0.9. We employ cosine learning-rate decay with a batch size of 256 distributed across two GPUs. The models are finetuned for 1000 epochs on all datasets with different sampling ratios, except for ImageNet with a 5% sampling ratio, where we finetune for 300 epochs. (Configuration sketches follow the table.) |
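
For readers reconstructing the two-stage selection, the sketch below is only a rough illustration of the settings reported above, not the authors' implementation (which lives in the BiLAF repository). The feature matrix, the iteration budget, and the coverage-style loss are placeholders we assume for illustration; what is taken from the paper are the Adam learning rate (1e-3) for the ActiveFT-style parameters θ_S, the core numbers K, and the boundary-stage hyperparameters collected in `boundary_cfg`.

```python
import torch
import torch.nn.functional as F

# Assumed input: L2-normalized features from the pretrained backbone.
features = F.normalize(torch.randn(50_000, 512), dim=1)
K = 50  # core number for CIFAR10 (0.1%); 250 for CIFAR100, 6405 for ImageNet-1k

# Core-sample selection stage: optimize theta_S with Adam (lr 1e-3) "until
# convergence"; the objective below is only a stand-in, not ActiveFT's loss.
theta_S = features[torch.randperm(len(features))[:K]].clone().requires_grad_(True)
opt = torch.optim.Adam([theta_S], lr=1e-3)
for _ in range(100):  # placeholder iteration budget
    sim = F.normalize(theta_S, dim=1) @ features.T      # (K, N) cosine similarities
    loss = -sim.max(dim=0).values.mean()                 # stand-in coverage objective
    opt.zero_grad()
    loss.backward()
    opt.step()

# Boundary-sample selection stage: hyperparameters as reported in the paper.
boundary_cfg = dict(
    knn_k=10,                  # nearest-neighbor number k
    removal_ratio=0.10,        # P_rm
    clustering_fraction=0.10,  # P_in
    opponent_penalty=1.1,      # delta
)
```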
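
The supervised finetuning recipe quoted in the Experiment Setup row maps onto a standard PyTorch optimizer and scheduler setup. This is a minimal sketch assuming the backbone and data pipeline come from the mmclassification configs mentioned above; the `nn.Linear` model and the empty epoch loop are stand-ins.

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import CosineAnnealingLR

model = nn.Linear(512, 100)  # stand-in for the pretrained backbone + classification head

# SGD with lr 3e-3, weight decay 1e-4, momentum 0.9, and cosine learning-rate decay.
optimizer = torch.optim.SGD(
    model.parameters(), lr=3e-3, momentum=0.9, weight_decay=1e-4
)
epochs = 1000  # 300 for ImageNet-1k at the 5% sampling ratio
scheduler = CosineAnnealingLR(optimizer, T_max=epochs)

for epoch in range(epochs):
    # ... one epoch over the selected subset, with an effective batch size of 256
    #     split across two GPUs (e.g. via DistributedDataParallel) ...
    scheduler.step()
```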