Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts
Authors: Jiang-Xin Shi, Tong Wei, Zhi Zhou, Jie-Jing Shao, Xin-Yan Han, Yu-Feng Li
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments clearly verify that both the training time and the learned parameters are significantly reduced with more accurate predictive performance compared with state-of-the-art approaches. |
| Researcher Affiliation | Academia | ¹National Key Laboratory for Novel Software Technology, Nanjing University, China; ²School of Artificial Intelligence, Nanjing University, China; ³School of Computer Science and Engineering, Southeast University, China; ⁴Key Laboratory of Computer Network and Information Integration, Southeast University, Ministry of Education, China. |
| Pseudocode | Yes | Algorithm 1 TEST-TIME ENSEMBLING |
| Open Source Code | Yes | The implementation code is available at https://github.com/shijxcs/LIFT. |
| Open Datasets | Yes | We conduct experiments on four long-tail datasets, including ImageNet-LT (Liu et al., 2019), Places-LT (Liu et al., 2019), iNaturalist 2018 (Van Horn et al., 2018) and CIFAR-100-LT (Cao et al., 2019). |
| Dataset Splits | No | The paper describes evaluation across head, medium, and tail classes and mentions training epochs but does not specify training/validation/test dataset splits (e.g., percentages or counts) or reference predefined splits for reproducibility. |
| Hardware Specification | Yes | All experiments are conducted on a single NVIDIA A800 GPU. |
| Software Dependencies | No | The paper mentions using the SGD optimizer with specific parameters but does not list any specific software dependencies (e.g., Python, PyTorch, or TensorFlow versions) required for replication. |
| Experiment Setup | Yes | For all experiments, we use the SGD optimizer with a batch size of 128, weight decay of 5×10⁻⁴, and momentum of 0.9. For lightweight fine-tuning methods, the learning rate is 0.01. For full fine-tuning, we search the learning rate from {0.02, 0.01, 0.005, 0.002, 0.001, 0.0005} considering its weak stability. For ImageNet-LT, Places-LT, and CIFAR-100-LT, we train the model for only 10 epochs; and for iNaturalist 2018, we train 20 epochs considering that it has much more data. [A minimal configuration sketch follows this table.] |
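
The Experiment Setup row above fully specifies the optimizer configuration. Below is a minimal PyTorch sketch of that configuration, assuming a placeholder classifier and a standard data loader; it is not the authors' LIFT implementation (see https://github.com/shijxcs/LIFT), only an illustration of the reported hyperparameters.

```python
import torch
from torch import nn, optim

# Hypothetical placeholder classifier; the paper fine-tunes a foundation model
# with lightweight modules, which is not reproduced here.
model = nn.Linear(512, 100)
criterion = nn.CrossEntropyLoss()

# Hyperparameters as reported in the Experiment Setup row:
# SGD with momentum 0.9, weight decay 5e-4, batch size 128,
# and learning rate 0.01 for lightweight fine-tuning.
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)

# 10 epochs for ImageNet-LT, Places-LT, CIFAR-100-LT; 20 for iNaturalist 2018.
num_epochs = 10

def train_one_run(loader):
    # `loader` is assumed to yield (features, labels) batches of size 128.
    for _ in range(num_epochs):
        for inputs, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(inputs), labels)
            loss.backward()
            optimizer.step()
```

The Pseudocode row cites Algorithm 1 (TEST-TIME ENSEMBLING). The sketch below shows generic test-time ensembling, i.e., averaging softmax outputs over several augmented views of one test image; how the views are produced is an assumption here, and the paper's Algorithm 1 may differ in its details.

```python
import torch

@torch.no_grad()
def test_time_ensemble(model, views):
    # `views`: tensor of shape (num_views, C, H, W) holding augmented copies
    # of a single test image; the augmentation policy is an assumption here.
    probs = torch.softmax(model(views), dim=-1)  # (num_views, num_classes)
    return probs.mean(dim=0)                     # averaged class probabilities
```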