BoostAdapter: Improving Vision-Language Test-Time Adaptation via Regional Bootstrapping

Authors: Taolin Zhang, Jinpeng Wang, Hang Guo, Tao Dai, Bin Chen, Shu-Tao Xia

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments conducted over two benchmarks demonstrate the superior performance of BoostAdapter under test-time adaptation settings.
Researcher Affiliation | Academia | (1) Tsinghua University, (2) Shenzhen University, (3) Harbin Institute of Technology, (4) Peng Cheng Laboratory
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | https://github.com/taolinzhang/BoostAdapter
Open Datasets | Yes | The OOD benchmark evaluates the model's robustness to natural distribution shifts on 4 ImageNet [4] variants, including ImageNet-V2 [37], ImageNet-Sketch [44], ImageNet-A [14], and ImageNet-R [13]. We evaluate the transferring performance on 10 datasets in the Cross-Domain benchmark: Aircraft [31], Caltech101 [5], Cars [19], DTD [3], EuroSAT [11], Flower102 [32], Food101 [2], Pets [34], SUN397 [48], and UCF101 [42]. (The two suites are listed as a config sketch after this table.)
Dataset Splits | Yes | We follow the split in [55] and report the top-1 accuracy. (A sketch of the metric follows the table.)
Hardware Specification | Yes | All our experiments are conducted with an NVIDIA RTX 3090 (24 GB) GPU.
Software Dependencies | No | The paper mentions using 'a pre-trained ViT-B/16 of CLIP as the foundation model' but does not specify software dependencies such as programming-language or library versions (e.g., PyTorch, TensorFlow). (A hedged loading sketch follows the table.)
Experiment Setup | Yes | In test-time adaptation, the batch size is set to 1. ... we empirically set the entropy threshold percentile to p = 0.1 and filter 64 augmented views based on random cropping to obtain the boosting samples. (A sketch of this filtering step follows the table.)
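
For reference, the two evaluation suites quoted in the Open Datasets row can be written down as a plain config mapping. This is a hypothetical structure for bookkeeping only; the dataset names come from the paper, but the dictionary layout and key names are illustrative.

```python
# Hypothetical benchmark config; dataset names are from the paper,
# keys and structure are illustrative.
BENCHMARKS = {
    # OOD benchmark: four ImageNet variants probing natural distribution shifts.
    "ood": ["ImageNet-V2", "ImageNet-Sketch", "ImageNet-A", "ImageNet-R"],
    # Cross-Domain benchmark: ten fine-grained classification datasets.
    "cross_domain": [
        "Aircraft", "Caltech101", "Cars", "DTD", "EuroSAT",
        "Flower102", "Food101", "Pets", "SUN397", "UCF101",
    ],
}
```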
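
The Dataset Splits row reports top-1 accuracy. A minimal sketch of that metric, assuming PyTorch tensors; the function name is illustrative:

```python
import torch

def top1_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Fraction of samples whose highest-scoring class equals the label."""
    return (logits.argmax(dim=-1) == labels).float().mean().item()
```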
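
The Software Dependencies row notes that only the foundation model (CLIP ViT-B/16) is named. Below is a minimal loading sketch assuming the OpenAI `clip` package (https://github.com/openai/CLIP); the paper does not state which implementation the authors actually used.

```python
import torch
import clip  # assumed implementation; not specified by the paper

device = "cuda" if torch.cuda.is_available() else "cpu"
# Returns the CLIP model and its matching image preprocessing transform.
model, preprocess = clip.load("ViT-B/16", device=device)
model.eval()
```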
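
The Experiment Setup row describes how boosting samples are obtained: for a single test image (batch size 1), 64 random-crop views are scored by prediction entropy, and only the lowest-entropy fraction (percentile p = 0.1) is kept. A hedged sketch of that filtering step, with illustrative names rather than the authors' code:

```python
import torch

def select_boosting_samples(logits: torch.Tensor, p: float = 0.1) -> torch.Tensor:
    """Keep the fraction p of augmented views with the lowest prediction entropy.

    logits: (num_views, num_classes) similarity logits, e.g. 64 random-crop
    views of one test image scored by CLIP.
    """
    probs = logits.softmax(dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)  # per-view entropy
    k = max(1, int(p * logits.size(0)))            # p = 0.1 over 64 views -> 6 kept
    keep = entropy.topk(k, largest=False).indices  # indices of lowest-entropy views
    return logits[keep]

# Usage with random logits standing in for CLIP scores over 64 crops:
boosting = select_boosting_samples(torch.randn(64, 1000))  # shape (6, 1000)
```

Low entropy acts as a confidence proxy here: views the model classifies decisively are treated as the reliable boosting samples used during test-time adaptation.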