A Closer Look at the CLS Token for Cross-Domain Few-Shot Learning
Authors: Yixiong Zou, Shuai Yi, Yuhua Li, Ruixuan Li
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on four benchmarks validate our rationale and state-of-the-art performance. |
| Researcher Affiliation | Academia | Yixiong Zou¹, Shuai Yi², Yuhua Li¹, Ruixuan Li¹; ¹School of Computer Science and Technology, ²School of Artificial Intelligence and Automation, Huazhong University of Science and Technology |
| Pseudocode | No | Not found. The paper describes the method narratively but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our codes are available at https://github.com/Zoilsen/CLS_Token_CDFSL. |
| Open Datasets | Yes | Following current works [28, 33], we utilize the miniImageNet dataset [34] as our source domain, with around 60k images and 100 annotated classes. We train our models on the training split of the source dataset, then finetune and evaluate generalization performance on four target-domain datasets, CropDisease [25], EuroSAT [16], ISIC [5], and ChestX [38], which are cross-domain datasets from the domains of agriculture, remote sensing, and medical imaging (with significant domain gaps). |
| Dataset Splits | Yes | During learning on each target dataset, the model learns from the sampled support set $\{x_{ij}^T, y_{ij}^T\}_{i=1,j=1}^{K,M}$, which is called a K-way M-shot task (i.e., K classes in each support set, with M samples per class). Finally, the model is evaluated on the query set $\{x_q^T\}$. (A minimal episode-sampling sketch follows the table.) |
| Hardware Specification | Yes | Experiments are conducted on NVIDIA A5000 GPUs. |
| Software Dependencies | No | Not found. The paper does not specify version numbers for ancillary software dependencies. |
| Experiment Setup | Yes | In implementation, we set the domain number to 64, i.e., each source-domain class has a specific domain token. We set λ to 100 to keep the two losses on the same scale. We follow [1, 13, 42] in taking DINO on ImageNet as the pretraining of our backbone network, and scale the learning rate of the domain token to 1% of that of the backbone network. (An optimizer sketch follows the table.) |
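
The K-way M-shot protocol quoted under Dataset Splits can be made concrete with a short sampling routine. This is a minimal sketch, not the authors' code: `sample_episode`, its argument names, and the assumption that the target dataset is an iterable of `(image, label)` pairs are all illustrative.

```python
import random
from collections import defaultdict

def sample_episode(dataset, k_way=5, m_shot=5, q_query=15):
    """Sample one K-way M-shot episode (illustrative; not the paper's code).

    `dataset` is assumed to be an iterable of (image, label) pairs.
    Returns a support set of K classes with M samples each, plus a
    disjoint query set drawn from the same K classes.
    """
    by_class = defaultdict(list)
    for image, label in dataset:
        by_class[label].append(image)

    # Keep only classes with enough samples for both support and query.
    eligible = [c for c, imgs in by_class.items() if len(imgs) >= m_shot + q_query]
    classes = random.sample(eligible, k_way)

    support, query = [], []
    for episode_label, cls in enumerate(classes):
        images = random.sample(by_class[cls], m_shot + q_query)
        support += [(img, episode_label) for img in images[:m_shot]]
        query += [(img, episode_label) for img in images[m_shot:]]
    return support, query
```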
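
The Experiment Setup row mentions two concrete knobs: a loss weight λ = 100 and a domain-token learning rate set to 1% of the backbone's. Below is a hedged PyTorch sketch of how such a configuration might look; `TokenizedBackbone`, the embedding dimension, and the base learning rate are assumptions, since the quoted text does not specify them.

```python
import torch
import torch.nn as nn

class TokenizedBackbone(nn.Module):
    """Toy stand-in for the paper's backbone (hypothetical structure)."""
    def __init__(self, dim=384, num_domains=64):
        super().__init__()
        # 64 learnable domain tokens, one per source-domain class.
        self.domain_tokens = nn.Parameter(torch.zeros(num_domains, dim))
        self.backbone = nn.Linear(dim, dim)  # placeholder for the DINO-pretrained ViT

model = TokenizedBackbone()
base_lr = 1e-3  # assumed value; the quoted text gives only the 1% ratio

# Separate parameter groups so the domain tokens train at 1% of the backbone lr.
optimizer = torch.optim.AdamW([
    {"params": model.backbone.parameters(), "lr": base_lr},
    {"params": [model.domain_tokens], "lr": base_lr * 0.01},
])

lam = 100.0  # lambda = 100 keeps the two losses on the same scale
# total_loss = cls_loss + lam * domain_loss  # loss terms defined elsewhere
```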