Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Revisiting Continuity of Image Tokens for Cross-domain Few-shot Learning
Authors: Shuai Yi, Yixiong Zou, Yuhua Li, Ruixuan Li
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on four CDFSL benchmarks with large domain gaps show that we can outperform state-of-the-art works. Codes and models are available at https://github.com/shuaiyi308/ReCIT. Extensive experiments on four benchmark datasets validate our rationale and state-of-the-art performance. |
| Researcher Affiliation | Academia | 1School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China 2School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China. Correspondence to: Yixiong Zou <EMAIL>, Yuhua Li <EMAIL>, Ruixuan Li <EMAIL>. |
| Pseudocode | No | The paper describes methods using mathematical formulas and prose (e.g., equations 1-6 describing ViT operations, and later sections describing disruption methods), but it does not contain a distinct pseudocode block or algorithm section. |
| Open Source Code | Yes | Codes and models are available at https://github.com/shuaiyi308/ReCIT. |
| Open Datasets | Yes | Following current works (Oh et al., 2022), we employ the miniImageNet dataset (Vinyals et al., 2016) as our source domain, and target domains involve four datasets: CropDisease (Mohanty et al., 2016), EuroSAT (Helber et al., 2019), ISIC (Codella et al., 2019), and ChestX (Wang et al., 2017). |
| Dataset Splits | Yes | Specifically, we denote the source dataset as $D_S = \{x_i^S, y_i^S\}_{i=1}^N$, with $x_i^S$ and $y_i^S$ symbolizing the $i$th training sample and its corresponding label, respectively. Analogously, $D_T = \{x_i^T, y_i^T\}_{i=1}^N$ represents the target dataset. During the learning and evaluation phases on $D_T$, to ensure a fair comparison, current research (Fu et al., 2022; Zou et al., b) employs a $k$-way $n$-shot paradigm. This involves sampling from $D_T$ to construct limited datasets, known as episodes, each comprising $k$ classes with $n$ training samples per class. Based on these episodes, the model learns from the $k \cdot n$ samples, collectively termed the support set $\{x_{ij}^T, y_{ij}^T\}_{i=1,j=1}^{k,n}$, and its performance is assessed using testing samples from the same $k$ classes, referred to as the query set $\{x_q^T\}$. |
| Hardware Specification | Yes | Experiments are conducted on NVIDIA GeForce RTX 3090 GPUs. |
| Software Dependencies | No | The paper mentions using DINO pretraining and the Adam optimizer, but it does not specify software dependencies with version numbers (e.g., PyTorch version, CUDA version, Python version, specific library versions). |
| Experiment Setup | Yes | In implementation, we set the similarity threshold to 0.3. We adopt ViT-S as our backbone network and initialize it with DINO pretraining on ImageNet following (Caron et al., 2021; Fu et al., 2023; Zhang et al., 2022). Additionally, our model leverages the Adam optimizer (Kingma & Ba, 2017) for 50 epochs, with a learning rate of $10^{-6}$ assigned to the backbone network and $10^{-3}$ to the classifier, respectively. |
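The $k$-way $n$-shot episodic evaluation quoted under Dataset Splits can be sketched as below. This is a minimal illustration, not the authors' code: the function name, the dict-of-lists data layout, and the query-set size are assumptions for the example.

```python
import random

def sample_episode(dataset, k=5, n=1, q=15, seed=None):
    """Sample one k-way n-shot episode.

    dataset: dict mapping class label -> list of samples (hypothetical layout).
    Returns (support, query): lists of (sample, label) pairs, where the
    support set holds k*n labeled samples and the query set holds k*q
    samples drawn from the same k classes.
    """
    rng = random.Random(seed)
    classes = rng.sample(sorted(dataset), k)          # pick k target classes
    support, query = [], []
    for label in classes:
        items = rng.sample(dataset[label], n + q)     # disjoint support/query
        support += [(x, label) for x in items[:n]]
        query += [(x, label) for x in items[n:]]
    return support, query
```

For the 5-way 1-shot setting common in CDFSL, each episode yields a support set of 5 samples (one per class) and a query set drawn from those same 5 classes.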
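The two-rate optimizer setup from the Experiment Setup row ($10^{-6}$ for the backbone, $10^{-3}$ for the classifier) maps directly onto Adam parameter groups. A minimal sketch, assuming PyTorch and using small stand-in modules in place of the actual ViT-S backbone and classifier head:

```python
import torch
from torch import nn

# Stand-ins for illustration only; the paper uses a DINO-pretrained ViT-S
# backbone and a k-way classifier head.
backbone = nn.Linear(384, 384)
classifier = nn.Linear(384, 5)

# One Adam optimizer with two parameter groups at different learning rates,
# matching the rates reported in the paper's setup.
optimizer = torch.optim.Adam([
    {"params": backbone.parameters(), "lr": 1e-6},
    {"params": classifier.parameters(), "lr": 1e-3},
])
```

Using parameter groups in a single optimizer (rather than two optimizers) keeps a shared `step()`/`zero_grad()` cycle while letting the pretrained backbone update far more conservatively than the randomly initialized classifier.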