Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Revisiting Continuity of Image Tokens for Cross-domain Few-shot Learning

Authors: Shuai Yi, Yixiong Zou, Yuhua Li, Ruixuan Li

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments on four CDFSL benchmarks with large domain gaps show that we can outperform state-of-the-art works. Codes and models are available at https://github.com/shuaiyi308/ReCIT. Extensive experiments on four benchmark datasets validate our rationale and state-of-the-art performance.
Researcher Affiliation Academia 1 School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China; 2 School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China. Correspondence to: Yixiong Zou <EMAIL>, Yuhua Li <EMAIL>, Ruixuan Li <EMAIL>.
Pseudocode No The paper describes methods using mathematical formulas and prose (e.g., equations 1-6 describing ViT operations, and later sections describing disruption methods), but it does not contain a distinct pseudocode block or algorithm section.
Open Source Code Yes Codes and models are available at https://github.com/shuaiyi308/ReCIT.
Open Datasets Yes Following current works (Oh et al., 2022), we employ the miniImageNet dataset (Vinyals et al., 2016) as our source domain, and target domains involve four datasets: CropDisease (Mohanty et al., 2016), EuroSAT (Helber et al., 2019), ISIC (Codella et al., 2019), and ChestX (Wang et al., 2017).
Dataset Splits Yes Specifically, we denote the source dataset as D_S = {x_i^S, y_i^S}_{i=1}^{N}, with x_i^S and y_i^S symbolizing the i-th training sample and its corresponding label, respectively. Analogously, D_T = {x_i^T, y_i^T}_{i=1}^{N} represents the target dataset. During the learning and evaluation phases on D_T, to ensure a fair comparison, current research (Fu et al., 2022; Zou et al., b) employs a k-way n-shot paradigm. This involves sampling from D_T to construct limited datasets, known as episodes, each comprising k classes with n training samples per class. Based on these episodes, the model learns from the k×n samples, collectively termed the support set {x_ij^T, y_ij^T}_{i=1,j=1}^{k,n}, and its performance is assessed using testing samples from the same k classes, referred to as the query set {x_q^T}.
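The episodic evaluation protocol quoted above can be sketched in a few lines of Python. This is a minimal illustration of k-way n-shot episode sampling, not the authors' code; the function name, the `dataset_by_class` layout (class label mapped to a list of samples), and the query-set size are assumptions for the example.

```python
import random

def sample_episode(dataset_by_class, k=5, n=1, n_query=15, seed=None):
    """Sample one k-way n-shot episode from a target-domain dataset.

    dataset_by_class: dict mapping class label -> list of samples.
    Returns (support, query), each a list of (sample, episode_label) pairs;
    labels are re-indexed 0..k-1 within the episode.
    """
    rng = random.Random(seed)
    # Pick k classes, then n support + n_query query samples per class.
    classes = rng.sample(sorted(dataset_by_class), k)
    support, query = [], []
    for episode_label, c in enumerate(classes):
        samples = rng.sample(dataset_by_class[c], n + n_query)
        support += [(s, episode_label) for s in samples[:n]]
        query += [(s, episode_label) for s in samples[n:]]
    return support, query
```

For a 5-way 1-shot episode this yields 5 support samples (k×n) and 75 query samples from the same 5 classes, matching the protocol described in the quote.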
Hardware Specification Yes Experiments are conducted on NVIDIA GeForce RTX 3090 GPUs.
Software Dependencies No The paper mentions using DINO pretraining and the Adam optimizer, but it does not specify software dependencies with version numbers (e.g., PyTorch version, CUDA version, Python version, specific library versions).
Experiment Setup Yes In implementation, we set the similarity threshold to 0.3. We adopt ViT-S as our backbone network and initialize it with DINO pretraining on ImageNet following (Caron et al., 2021; Fu et al., 2023; Zhang et al., 2022). Additionally, our model leverages the Adam optimizer (Kingma & Ba, 2017) for 50 epochs, with learning rates of 10^-6 assigned to the backbone network and 10^-3 to the classifier, respectively.
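The two-learning-rate setup quoted above (10^-6 for the backbone, 10^-3 for the classifier) is the standard "parameter group" pattern, e.g. as accepted by `torch.optim.Adam`. The sketch below illustrates the idea in plain Python with a bare SGD-style update for brevity; the paper uses Adam, and all function names here are illustrative, not from the authors' code.

```python
def make_param_groups(backbone_params, classifier_params,
                      backbone_lr=1e-6, classifier_lr=1e-3):
    """Build optimizer parameter groups, each with its own learning rate.

    Params are given as dicts mapping parameter name -> current value.
    This mirrors the list-of-dicts format torch.optim optimizers accept.
    """
    return [
        {"params": backbone_params, "lr": backbone_lr},
        {"params": classifier_params, "lr": classifier_lr},
    ]

def sgd_step(groups, grads):
    """Apply one gradient step, using each group's own learning rate."""
    for group in groups:
        for name in group["params"]:
            group["params"][name] -= group["lr"] * grads[name]
```

With real PyTorch modules one would pass `[{"params": backbone.parameters(), "lr": 1e-6}, {"params": classifier.parameters(), "lr": 1e-3}]` to `torch.optim.Adam`; the effect is the same, a much smaller step on the pretrained backbone than on the freshly initialized classifier.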