Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Open-Vocabulary Customization from CLIP via Data-Free Knowledge Distillation
Authors: Yongxian Wei, Zixuan Hu, Li Shen, Zhenyi Wang, Chun Yuan, Dacheng Tao
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments showcase the superiority of our approach across twelve customized tasks, achieving a 9.33% improvement compared to existing DFKD methods. |
| Researcher Affiliation | Academia | (1) Tsinghua University, China; (2) Nanyang Technological University, Singapore; (3) Shenzhen Campus of Sun Yat-sen University, China; (4) University of Maryland, College Park, USA |
| Pseudocode | No | The paper includes equations and figures illustrating the framework (e.g., Figure 1), but no explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The paper mentions that DFKD 'elegantly resolves these issues with open-sourced pre-trained models' and that 'Our method is an open-vocabulary, customized approach suitable for any category recognized by CLIP.' However, it does not explicitly state that the authors are releasing the code for *their* described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We perform model inversion from texts sourced from datasets including Caltech-101 (Fei-Fei et al., 2004) (101 categories), ImageNet-1K (Deng et al., 2009) (1000 categories), or Flower-102 (Nilsback & Zisserman, 2008) (102 fine-grained categories). |
| Dataset Splits | No | We randomly divide ImageNet-1K into 10 splits to simulate a real customization scenario as closely as possible, reporting average results to demonstrate the robustness of our method. Each task includes over 100 categories encompassing a wide range of natural categories. Further details regarding data statistics are provided in App. H. The student model is evaluated on these datasets, including the test set of ImageNet, and the complete datasets of Caltech-101 and Flower-102, with the classification accuracy (in %) reported. While the paper mentions dividing ImageNet-1K into 10 splits and evaluating on test sets, it does not provide the specific percentages, sample counts, or explicit train/validation/test splits needed for reproduction. |
| Hardware Specification | Yes | SDD is a preliminary step with low complexity, taking only 57 seconds to train on an RTX 4090. |
| Software Dependencies | No | The paper mentions using VQGAN and CLIP, but does not specify version numbers for these or other software libraries/frameworks (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | The batch size for prompt learning is set to 64, with a learning rate of 0.01. Surrogate images are synthesized with a resolution of 224 × 224, and optimized using the Adam optimizer with a learning rate of 0.1 for 400 iterations. For text-based customization, 64 images are generated per class. For image-based customization, each class has 4 example images, and 24 additional images are synthesized per class. The inner loop learning rate α and outer loop learning rate for meta knowledge distillation are both set to 0.001, utilizing the SGD optimizer. |
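The hyperparameters quoted in the Experiment Setup row can be collected into a single configuration sketch. This is a hypothetical illustration only: the authors do not release code, so all names below are assumptions, and only the numeric values come from the paper.

```python
# Hypothetical configuration mirroring the hyperparameters reported in the
# paper's experiment setup. Structure and key names are illustrative, not
# taken from the (unreleased) author code; values are quoted from the paper.
CONFIG = {
    "prompt_learning": {"batch_size": 64, "lr": 0.01},
    "surrogate_synthesis": {
        "resolution": (224, 224),   # synthesized image size
        "optimizer": "Adam",
        "lr": 0.1,
        "iterations": 400,
    },
    "text_based_customization": {"synthesized_per_class": 64},
    "image_based_customization": {
        "example_images_per_class": 4,
        "synthesized_per_class": 24,
    },
    "meta_knowledge_distillation": {
        "inner_lr": 0.001,   # inner-loop learning rate (alpha)
        "outer_lr": 0.001,   # outer-loop learning rate
        "optimizer": "SGD",
    },
}

def images_per_class(mode: str) -> int:
    """Total images available per class under each customization mode."""
    if mode == "text":
        return CONFIG["text_based_customization"]["synthesized_per_class"]
    if mode == "image":
        cfg = CONFIG["image_based_customization"]
        return cfg["example_images_per_class"] + cfg["synthesized_per_class"]
    raise ValueError(f"unknown mode: {mode}")

print(images_per_class("text"))   # 64
print(images_per_class("image"))  # 28
```

The helper simply makes explicit that image-based customization uses 4 real examples plus 24 synthesized images (28 per class), versus 64 purely synthesized images in the text-based setting.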