Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
CF-ViT: A General Coarse-to-Fine Method for Vision Transformer
Authors: Mengzhao Chen, Mingbao Lin, Ke Li, Yunhang Shen, Yongjian Wu, Fei Chao, Rongrong Ji
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the efficacy of our CF-ViT. For example, without any compromise on performance, CF-ViT reduces 53% FLOPs of LV-ViT, and also achieves 2.01× throughput. |
| Researcher Affiliation | Collaboration | ¹MAC Lab, Department of Artificial Intelligence, Xiamen University; ²Institute of Artificial Intelligence, Xiamen University; ³Tencent Youtu Lab |
| Pseudocode | No | The paper includes figures illustrating the model architecture (e.g., Figure 2, Figure 3, Figure 4) but does not provide any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code of this project is at https://github.com/ChenMnZ/CF-ViT. |
| Open Datasets | Yes | We conduct the experiments on ImageNet (Deng et al. 2009) |
| Dataset Splits | Yes | We feed the model 50,000 images in the validation set of ImageNet with a batch size of 1,024, and record the total inference time. [...] We conduct a toy experiment on the validation set of ImageNet (Deng et al. 2009) with a pre-trained DeiT-S model (Touvron et al. 2021a). |
| Hardware Specification | Yes | Our CF-ViT model is trained on a workstation with 4 A100 GPUs. [...] The model throughput is measured as the number of processed images per second on a single A100 GPU. |
| Software Dependencies | No | The paper states 'All training settings of our CF-ViT, such as image processing, learning rate, etc, are to follow these of DeiT and LV-ViT.' but does not list any specific software libraries or frameworks with their version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, CUDA 11.x). |
| Experiment Setup | Yes | In the training phase, only conducting the fine-grained splitting at informative regions would affect the convergence. Therefore, we split the entire image into fine-grained patches in the first 200 epochs, and select informative coarse patches for fine-grained splitting in the remaining training process. [...] We feed the model 50,000 images in the validation set of ImageNet with a batch size of 1,024, and record the total inference time. [...] we set the confidence threshold η = 1... [...] For a trade-off, we set α to 0.5 in our implementation. [...] $a_k = \beta \cdot a_{k-1} + (1 - \beta) \cdot a^0_k$, where β = 0.99. |
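
Two quantities quoted in the table are simple enough to illustrate: the moving-average update with β = 0.99 from the Experiment Setup row, and throughput as images processed per second. The following is a minimal sketch, not the authors' implementation. The function names, the use of plain Python lists for per-patch attention values, and the choice to initialise the running average with the first layer's values are all assumptions made for illustration.

```python
from typing import List, Sequence

BETA = 0.99  # smoothing factor quoted in the Experiment Setup row


def ema_update(previous: Sequence[float],
               current: Sequence[float],
               beta: float = BETA) -> List[float]:
    """One moving-average step, applied element-wise per patch:
    new = beta * previous + (1 - beta) * current.
    """
    if len(previous) != len(current):
        raise ValueError("attention vectors must have the same length")
    return [beta * p + (1.0 - beta) * c for p, c in zip(previous, current)]


def accumulate(per_layer_attention: Sequence[Sequence[float]],
               beta: float = BETA) -> List[float]:
    """Fold the update over successive layers.

    Assumption: the running average starts from the first layer's values.
    """
    if not per_layer_attention:
        raise ValueError("need at least one layer of attention values")
    running = list(per_layer_attention[0])
    for layer in per_layer_attention[1:]:
        running = ema_update(running, layer, beta)
    return running


def throughput(num_images: int, total_seconds: float) -> float:
    """Images processed per second, given a measured total inference time."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return num_images / total_seconds
```

With β this close to 1, each new layer shifts the running average only slightly, so the accumulated values change slowly across layers. For throughput, `num_images` would be 50,000 for the validation run described in the table, and `total_seconds` is whatever total inference time was recorded.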