Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Vision Transformers as Probabilistic Expansion from Learngene
Authors: Qiufeng Wang, Xu Yang, Haokun Chen, Xin Geng
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments demonstrate the effectiveness of PEG and outperforming traditional initialization strategies. |
| Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, Southeast University, Nanjing 210096, China; (2) Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about the release of source code or a direct link to a code repository. |
| Open Datasets | Yes | Datasets. After initializing the descendant models with the learngene, we fine-tune them on various downstream tasks, including Oxford Flowers (Nilsback & Zisserman, 2008), CUB-200-2011 (Wah et al., 2011), Stanford Cars (Gebru et al., 2017), CIFAR-10 (Krizhevsky et al., 2009), CIFAR-100 (Krizhevsky et al., 2009), Food101 (Bossard et al., 2014), iNaturalist-2019 (Tan et al., 2019), ImageNet-1K (Deng et al., 2009). For detailed dataset descriptions, see Appendix A. |
| Dataset Splits | Yes | Table 6. Characteristics of the downstream datasets. Columns: Dataset, #Total, #Training, #Validation, #Testing, #Classes. Oxford Flowers... #Total 8,189; #Training 1,020; #Validation 1,020; #Testing 6,149; #Classes 102 |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., specific GPU/CPU models). |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | Training settings. During the learngene expanding phase, we train the learnable parameters for 100 epochs before expanding them into descendant models of elastic scales. After this, we fine-tune these descendant models on downstream tasks for 500 epochs, which includes a 10-epoch warm-up period. The only exception is iNaturalist-2019, where we train for 100 epochs with a 5-epoch warm-up. For all tasks, the initial learning rate is set to $5 \times 10^{-4}$ and a weight decay of 0.05 is applied. |
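
The training settings quoted in the Experiment Setup row can be collected into a small configuration sketch. This is only an illustration of the reported numbers, not the authors' code: the names `FineTuneConfig`, `config_for`, and `warmup_lr` are hypothetical, and the linear shape of the warm-up is an assumption, since the quoted text gives only the warm-up length. The optimizer and the schedule after warm-up are not stated in the excerpt and are left out.

```python
from dataclasses import dataclass

# Epochs for the learngene expanding phase, as reported in the paper excerpt.
LEARNGENE_EXPAND_EPOCHS = 100


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the reported fine-tuning settings."""
    epochs: int
    warmup_epochs: int
    base_lr: float = 5e-4        # initial learning rate reported for all tasks
    weight_decay: float = 0.05   # weight decay reported for all tasks


def config_for(dataset: str) -> FineTuneConfig:
    """Return the reported schedule for a downstream dataset."""
    # iNaturalist-2019 is the only stated exception: 100 epochs, 5 warm-up.
    if dataset == "iNaturalist-2019":
        return FineTuneConfig(epochs=100, warmup_epochs=5)
    # All other downstream tasks: 500 epochs, including 10 warm-up epochs.
    return FineTuneConfig(epochs=500, warmup_epochs=10)


def warmup_lr(cfg: FineTuneConfig, epoch: int) -> float:
    """Learning rate at a 0-indexed epoch during or after warm-up.

    ASSUMPTION: the warm-up is linear; the excerpt does not say so.
    After warm-up this sketch simply returns the base rate, because the
    later schedule is not specified in the excerpt.
    """
    if epoch < cfg.warmup_epochs:
        return cfg.base_lr * (epoch + 1) / cfg.warmup_epochs
    return cfg.base_lr
```

For example, `config_for("CIFAR-10")` yields the 500-epoch schedule, while `config_for("iNaturalist-2019")` yields the shorter 100-epoch one; both share the same learning rate and weight decay.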