Auto-Prox: Training-Free Vision Transformer Architecture Search via Automatic Proxy Discovery
Authors: Zimian Wei, Peijie Dong, Zheng Hui, Anggeng Li, Lujun Li, Menglong Lu, Hengyue Pan, Dongsheng Li
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that our method generalizes well to different datasets and achieves state-of-the-art results both in ranking correlation and final accuracy. We conduct extensive experiments on CIFAR-100, Flowers, Chaoyang (Zhu et al. 2021), and ImageNet-1K to validate the superiority of our proposed method. |
| Researcher Affiliation | Collaboration | 1 National University of Defense Technology 2The Hong Kong University of Science and Technology (Guangzhou) 3Columbia University 4Huawei 5The Hong Kong University of Science and Technology |
| Pseudocode | Yes | Algorithm 1: Evolutionary Search for Auto-Prox. Input: search space S, population P, max iteration T, sample ratio r, sampled pool R, top-k k, margin m. Output: Auto-Prox with best JCM. 1: P_0 := Initialize_population(P_i); 2: Sample pool R := ∅; 3: for i = 1, 2, ..., T do 4: Clear sample pool R := ∅; 5: Randomly select R ⊆ P; 6: Candidates G_i^k := Get_Topk(R, k); 7: Parent G_i^p := Random_Select(G_i^k); 8: Mutant G_i^m := MUTATE(G_i^p); 9: // Elitism-Preserve Strategy. 10: if JCM(G_i^m) − JCM(G_i^p) ≥ m then 11: Append G_i^m to P; 12: else 13: Go to line 8; 14: end if 15: Remove the zero-cost proxy with the lowest JCM; 16: end for |
| Open Source Code | Yes | Codes can be found at https://github.com/lilujunai/Auto-Prox-AAAI24. |
| Open Datasets | Yes | First, we build the ViT-Bench-101, which involves different ViT candidates and their actual performance on multiple datasets. For the tiny datasets, we employ CIFAR-100 (Krizhevsky 2009), Flowers (Nilsback and Zisserman 2008), and Chaoyang (Zhu et al. 2021), while for the large-scale datasets, we focus on ImageNet-1K. |
| Dataset Splits | Yes | We partition the whole ViT-Bench-101 dataset into a validation set (60%) for proxy searching and a test set (40%) for proxy evaluation. There is no overlap between these two sets. |
| Hardware Specification | Yes | The zero-cost proxy search process is conducted on a single NVIDIA A40 GPU and occupies the memory of only one ViT. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions) are mentioned for the experimental setup. |
| Experiment Setup | Yes | In the evolutionary search process, we employ a population size of P = 20, and the total number of iterations T is set to 200. When conducting mutation, the probability of mutation for a single node in a zero-cost proxy representation is set to 0.5. The margin m in the Elitism-Preserve Strategy is 0.1. |
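The evolutionary loop in Algorithm 1 and the hyperparameters above (population P = 20, T = 200 iterations, margin m = 0.1) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `search_space`, `random_proxy`, `mutate`, and the `jcm` scoring callable (standing in for the paper's JCM ranking-consistency objective) are all hypothetical names, and the sample ratio and top-k values are assumptions since the table does not report them.

```python
import random

def evolutionary_search(search_space, jcm, population_size=20, iterations=200,
                        sample_ratio=0.5, topk=5, margin=0.1, max_retries=50):
    """Sketch of Algorithm 1: evolutionary proxy search with an
    elitism-preserve strategy. `search_space` supplies random_proxy() and
    mutate(proxy); `jcm` scores a candidate proxy (higher is better)."""
    # Initialize the population with random zero-cost proxy candidates.
    population = [search_space.random_proxy() for _ in range(population_size)]
    for _ in range(iterations):
        # Randomly sample a pool R from the population P.
        pool = random.sample(population, max(1, int(sample_ratio * len(population))))
        # Keep the top-k pool members by JCM and pick a parent among them.
        top = sorted(pool, key=jcm, reverse=True)[:topk]
        parent = random.choice(top)
        # Elitism-preserve strategy: re-mutate (lines 8-13) until the mutant
        # beats its parent by at least the margin m; cap retries for safety.
        accepted = False
        for _ in range(max_retries):
            mutant = search_space.mutate(parent)
            if jcm(mutant) - jcm(parent) >= margin:
                population.append(mutant)
                accepted = True
                break
        if accepted:
            # Remove the proxy with the lowest JCM to keep |P| constant.
            population.remove(min(population, key=jcm))
    return max(population, key=jcm)
```

On a toy space where "proxies" are plain numbers and `jcm` is the identity, the loop steadily raises the best score, which is the intended behavior of the margin-gated acceptance rule.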