Evaluating Efficient Performance Estimators of Neural Architectures
Authors: Xuefei Ning, Changcheng Tang, Wenshuo Li, Zixuan Zhou, Shuang Liang, Huazhong Yang, Yu Wang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we conduct an extensive and organized assessment of OSEs and ZSEs on five NAS benchmarks: NAS-Bench-101/201/301, and NDS ResNet/ResNeXt-A. Specifically, we employ a set of NAS-oriented criteria to study the behavior of OSEs and ZSEs, and reveal their biases and variances. |
| Researcher Affiliation | Collaboration | Department of Electronic Engineering, Tsinghua University; Novauto Technology Co. Ltd. |
| Pseudocode | No | The paper does not contain any sections explicitly labeled as 'Pseudocode' or 'Algorithm', nor are there structured, code-like blocks describing a procedure. |
| Open Source Code | Yes | The code is available at https://github.com/walkerning/aw_nas [24]. |
| Open Datasets | Yes | In this paper, we conduct an extensive and organized assessment of OSEs and ZSEs on five NAS benchmarks: NAS-Bench-101/201/301, and NDS ResNet/ResNeXt-A. |
| Dataset Splits | Yes | We inspect OSEs ranking quality when using different numbers of validation data batches to evaluate the OS scores, and find that on both NB201/NB301, using more data improves the estimation quality. Specifically, we compute the average OS accuracies over N validation batches, where each batch contains 128 examples. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Unless otherwise noted, MC sample S=1 is used in the experiments. And all training and evaluation settings are summarized in Appendix D. ... Specifically, we compute the average OS accuracies over N validation batches, where each batch contains 128 examples. |
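
The averaging step quoted in the Dataset Splits and Experiment Setup rows can be sketched as follows. This is a minimal illustration and not the authors' aw_nas code: the names `batch_accuracy` and `mean_os_accuracy` are hypothetical, and the sketch assumes that the one-shot (OS) predictions for each validation batch are already available as integer class labels.

```python
from typing import List, Sequence, Tuple

# One validation batch: (predicted labels, ground-truth labels).
# Assumption: predictions were already produced by the one-shot supernet.
Batch = Tuple[Sequence[int], Sequence[int]]

BATCH_SIZE = 128  # batch size quoted in the table above


def batch_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of examples in one batch that are classified correctly."""
    if len(preds) != len(labels) or not labels:
        raise ValueError("preds and labels must be non-empty and equal length")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def mean_os_accuracy(batches: List[Batch], n_batches: int) -> float:
    """Average the per-batch accuracy over the first n_batches batches."""
    if not 1 <= n_batches <= len(batches):
        raise ValueError("n_batches must be between 1 and len(batches)")
    used = batches[:n_batches]
    return sum(batch_accuracy(p, y) for p, y in used) / n_batches
```

Here `n_batches` plays the role of N in the quoted text, so calling `mean_os_accuracy` with a larger `n_batches` corresponds to scoring an architecture on more validation data.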