Interpolation between Residual and Non-Residual Networks

Authors: Zonghan Yang, Yang Liu, Chenglong Bao, Zuoqiang Shi

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on a number of image classification benchmarks show that the proposed model substantially improves the accuracy of ResNet and ResNeXt over the perturbed inputs from both stochastic noise and adversarial attack methods.
Researcher Affiliation | Academia | 1) Institute for Artificial Intelligence, Beijing National Research Center for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University. 2) Yau Mathematical Sciences Center, Tsinghua University. 3) Department of Mathematical Sciences, Tsinghua University.
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any links to open-source code or explicitly state that code is available.
Open Datasets | Yes | We evaluate our proposed model on CIFAR-10 and CIFAR-100 benchmarks, training and testing with the originally given dataset. For stochastic noise, we leverage the stochastic noise groups in the CIFAR-10-C and CIFAR-100-C datasets (Hendrycks & Dietterich, 2019) for testing.
Dataset Splits | No | The paper mentions 'training and testing with the originally given dataset' for CIFAR-10 and CIFAR-100, which have standard train/test splits. However, it does not explicitly specify a separate validation split or how it was derived.
Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU or CPU models.
Software Dependencies | No | The paper does not provide a reproducible description of ancillary software with specific version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | For all of the experiments, we use the SGD optimizer with batch size = 128. For ResNet and (λ-)In-ResNet experiments, we train for 160 (300) epochs on the CIFAR-10 (-100) benchmark; the learning rate starts at 0.1 and is divided by 10 at 80 (150) and 120 (225) epochs. We apply weight decay of 1e-4 and momentum of 0.9. The parameters λn of our interpolation models are initialized by randomly sampling from U[0.2, 0.25]. In our experiments, we set α = 2/255 and iteration times M = 20.
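The title and the "Research Type" row above refer to the paper's interpolated (In-)ResNet blocks, whose coefficients λn are initialized from U[0.2, 0.25] (see the "Experiment Setup" row). Below is a minimal PyTorch sketch, assuming the generic interpolated form out = λ·x + F(x), where λ = 1 recovers a standard residual block and λ = 0 a plain non-residual block; this is one plausible reading for illustration, not necessarily the paper's exact formulation, and the layer choices in the branch are placeholders.

```python
import torch
import torch.nn as nn

class InterpolatedBlock(nn.Module):
    """Illustrative block interpolating between residual and non-residual forms.

    With lam = 1 the block reduces to a standard residual block x + F(x);
    with lam = 0 it becomes a plain (non-residual) block F(x). The paper
    initializes its interpolation coefficients from U[0.2, 0.25].
    """

    def __init__(self, channels: int, lam_range=(0.2, 0.25)):
        super().__init__()
        # F(x): a generic two-convolution branch (placeholder architecture).
        self.branch = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        lo, hi = lam_range
        # Learnable interpolation coefficient, sampled from U[lo, hi].
        self.lam = nn.Parameter(torch.empty(1).uniform_(lo, hi))

    def forward(self, x):
        return torch.relu(self.lam * x + self.branch(x))
```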
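For the "Open Datasets" row, the following is a minimal loading sketch, assuming torchvision for the standard CIFAR-10 splits and the usual .npy layout of the CIFAR-10-C release (Hendrycks & Dietterich, 2019); file names such as gaussian_noise.npy / labels.npy and the local paths are assumptions about that release, not details stated in the paper.

```python
import numpy as np
import torch
import torchvision
import torchvision.transforms as T

# Standard CIFAR-10 train/test splits, as used by the paper for training and testing.
transform = T.Compose([T.ToTensor()])
train_set = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

# CIFAR-10-C corruptions (the stochastic-noise groups) are distributed as .npy arrays:
# each corruption file holds 50,000 images (5 severities x the 10,000 test images).
def load_cifar10c(corruption: str, severity: int, root: str = "./CIFAR-10-C"):
    images = np.load(f"{root}/{corruption}.npy")   # shape (50000, 32, 32, 3)
    labels = np.load(f"{root}/labels.npy")          # shape (50000,)
    start, end = (severity - 1) * 10000, severity * 10000
    x = torch.from_numpy(images[start:end]).permute(0, 3, 1, 2).float() / 255.0
    y = torch.from_numpy(labels[start:end]).long()
    return torch.utils.data.TensorDataset(x, y)

noise_test = load_cifar10c("gaussian_noise", severity=3)
```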
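The "Experiment Setup" row quotes the optimizer schedule and attack constants; below is a minimal sketch of how they might be reproduced, assuming PyTorch (the paper names no framework, per the "Software Dependencies" row). The model, data, and the ε budget of the attack are placeholders.

```python
import torch
import torch.nn as nn

def build_optimizer_and_schedule(model, cifar100: bool = False):
    # SGD with lr 0.1, momentum 0.9, weight decay 1e-4 (batch size 128 in the paper).
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
    # lr divided by 10 at epochs 80/120 (CIFAR-10, 160 epochs) or 150/225 (CIFAR-100, 300 epochs).
    milestones = [150, 225] if cifar100 else [80, 120]
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=milestones, gamma=0.1)
    return optimizer, scheduler

def pgd_attack(model, x, y, eps, alpha=2 / 255, steps=20):
    """Iterative gradient-sign attack with step size alpha = 2/255 and M = 20
    iterations, matching the constants quoted above; eps is a placeholder budget."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = nn.functional.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project back into the eps-ball around x and the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv
```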