On the Importance of Feature Separability in Predicting Out-Of-Distribution Error

Authors: Renchunzi Xie, Hongxin Wei, Lei Feng, Yuzhou Cao, Bo An

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We investigate this problem from the perspective of feature separability empirically and theoretically. Specifically, we propose a dataset-level score based upon feature dispersion to estimate the test accuracy under distribution shift. Our analysis shows that inter-class dispersion is strongly correlated with the model accuracy, while intra-class compactness does not reflect the generalization performance on OOD data. Extensive experiments demonstrate the superiority of our method in both prediction performance and computational efficiency."
Researcher Affiliation | Academia | "1 School of Computer Science and Engineering, Nanyang Technological University, Singapore; 2 Department of Statistics and Data Science, Southern University of Science and Technology, China. {xier0002,yuzhou002}@e.ntu.edu.sg, weihx@sustech.edu.cn, lfengqaq@gmail.com, boan@ntu.edu.sg"
Pseudocode | Yes | "Algorithm 1: OOD Error Estimation via Dispersion Score" (a hedged code sketch of such a score follows the table)
Open Source Code | No | The paper neither states that source code for the method will be released nor links to a code repository.
Open Datasets | Yes | "Train datasets. During training, we train models on the CIFAR-10, CIFAR-100 [Krizhevsky et al., 2009] and Tiny ImageNet [Le and Yang, 2015] datasets."
Dataset Splits | No | The paper specifies training and test datasets, and it mentions using "subsets of CIFAR-10C" for a sample-size analysis, but it describes no dedicated validation split for model training or hyperparameter tuning, so the splits cannot be reproduced.
Hardware Specification | No | The paper does not specify the hardware used to run the experiments (e.g., GPU models, CPU types, or memory).
Software Dependencies | No | The paper mentions using SGD and cosine learning rate decay, but it names no software dependencies such as the programming language (e.g., Python) or deep learning framework (e.g., PyTorch, TensorFlow), let alone version numbers.
Experiment Setup | Yes | "Training details. During the training process, we train ResNet18, ResNet50 [He et al., 2016] and WRN-50-2 [Zagoruyko and Komodakis, 2016] on CIFAR-10, CIFAR-100 [Krizhevsky et al., 2009] and Tiny ImageNet [Le and Yang, 2015] with 20, 50 and 50 epochs, respectively. We use SGD with a learning rate of 10^-3, cosine learning rate decay [Loshchilov and Hutter, 2016], a momentum of 0.9 and a batch size of 128 to train the model." (translated into a minimal PyTorch sketch after the table)
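
Since this page confirms only the algorithm's name ("Algorithm 1: OOD Error Estimation via Dispersion Score"), the following is a minimal sketch of what a dispersion-style score can look like, not the paper's exact Algorithm 1. It assumes penultimate-layer features, pseudo-labels taken as the model's argmax predictions on the unlabeled OOD set, and inter-class dispersion measured as the size-weighted squared distance of class centroids from the global feature mean; `model.backbone` and `model.fc` are hypothetical attribute names.

```python
import torch

@torch.no_grad()
def dispersion_score(model, loader, device="cuda"):
    """Dataset-level inter-class dispersion of penultimate-layer features.

    Hedged sketch: illustrates the idea of a dispersion score, not the
    paper's exact Algorithm 1.
    """
    model.eval()
    feats, preds = [], []
    for x, *_ in loader:               # labels ignored: the OOD set is unlabeled
        x = x.to(device)
        z = model.backbone(x)          # hypothetical feature extractor
        logits = model.fc(z)           # hypothetical classifier head
        feats.append(z.cpu())
        preds.append(logits.argmax(dim=1).cpu())
    feats = torch.cat(feats)           # shape (N, d)
    preds = torch.cat(preds)           # shape (N,)

    global_mean = feats.mean(dim=0)
    between = 0.0
    for c in preds.unique():
        class_feats = feats[preds == c]
        centroid = class_feats.mean(dim=0)
        # Size-weighted squared distance of each pseudo-class centroid
        # from the global mean: a between-class scatter term.
        between += len(class_feats) * (centroid - global_mean).pow(2).sum().item()
    return between / len(feats)        # scalar, dataset-level score
```

In the paper's framing, higher inter-class dispersion on the shifted test set indicates better feature separability and therefore higher expected accuracy; turning this scalar into a concrete error estimate (e.g., by regressing accuracy on the score across a collection of corrupted sets) is a further step this sketch deliberately omits.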
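
The quoted training details map directly onto a standard training loop. Below is a minimal sketch assuming PyTorch and the ResNet18/CIFAR-10 pairing; the framework is not named in the paper, and torchvision's `resnet18(num_classes=10)` stands in for the authors' model. The hyperparameter values are the ones quoted above.

```python
import torch
import torchvision
import torchvision.transforms as T
from torchvision.models import resnet18

# Values quoted in the paper's training details; PyTorch itself is an assumption.
EPOCHS, BATCH_SIZE, LR, MOMENTUM = 20, 128, 1e-3, 0.9

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=T.ToTensor())
loader = torch.utils.data.DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)

model = resnet18(num_classes=10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=LR, momentum=MOMENTUM)
# Cosine learning rate decay [Loshchilov and Hutter, 2016]
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(EPOCHS):
    model.train()
    for x, y in loader:
        x, y = x.cuda(), y.cuda()
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()
    scheduler.step()  # anneal the learning rate once per epoch
```

With `T_max=EPOCHS` and one `scheduler.step()` per epoch, the learning rate follows a single cosine arc from 10^-3 toward zero over training, matching the quoted schedule; for CIFAR-100 or Tiny ImageNet the paper uses 50 epochs instead of 20.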