Evaluating State-of-the-Art Classification Models Against Bayes Optimality

Authors: Ryan Theisen, Huan Wang, Lav R. Varshney, Caiming Xiong, Richard Socher

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We use our approach to conduct a thorough investigation of state-of-the-art classification models, and find that in some but not all cases, these models are capable of obtaining accuracy very near optimal.
Researcher Affiliation | Collaboration | Ryan Theisen (University of California, Berkeley, theisen@berkeley.edu); Huan Wang (Salesforce Research, huan.wang@salesforce.com); Lav R. Varshney (University of Illinois Urbana-Champaign, varshney@illinois.edu); Caiming Xiong (Salesforce Research, cxiong@salesforce.com); Richard Socher (you.com, rsocher@gmail.com)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code can be found at https://github.com/salesforce/DataHardness.
Open Datasets | Yes | We train flow models on a wide variety of standard benchmark datasets: MNIST [19], Extended MNIST (EMNIST) [5], Fashion MNIST [36], CIFAR-10 [17], CIFAR-100 [17], SVHN [23], and Kuzushiji-MNIST [4]. (See the loading sketch below the table.)
Dataset Splits | No | The paper mentions using 60,000 training samples and 10,000 testing samples but does not specify a validation set or its size. (See the split sketch below the table.)
Hardware Specification | Yes | The training and evaluation are done on a workstation with 2 NVIDIA V100 GPUs.
Software Dependencies | No | The paper mentions using a "pytorch implementation [13] of Glow [16]" but does not specify version numbers for PyTorch or other software dependencies.
Experiment Setup | Yes | In all our experiments, affine coupling layers are used, the number of flow steps in each level is K = 16, the number of levels is L = 3, and the number of channels in hidden layers is C = 512. (See the configuration sketch below the table.)
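
For the Open Datasets row: all seven benchmarks are available through torchvision. The paper does not describe its data-loading code, so the snippet below is a minimal sketch under that assumption; the root directory and transform are illustrative choices, not values from the paper.

```python
# A hedged sketch of loading the paper's benchmark datasets via torchvision.
# The loader choices and transforms here are assumptions; the paper does not
# state how its data pipeline is implemented.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()
root = "./data"  # hypothetical download directory

train_sets = {
    "MNIST": datasets.MNIST(root, train=True, download=True, transform=to_tensor),
    "EMNIST": datasets.EMNIST(root, split="balanced", train=True, download=True,
                              transform=to_tensor),  # EMNIST requires a `split` argument
    "FashionMNIST": datasets.FashionMNIST(root, train=True, download=True, transform=to_tensor),
    "CIFAR10": datasets.CIFAR10(root, train=True, download=True, transform=to_tensor),
    "CIFAR100": datasets.CIFAR100(root, train=True, download=True, transform=to_tensor),
    "SVHN": datasets.SVHN(root, split="train", download=True,
                          transform=to_tensor),  # SVHN uses `split`, not `train`
    "KMNIST": datasets.KMNIST(root, train=True, download=True, transform=to_tensor),
}
```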
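
For the Dataset Splits row: the paper reports only train/test sizes and no validation set. If one were needed for model selection, it could be held out from the 60,000 training samples; the 54,000/6,000 ratio and the seed below are assumptions, not values from the paper.

```python
# A hedged sketch of carving a validation split the paper does not define.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

train_set = datasets.MNIST("./data", train=True, download=True,
                           transform=transforms.ToTensor())

# 54,000 / 6,000 is an assumed 90/10 split of the 60,000 training samples.
generator = torch.Generator().manual_seed(0)  # hypothetical seed for reproducibility
train_subset, val_subset = random_split(train_set, [54_000, 6_000], generator=generator)
```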
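
For the Experiment Setup row: the reported hyperparameters map onto a Glow architecture as follows. This is a minimal sketch, not the pytorch Glow implementation [13] the paper actually uses; the class names and the coupling-network layout are assumptions, illustrating affine coupling with K = 16 flow steps per level, L = 3 levels, and C = 512 hidden channels.

```python
# A hypothetical configuration and affine coupling step matching the reported
# K = 16, L = 3, C = 512 setup. Names and layer layout are illustrative, not
# the cited implementation's API.
from dataclasses import dataclass

import torch
import torch.nn as nn


@dataclass
class GlowConfig:
    n_flow: int = 16     # K: flow steps per level
    n_levels: int = 3    # L: number of multi-scale levels
    n_hidden: int = 512  # C: channels in the coupling network's hidden layers


class AffineCoupling(nn.Module):
    """Affine coupling: split channels, predict scale/shift for one half."""

    def __init__(self, in_channels: int, hidden_channels: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels // 2, hidden_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden_channels, hidden_channels, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden_channels, in_channels, 3, padding=1),
        )

    def forward(self, x):
        xa, xb = x.chunk(2, dim=1)
        log_s, t = self.net(xa).chunk(2, dim=1)
        s = torch.sigmoid(log_s + 2.0)  # common stabilization trick in Glow code
        yb = (xb + t) * s
        log_det = s.log().flatten(1).sum(dim=1)  # log-likelihood contribution
        return torch.cat([xa, yb], dim=1), log_det
```

A full Glow stacks K such steps (each preceded by actnorm and an invertible 1x1 convolution, omitted here) at each of the L levels, with squeeze and split operations between levels.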