Measuring Generalization with Optimal Transport

Authors: Ching-Yao Chuang, Youssef Mroueh, Kristjan Greenewald, Antonio Torralba, Stefanie Jegelka

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our bounds robustly predict the generalization error, given training data and network parameters, on large scale datasets. Theoretically, we demonstrate that the concentration and separation of features play crucial roles in generalization, supporting empirical results in the literature. The code is available at https://github.com/chingyaoc/kV-Margin.
Researcher Affiliation | Collaboration | MIT CSAIL, IBM Research AI, MIT-IBM Watson AI Lab
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/chingyaoc/kV-Margin.
Open Datasets | Yes | We evaluate our margin bounds on the Predicting Generalization in Deep Learning (PGDL) dataset [26]. ... on CIFAR-10 [30] ... SVHN [41], and MNIST [33]
Dataset Splits | No | The paper mentions using 'random subsets sampled from the training data' for estimation, but it does not provide specific train/validation/test split percentages or sample counts needed to reproduce the data partitioning, nor does it refer to pre-defined splits for this purpose.
Hardware Specification | Yes | All of our experiments are run on 6 TITAN X (Pascal) GPUs.
Software Dependencies | No | The paper mentions the 'POT library [18]' and cites general tools such as TensorFlow, Keras, PyTorch, and scikit-learn in the references, but it does not give version numbers for any of the software dependencies used in the experiments (an illustrative POT usage sketch is given after this table).
Experiment Setup | Yes | To ease the computational cost, all margins and k-variances are estimated with random subsets of size min(200 × #classes, data size) sampled from the training data. The average results over 4 subsets are shown in Table 1. ... To produce a scalar measurement, we use the median to summarize the margin distribution, which can be interpreted as finding the margin γ that makes the margin loss 0.5. (The second sketch after this table illustrates this subset-and-median procedure.)
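
The paper's k-variance is built on the Wasserstein distance between empirical measures of two independent samples. The following minimal sketch, which is not the authors' implementation, only shows the core POT call for computing a Wasserstein-1 distance between two feature subsets; the `wasserstein1` helper, the toy data, and the choice of ground metric are illustrative assumptions, and the exact scaling/averaging that turns this into the paper's k-variance is omitted.

```python
# Illustrative sketch (not the authors' code): Wasserstein-1 distance between
# the empirical measures of two independent feature subsets, computed with POT.
import numpy as np
import ot  # POT: Python Optimal Transport


def wasserstein1(X, Y):
    """W1 between uniform empirical measures supported on the rows of X and Y."""
    M = ot.dist(X, Y, metric='euclidean')  # pairwise ground-cost matrix
    a = ot.unif(X.shape[0])                # uniform weights on X
    b = ot.unif(Y.shape[0])                # uniform weights on Y
    return ot.emd2(a, b, M)                # exact optimal transport cost


# Toy usage: two disjoint subsets of k feature vectors from the same pool.
rng = np.random.default_rng(0)
features = rng.normal(size=(2000, 64))     # placeholder feature embeddings
k = 200
idx = rng.choice(len(features), size=2 * k, replace=False)
print(wasserstein1(features[idx[:k]], features[idx[k:]]))
```
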
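The experiment setup row above describes estimating margins on random subsets of size min(200 × #classes, data size) and summarizing each margin distribution by its median, averaged over 4 subsets. The sketch below illustrates that procedure under stated assumptions: the helper names (`median_margin`, `estimate_margin`) are hypothetical, and the margin shown is the plain output margin f(x)_y − max_{y'≠y} f(x)_{y'}, without the k-variance-based normalization the paper applies.

```python
# Illustrative sketch (hypothetical helpers): subset-based median-margin estimation.
import numpy as np


def median_margin(logits, labels):
    """Median of f(x)_y - max_{y' != y} f(x)_{y'} over the given examples."""
    n = logits.shape[0]
    true_scores = logits[np.arange(n), labels]
    masked = logits.copy()
    masked[np.arange(n), labels] = -np.inf   # exclude the true-class score
    runner_up = masked.max(axis=1)
    return np.median(true_scores - runner_up)


def estimate_margin(logits, labels, num_classes, num_subsets=4, seed=0):
    """Average median margin over random subsets of size min(200 * #classes, n)."""
    rng = np.random.default_rng(seed)
    n = logits.shape[0]
    subset_size = min(200 * num_classes, n)
    estimates = []
    for _ in range(num_subsets):
        idx = rng.choice(n, size=subset_size, replace=False)
        estimates.append(median_margin(logits[idx], labels[idx]))
    return float(np.mean(estimates))
```
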