Certifying Out-of-Domain Generalization for Blackbox Functions

Authors: Maurice G Weber, Linyi Li, Boxin Wang, Zhikuan Zhao, Bo Li, Ce Zhang

ICML 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally validate our certification method on a number of datasets, ranging from ImageNet, where we provide the first non-vacuous certified out-of-domain generalization, to smaller classification tasks where we are able to compare with the state-of-the-art and show that our method performs considerably better. |
| Researcher Affiliation | Academia | ¹Department of Computer Science, ETH Zurich; ²UIUC, USA. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks labeled "Algorithm" or "Pseudocode". |
| Open Source Code | Yes | Our code is publicly available at https://github.com/DS3Lab/certified-generalization. |
| Open Datasets | Yes | ImageNet-1k (Russakovsky et al., 2015), containing objects of 1,000 different classes, and CIFAR-10 (Krizhevsky, 2009), which contains natural images of 10 different classes. (See the dataset-loading sketch below the table.) |
| Dataset Splits | No | The paper gives test-set sizes for Yelp and SNLI and describes a mixture distribution for Colored MNIST, but it does not provide complete training/validation/test split information (percentages or exact counts for every dataset) needed to fully reproduce the data partitioning across all experiments. |
| Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., GPU models, CPU types, memory) used to run its experiments. |
| Software Dependencies | No | The paper mentions specific models such as EfficientNet-B7 and BERT, but it does not provide version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | No | The paper mentions the models used for classification, but it does not provide specific experiment-setup details such as concrete hyperparameter values (e.g., learning rate, batch size, epochs, optimizer settings) for training. |
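Both datasets in the Open Datasets row are standard public benchmarks, so a reproduction attempt can start from stock loaders. The sketch below is a minimal illustration and is not taken from the paper or its repository; it assumes torchvision is installed, and the root paths are placeholders.

```python
# Hedged sketch: obtaining the public datasets named in the table above.
# Not from the paper's repository; the root paths are placeholder assumptions.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# CIFAR-10 (Krizhevsky, 2009): 10 classes of natural images,
# downloaded automatically by torchvision.
cifar10_test = datasets.CIFAR10(
    root="./data", train=False, transform=to_tensor, download=True
)

# ImageNet-1k (Russakovsky et al., 2015): torchvision can parse it, but the
# ILSVRC2012 archives must first be downloaded manually from image-net.org.
# imagenet_val = datasets.ImageNet(root="./imagenet", split="val",
#                                  transform=to_tensor)

print(f"CIFAR-10 test examples: {len(cifar10_test)}")  # 10,000 images
```

The other datasets the table mentions (Colored MNIST, Yelp, SNLI) are also public, but, as the Dataset Splits row notes, their exact partitioning would have to be recovered from the paper or the linked repository.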