Sequential Covariate Shift Detection Using Classifier Two-Sample Tests

Authors: Sooyong Jang, Sangdon Park, Insup Lee, Osbert Bastani

ICML 2022

Reproducibility assessment (variable, result, and the LLM's supporting response):
Research Type: Experimental
"We evaluate our approach on both synthetic and natural shifts on ImageNet (Russakovsky et al., 2015), and natural shifts on two datasets from the WILDS datasets (Koh et al., 2021). We demonstrate that our approach achieves better sample efficiency than baseline algorithms; furthermore, it satisfies the desired false positive rate. Thus, our algorithm is an effective strategy for sequential covariate shift detection."

Researcher Affiliation: Academia
"PRECISE Center, University of Pennsylvania, USA. School of Cybersecurity and Privacy, Georgia Institute of Technology, USA. Correspondence to: Sooyong Jang <sooyong@seas.upenn.edu>."
Pseudocode: Yes
"Algorithm 1: Sequential Calibrated Classifier Two-Sample Test"
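The paper's Algorithm 1 is a sequential, calibrated variant of the classifier two-sample test. For reference only, here is a minimal sketch of the basic (non-sequential, uncalibrated) classifier two-sample test idea that such an algorithm builds on; the function name, the logistic-regression classifier, and the 50/50 split are illustrative assumptions, not the paper's implementation:

```python
import numpy as np
from scipy.stats import binomtest
from sklearn.linear_model import LogisticRegression

def classifier_two_sample_test(X_source, X_target, alpha=0.05, seed=0):
    """Sketch of a basic classifier two-sample test (not the paper's Algorithm 1).

    Train a source-vs-target classifier on half the pooled data, then test
    whether its held-out accuracy is significantly above chance (0.5).
    """
    X = np.vstack([X_source, X_target])
    y = np.concatenate([np.zeros(len(X_source)), np.ones(len(X_target))])

    # Shuffle and split the pooled data into classifier train/test halves.
    idx = np.random.default_rng(seed).permutation(len(X))
    train, test = idx[: len(X) // 2], idx[len(X) // 2 :]

    clf = LogisticRegression(max_iter=1000).fit(X[train], y[train])
    correct = int((clf.predict(X[test]) == y[test]).sum())

    # Under H0 (same distribution), the held-out accuracy is Binomial(n, 0.5).
    p_value = binomtest(correct, len(test), p=0.5, alternative="greater").pvalue
    return p_value < alpha, p_value
```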
Open Source Code: Yes
"We have released our code for these experiments." (https://github.com/sooyongj/sequential_covariate_shift_detection)

Open Datasets: Yes
"We evaluate our approach on both synthetic and natural shifts on ImageNet (Russakovsky et al., 2015), and natural shifts on two datasets from the WILDS datasets (Koh et al., 2021)."
Dataset Splits: No
The paper describes setting up source and target datasets for the covariate shift detection problem (e.g., "split the original ImageNet validation set into equal sized source and target datasets"), but it does not provide traditional train/validation/test splits for its own model's training or hyperparameter tuning.
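For concreteness, a minimal sketch of the kind of equal source/target split the quote describes; the validation-set size of 50,000 and the fixed seed are assumptions for illustration, not details taken from the paper:

```python
import numpy as np

# Shuffle the ImageNet validation indices and split them into two
# equal halves: one serves as the source set, the other as the target set.
n_val = 50_000  # assumed size of the ImageNet validation set
perm = np.random.default_rng(0).permutation(n_val)
source_idx, target_idx = perm[: n_val // 2], perm[n_val // 2 :]
```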
Hardware Specification: No
The paper does not mention any specific hardware used to run the experiments, such as CPU or GPU models.

Software Dependencies: No
The paper mentions software components such as an "SGD optimizer", a "neural network", a "ResNet152 model", a "ResNet50 model", and a "CodeGPT model", but it does not specify version numbers for these or any other software dependencies.
Experiment Setup: Yes
"We use a fully-connected neural network with a single hidden layer (with 128 hidden units) and with the ReLU activation functions as the source-target classifier ĝ_t. We use a binary cross-entropy loss for training in conjunction with an SGD optimizer with a learning rate of 0.01 (for natural shift experiments) and 0.001 (for synthetic shift experiments)."
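The quoted setup maps directly onto a small model definition. A minimal PyTorch sketch, where the input dimensionality and the per-batch training step are assumptions not given in the quote (swap the learning rate to 0.001 for the synthetic-shift configuration):

```python
import torch
import torch.nn as nn

input_dim = 2048  # assumed feature dimensionality; not specified in the quote

# Source-target classifier as described: one hidden layer of 128 ReLU units,
# ending in a single logit for the binary source-vs-target decision.
classifier = nn.Sequential(
    nn.Linear(input_dim, 128),
    nn.ReLU(),
    nn.Linear(128, 1),
)
criterion = nn.BCEWithLogitsLoss()  # binary cross-entropy on the logit
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.01)  # natural shifts

def train_step(x, y):
    """One SGD step on a batch of features x with 0/1 source/target labels y."""
    optimizer.zero_grad()
    loss = criterion(classifier(x).squeeze(1), y.float())
    loss.backward()
    optimizer.step()
    return loss.item()
```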