Online Adaptation to Label Distribution Shift

Authors: Ruihan Wu, Chuan Guo, Yi Su, Kilian Q. Weinberger

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We empirically verify our findings under both simulated and real-world label distribution shifts and show that OGD is particularly effective and robust to a variety of challenging label shift scenarios. To validate our theoretical findings, we evaluate our adaptation algorithms on CIFAR-10 [20] under simulated online label shifts, as well as on the arXiv dataset for paper categorization, which exhibits real-world label shift across years of submission history.
Researcher Affiliation Collaboration Ruihan Wu, Cornell University, rw565@cornell.edu; Chuan Guo, Facebook AI Research, chuanguo@fb.com; Yi Su and Kilian Q. Weinberger, Cornell University, {ys756, kqw4}@cornell.edu
Pseudocode Yes Framework 1 The general framework for online label shift adaptation. Algorithm 3 Gradient estimator for ∇_p ℓ(p; q). (A minimal Python sketch of the OGD instantiation of this framework appears below the table.)
Open Source Code No The paper does not provide any explicit statements or links to the source code for the methodology described. While it references the ArXiv dataset from Kaggle, this is a dataset, not code for their algorithms.
Open Datasets Yes To validate our theoretical findings, we evaluate our adaptation algorithms on CIFAR-10 [20] under simulated online label shifts, as well as on the arXiv dataset (https://www.kaggle.com/Cornell-University/arxiv) for paper categorization, which exhibits real-world label shift across years of submission history. [20] Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009.
Dataset Splits Yes We divide the original training set into train and validation by a ratio of 3:2. The training set is used to train the base model f0, and the validation set D0 is used for both temperature scaling calibration [12] and to estimate the confusion matrix. (A sketch of this split and the confusion-matrix estimate appears below the table.)
Hardware Specification No The paper does not specify any hardware (e.g., GPU, CPU models, memory) used for running the experiments. It mentions using a ResNet-18 classifier, which is a model architecture, not a hardware specification.
Software Dependencies No The paper implicitly relies on machine learning libraries (e.g., for the ResNet-18 classifier and the multinomial regressor) but does not name specific software packages or the version numbers required for reproduction.
Experiment Setup Yes For OGD, we use the learning rate η = (1/L)√(2/T) suggested by Theorem 2, where L is estimated by taking the maximum over {e_y : y ∈ Y} for 100 vectors p uniformly sampled from the probability simplex ∆^{K−1}. We use three different window lengths w = 100, 1000, 10000 in our experiments. In our experiments, q(1) and q(2) are defined to concentrate on the dog and cat classes, respectively. That is, q(1)[dog] = 0.55 and q(1)[y] = 0.05 for all other classes y, and similarly for q(2). The end time T is set to 100,000 for all simulation experiments. All results are repeated using three different random seeds that randomize the samples drawn at each time step t. (A hedged sketch of this setup appears below the table.)
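
To make the pseudocode row concrete, here is a minimal Python sketch of how Framework 1's OGD instantiation might look. The function names (`project_simplex`, `ogd_label_shift`) and the generic gradient stream are our assumptions, not the paper's code; the step size follows the η = (1/L)√(2/T) form quoted in the experiment-setup row.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * idx > css - 1.0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def ogd_label_shift(grad_stream, K, T, L):
    """Online gradient descent over the estimated label marginal p_t.

    grad_stream yields an estimate g_t of the gradient of ell(p; q_t) at the
    current iterate, in the spirit of the paper's Algorithm 3.
    """
    eta = (1.0 / L) * np.sqrt(2.0 / T)  # step size suggested by Theorem 2
    p = np.full(K, 1.0 / K)             # start at the uniform label distribution
    for g in grad_stream:
        p = project_simplex(p - eta * g)  # projected gradient step, stay on the simplex
        yield p                           # p_t is used to re-weight the base classifier
```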
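
The 3:2 train/validation split and the confusion-matrix estimate on D0 from the dataset-splits row could look like the sketch below. The function names and the column-normalization convention are our assumptions; the paper does not spell out these implementation details.

```python
import numpy as np

def split_train_val(n, ratio=(3, 2), seed=0):
    """Shuffle indices 0..n-1 and split them 3:2 into train and validation."""
    idx = np.random.default_rng(seed).permutation(n)
    cut = n * ratio[0] // sum(ratio)
    return idx[:cut], idx[cut:]

def estimate_confusion_matrix(probs, labels, K):
    """C[i, j] ~= P(f0 predicts class i | true class j), estimated on D0.

    probs: (n, K) array of calibrated class probabilities from f0.
    labels: (n,) array of true labels on the validation set.
    """
    preds = probs.argmax(axis=1)
    C = np.zeros((K, K))
    np.add.at(C, (preds, labels), 1.0)  # count (prediction, label) pairs
    # Normalize each column into a conditional distribution over predictions.
    return C / np.maximum(C.sum(axis=0, keepdims=True), 1.0)
```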
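
Finally, the simulated-shift setup in the experiment row admits a short sketch. The CIFAR-10 class indices, the squared-distance stand-in for ℓ(p; q), and the pairing of e_y with the sampled p are assumptions made only so the snippet runs; the paper's actual loss and gradient estimator differ.

```python
import numpy as np

K, T = 10, 100_000
DOG, CAT = 5, 3  # torchvision's CIFAR-10 ordering; an assumption, not from the paper

def peaked(y, mass=0.55, K=10):
    """Distribution with q[y] = 0.55 and 0.05 on each remaining class."""
    q = np.full(K, (1.0 - mass) / (K - 1))
    q[y] = mass
    return q

q1, q2 = peaked(DOG), peaked(CAT)  # q(1) and q(2) from the quoted setup

# Stand-in loss ell(p; q) = ||p - q||^2 with gradient 2 (p - q), used only to
# make the Lipschitz-constant estimate below executable.
def grad_loss(p, q):
    return 2.0 * (p - q)

# Estimate L as the max gradient norm over the vertices {e_y : y in Y}, for
# 100 vectors p drawn uniformly from the simplex (Dirichlet(1, ..., 1)); we
# take one plausible reading of how the quote pairs e_y and p.
rng = np.random.default_rng(0)
ps = rng.dirichlet(np.ones(K), size=100)
L = max(np.linalg.norm(grad_loss(e_y, p)) for p in ps for e_y in np.eye(K))
eta = (1.0 / L) * np.sqrt(2.0 / T)  # learning rate from Theorem 2
```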