Online Label Shift: Optimal Dynamic Regret meets Practical Algorithms

Authors: Dheeraj Baby, Saurabh Garg, Tzu-Ching Yen, Sivaraman Balakrishnan, Zachary Lipton, Yu-Xiang Wang

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments across numerous simulated and real-world online label shift scenarios demonstrate the superior performance of our proposed approaches, often achieving 1-3% improvement in accuracy while being sample and computationally efficient.
Researcher Affiliation | Academia | Dheeraj Baby, UC Santa Barbara, dheeraj@ucsb.edu; Saurabh Garg, Carnegie Mellon University, sgarg2@andrew.cmu.edu; Tzu-Ching Yen, Carnegie Mellon University, tzuchiny@andrew.cmu.edu; Sivaraman Balakrishnan, Carnegie Mellon University, sbalakri@andrew.cmu.edu; Zachary C. Lipton, Carnegie Mellon University, zlipton@andrew.cmu.edu; Yu-Xiang Wang, UC Santa Barbara, yuxiangw@cs.ucsb.edu
Pseudocode | Yes | Algorithm 2 (Regress And Reweight) to handle UOLS; Algorithm 4 (Train By Weights) to handle SOLS; Algorithm 5 (LPA), a black-box reduction to produce a low-switching online regression algorithm. (A hedged sketch of the reweighting step these algorithms build on is given below the table.)
Open Source Code | Yes | Code is publicly available at https://github.com/Anon-djiwh/OnlineLabelShift.
Open Datasets | Yes | Setup: Following the dataset setup of Bai et al. [8], we conducted experiments on synthetic and common benchmark data such as MNIST [50], CIFAR-10 [49], Fashion [77], EuroSAT [40], Arxiv [15], and SHL [31, 71].
Dataset Splits | Yes | For all the datasets above, the initial offline data are further split 80:20 into training and holdout data, where the former is used for offline training of the base model and the latter for computing the confusion matrix and for retraining (e.g., updating the linear head parameters with UOGD or updating the softmax prediction with our FLT-FTL) during online learning. We observe N = 50 examples at every iteration and split the observed labeled examples 80:20 into training and validation. (See the split sketch below the table.)
Hardware Specification | No | The paper details the neural network architectures used (e.g., MLP, ResNet-18, DistilBERT) and their training configurations, but it does not specify any hardware components such as CPU or GPU models, or memory.
Software Dependencies | No | The paper describes various model architectures and training parameters, but it does not specify any software dependencies with version numbers (e.g., PyTorch, TensorFlow, or specific library versions).
Experiment Setup | Yes | It is trained for a single epoch with learning rate 0.1, momentum 0.9, batch size 200, and l2 regularization 1e-4. It is finetuned for 70 epochs with learning rate 0.1, momentum 0.9, batch size 200, and l2 regularization 1e-4. The learning rate decayed by 90% at the 25th and 40th epochs. (A hedged optimizer/scheduler sketch matching these settings is given below the table.)
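
As context for the Pseudocode row: the UOLS algorithms build on confusion-matrix (BBSE-style) reweighting of the base model's predictions. The following is a minimal sketch of that reweighting step only, not of the paper's Algorithms 2, 4, or 5 themselves; the function names and the least-squares estimator are illustrative assumptions rather than code from the release.

```python
import numpy as np

def estimate_label_marginal(confusion, pred_hist):
    """BBSE-style estimate of the current label marginal q_t.

    confusion[i, j] ~ P(predict class i | true class j), computed on the
    labeled holdout split; pred_hist is the average hard-prediction
    histogram over the unlabeled batch observed at the current round.
    """
    q, *_ = np.linalg.lstsq(confusion, pred_hist, rcond=None)
    q = np.clip(q, 0.0, None)                 # drop negative entries
    return q / max(q.sum(), 1e-12)            # renormalize to a probability vector

def reweight_probs(probs, q_t, p_train):
    """Adjust base-model softmax outputs for the estimated label shift."""
    w = q_t / np.clip(p_train, 1e-12, None)   # per-class importance weights
    adjusted = probs * w
    return adjusted / adjusted.sum(axis=1, keepdims=True)
```

As we understand it, Algorithm 2 (Regress And Reweight) feeds such per-round estimates into an online regression oracle rather than reweighting with a single-round estimate directly.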
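
For the Dataset Splits row, a minimal sketch of the two 80:20 splits described above; the function names and NumPy-based slicing are assumptions, not code from the released repository.

```python
import numpy as np

def make_offline_splits(features, labels, seed=0):
    """Split the initial offline data 80:20 into base-model training data and
    a holdout used for the confusion matrix and online retraining."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(labels))
    cut = int(0.8 * len(labels))
    train, holdout = idx[:cut], idx[cut:]
    return (features[train], labels[train]), (features[holdout], labels[holdout])

def split_online_batch(batch_x, batch_y):
    """Split the N = 50 labeled examples observed at a round 80:20 into
    training and validation (40/10)."""
    cut = int(0.8 * len(batch_y))
    return (batch_x[:cut], batch_y[:cut]), (batch_x[cut:], batch_y[cut:])
```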
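
For the Experiment Setup row, the reported finetuning hyperparameters map naturally onto a standard SGD-plus-scheduler configuration. A hedged PyTorch sketch, assuming "decayed by 90%" means multiplying the learning rate by 0.1 at epochs 25 and 40 (the batch size of 200 would be set in the data loader):

```python
import torch

def build_finetune_optimizer(model):
    """SGD matching the reported settings: lr 0.1, momentum 0.9, l2 regularization
    1e-4, with the learning rate cut by 90% at the 25th and 40th of 70 epochs."""
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                                momentum=0.9, weight_decay=1e-4)
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[25, 40], gamma=0.1)
    return optimizer, scheduler
```

The single-epoch configuration quoted in the same row appears to use the same SGD settings without the milestone decay.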