Responsible AI (RAI) Games and Ensembles
Authors: Yash Gupta, Runtian Zhai, Arun Suggala, Pradeep Ravikumar
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate the applicability and competitive performance of our techniques for solving several RAI problems, particularly around subpopulation shift. |
| Researcher Affiliation | Collaboration | Yash Gupta (Carnegie Mellon University, yashgup2@cs.cmu.edu); Runtian Zhai (Carnegie Mellon University, rzhai@cs.cmu.edu); Arun Suggala (Google Research, arunss@google.com); Pradeep Ravikumar (Carnegie Mellon University, pradeepr@cs.cmu.edu) |
| Pseudocode | Yes | Algorithm 1: Game play algorithm for solving Equation (1); Algorithm 2: Greedy algorithms for solving Equation (1) |
| Open Source Code | Yes | The relevant code for this work can be found at https://github.com/yashgupta-7/rai-games |
| Open Datasets | Yes | We use the following datasets: COMPAS [Angwin et al., 2016], CIFAR-10 (original, and with a class-imbalanced split [Jin et al., 2021, Qi et al., 2021]) and CIFAR-100. |
| Dataset Splits | Yes | We track the unregularized objective value from Equation 1 for the validation set, and whenever it increases we double the regularization factor η, which we find can improve generalization. ... For these datasets, we use the standard training and testing splits, reserving 10% of the training samples as validation data. |
| Hardware Specification | No | No specific hardware details such as GPU models, CPU types, or memory configurations were explicitly mentioned for running the experiments. The paper only generally refers to "training & inference compute required" without specification. |
| Software Dependencies | No | The paper mentions using "SGD with momentum = 0.9 for optimization" but does not provide specific version numbers for any software, libraries, or frameworks used in the implementation. |
| Experiment Setup | Yes | We use SGD with momentum = 0.9 for optimization. We first warm up the model with some predefined epochs of ERM (3 for COMPAS and 20 for CIFAR-10/100), followed by a maximum of T = 5 base models trained from the warm-up model with sample weights provided by our algorithms. Each base model is trained for 500 iterations on COMPAS and 2000 iterations on CIFAR-10/100. The mini-batch size is set to 128. ...Our implemented versions incorporate a few alterations: 1. We track the unregularized objective value from Equation 1 for the validation set. If it increases at any round t, we increase the regularization factor η by a fixed multiple (specifically, 2). 2. The same un-regularized objective w.r.t normalized Qt is also used to perform a line search for the step size α. |
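The alterations quoted in the experiment setup (a warm-up model, up to T = 5 boosting-style rounds with sample weights, doubling the regularization factor η when the validation objective rises, and a line search over the step size α) can be sketched as a toy loop. This is a hedged illustration, not the paper's Algorithm 1: the multiplicative-weights update on the sample weights and the grid-based line search are generic stand-ins, and all per-sample losses are simulated rather than produced by trained base models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated per-sample losses: one row per hypothetical base model.
# In the actual pipeline each row would come from training a base model
# on data reweighted by the current sample weights.
T, n = 5, 8
losses = rng.uniform(0.0, 1.0, size=(T, n))

eta = 1.0                        # regularization factor (doubled on val increase)
w = np.full(n, 1.0 / n)         # sample weights Q_t, initialized uniform
ensemble_losses = rng.uniform(0.0, 1.0, size=n)  # stand-in for the warm-up model
prev_val_obj = np.inf

for t in range(T):
    loss_t = losses[t]  # "train" base model t; collect its per-sample losses

    # Line search for the step size alpha: minimize the weighted loss of the
    # mixed ensemble over a simple grid (a stand-in for the paper's search
    # over the unregularized objective w.r.t. the normalized Q_t).
    best_alpha, best_obj = 0.0, np.inf
    for alpha in np.linspace(0.0, 1.0, 11):
        mix = (1 - alpha) * ensemble_losses + alpha * loss_t
        obj = float(w @ mix)
        if obj < best_obj:
            best_alpha, best_obj = alpha, obj
    ensemble_losses = (1 - best_alpha) * ensemble_losses + best_alpha * loss_t

    # Adversary's move: multiplicative-weights update on the sample weights,
    # tempered by eta (larger eta keeps the weights closer to uniform).
    w = w * np.exp(ensemble_losses / eta)
    w = w / w.sum()

    # Track a stand-in validation objective; double eta whenever it increases,
    # mirroring the generalization heuristic described in the setup.
    val_obj = float(w @ ensemble_losses)
    if val_obj > prev_val_obj:
        eta *= 2.0
    prev_val_obj = val_obj
```

The key design point mirrored from the setup description is that η only ever grows (by a fixed multiple of 2), so the adversary's reweighting is progressively damped whenever the validation objective stops improving.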