Group Robust Classification Without Any Group Information
Authors: Christos Tsirigotis, Joao Monteiro, Pau Rodriguez, David Vazquez, Aaron C. Courville
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical analysis on synthetic and real-world tasks provides evidence that our approach overcomes the identified challenges and consistently enhances robust accuracy, attaining performance that is competitive with or exceeds that of state-of-the-art methods which, unlike ours, rely on bias labels for validation. |
| Researcher Affiliation | Collaboration | Christos Tsirigotis (Université de Montréal, Mila, ServiceNow Research); Joao Monteiro (ServiceNow Research); Pau Rodriguez (Apple MLR); David Vazquez (ServiceNow Research); Aaron Courville (Université de Montréal, Mila, CIFAR AI Chair) |
| Pseudocode | Yes | In Algorithm 1, we provide a high-level description of the uLA methodology for training and validation of group-robust models without any bias annotations. (A hedged schematic of this idea is sketched after the table.) |
| Open Source Code | Yes | In addition, a PyTorch [59] implementation is available at the following repository: https://github.com/tsirif/uLA. |
| Open Datasets | Yes | The tasks we consider are all specific instances of the setup above (see Fig. 2 and Section 2). This spans group robustness challenges like colored MNIST [57, CMNIST], corrupted CIFAR10 [37, CCIFAR10] and Waterbirds [62], fair classification benchmarks like CelebA [51], and systematic generalization tasks such as the contributed SMPI3D. |
| Dataset Splits | Yes | We subsample the deployment dataset uniformly across i.i.d. combinations for 180k and 18k mutually exclusive data points for the training and validation sets respectively, for all C ∈ {2, 3, 4, 5}. Finally, the remaining data points are combined and 54k images are sampled uniformly from all combinations to create the unbiased test set. (A sketch of this split construction follows the table.) |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) were mentioned for the experimental setup. |
| Software Dependencies | No | The paper mentions PyTorch and the solo-learn library but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | Learning Rate, Batch Size, Weight Decay, and Temperature for SSL pretraining are listed in Table 7. The search space over Learning Rate, Weight Decay, η, τ, and T_ssl/T_stop for each dataset is described in Table 8. The best hyperparameters, selected by the paper's proposed bias-unsupervised validation score, are shown in Table 9. |
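
The Pseudocode row above refers to Algorithm 1, which covers uLA training and bias-unsupervised validation. The sketch below is only a hedged schematic of that general idea, not a transcription of the paper's Algorithm 1: the frozen `bias_proxy` classifier, the adjustment strength `eta`, and the disagreement-based validation score are illustrative assumptions.

```python
# Hedged schematic only: logit-adjusted training driven by a frozen bias proxy,
# plus a group-label-free model-selection score. Names and the exact validation
# proxy are assumptions for illustration, not the paper's Algorithm 1.
import torch
import torch.nn.functional as F


def logit_adjusted_step(main_model, bias_proxy, x, y, eta=1.0):
    """One training step for the main classifier.

    `bias_proxy` is assumed to be a frozen classifier (e.g. a linear probe on
    self-supervised features) whose predictions capture the spurious shortcut.
    """
    with torch.no_grad():
        bias_log_probs = F.log_softmax(bias_proxy(x), dim=-1)
    logits = main_model(x)
    # Adjust logits so that examples the bias proxy already finds easy
    # contribute less gradient than bias-conflicting ones.
    return F.cross_entropy(logits + eta * bias_log_probs, y)


@torch.no_grad()
def bias_unsupervised_score(main_model, bias_proxy, loader):
    """Illustrative selection score that needs no group labels: accuracy of the
    main model on validation points the bias proxy misclassifies."""
    correct, total = 0, 0
    main_model.eval()
    for x, y in loader:
        conflict = bias_proxy(x).argmax(-1) != y   # likely bias-conflicting points
        if conflict.any():
            pred = main_model(x[conflict]).argmax(-1)
            correct += (pred == y[conflict]).sum().item()
            total += int(conflict.sum())
    return correct / max(total, 1)
```

Under this reading, model selection would pick the checkpoint and hyperparameters that maximize `bias_unsupervised_score` on the validation split, which is the role the paper assigns to its bias-unsupervised validation score.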
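
The Dataset Splits row quotes the SMPI3D split sizes: 180k training and 18k validation points drawn uniformly over the in-distribution factor combinations, and a 54k unbiased test set drawn uniformly over all combinations. Below is a minimal sketch of such a split construction; the function name `make_splits`, the `combo_id`/`iid_mask` inputs, and the per-combination allocation are assumptions for illustration, not the authors' released preprocessing code.

```python
# Minimal sketch, under the assumptions stated above, of building
# 180k/18k/54k splits that are uniform over factor combinations.
import numpy as np


def make_splits(combo_id, iid_mask, n_train=180_000, n_val=18_000,
                n_test=54_000, seed=0):
    """combo_id[i]: index of the factor combination of sample i.
    iid_mask[i]: True if that combination is in-distribution (used for train/val)."""
    rng = np.random.default_rng(seed)
    taken = np.zeros(len(combo_id), dtype=bool)

    def sample_uniform(pool_mask, n):
        # Spread ~n picks evenly over the combinations present in the pool,
        # skipping points already assigned to an earlier split.
        combos = np.unique(combo_id[pool_mask])
        per_combo = n // len(combos)
        picked = []
        for c in combos:
            idx = np.flatnonzero(pool_mask & (combo_id == c) & ~taken)
            picked.append(rng.choice(idx, size=min(per_combo, len(idx)),
                                     replace=False))
        picked = np.concatenate(picked)
        taken[picked] = True
        return picked

    train_idx = sample_uniform(iid_mask, n_train)              # 180k, i.i.d. combos
    val_idx = sample_uniform(iid_mask, n_val)                  # 18k, disjoint from train
    test_idx = sample_uniform(np.ones_like(iid_mask), n_test)  # 54k, all combinations
    return train_idx, val_idx, test_idx
```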