reproducibilityindex.ai

AHA: Human-Assisted Out-of-Distribution Generalization and Detection

Authors: Haoyue Bai, Jifan Zhang, Robert Nowak

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments validate the efficacy of our framework. We observed that with only a few hundred human annotations, our method significantly outperforms existing state-of-the-art methods that do not involve human assistance, in both OOD generalization and OOD detection. ... Extensive experiments and ablation studies demonstrate the effectiveness of our human-assisted method.
Researcher Affiliation	Academia	Haoyue Bai, Jifan Zhang, Robert Nowak University of Wisconsin-Madison {baihaoyue, jifan}@cs.wisc.edu, rdnowak@wisc.edu
Pseudocode	Yes	Algorithm 1 AHA: Adaptive Human Assisted labeling for OOD learning
Open Source Code	Yes	Code is publicly available at https://github.com/HaoyueBaiZJU/aha.
Open Datasets	Yes	Following the benchmark in literature of [6], we use the CIFAR10 [60] as P_in and CIFAR-10-C [45] with Gaussian additive noise as the P_out^covariate for our main experiments. ... For semantic OOD data (P_out^semantic), we utilize natural image datasets including SVHN [72], Textures [19], Places365 [113], LSUN-Crop [103], and LSUN-Resize [103]. Additionally, we provide results on the PACS dataset [64] from DomainBed.
Dataset Splits	Yes	To compile the wild data, we divide the ID set into 50% labeled as ID (in-distribution) and 50% unlabeled. We then mix unlabeled ID, covariate OOD, and semantic OOD data for our experiments. ... Within the training/validation split, 70% of the data is used for training, and the remaining 30% is used for validation.
Hardware Specification	Yes	Experiments are performed using Tesla V100.
Software Dependencies	Yes	Our framework was implemented using PyTorch 2.0.1.
Experiment Setup	Yes	For CIFAR experiments, we adopt a WideResNet [104] with 40 layers and a widen factor of 2. For optimization, we use stochastic gradient descent with Nesterov momentum [27], including a weight decay of 0.0005 and a momentum of 0.09. The batch size is set to 128, and the initial learning rate is 0.1, with cosine learning rate decay. The model is initialized with a pre-trained network on CIFAR-10 and trained for 100 epochs using our objective from Equation 4, with α = 10. We set a default labeling budget k of 1000 for the benchmarking results and provide an analysis of different labeling budgets 100, 500, 1000, 2000 in Section 5.3.
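
The Dataset Splits and Experiment Setup rows above can be read as a small configuration plus two pieces of arithmetic. The following is a minimal sketch in plain Python, not the authors' code. The names `CONFIG`, `cosine_lr`, and `split_items` are hypothetical. The cosine form (per-epoch decay to zero, no warmup) is an assumption, since the quoted text only says "cosine learning rate decay". Applying the 70/30 split to the unlabeled ID half is also an assumption made for illustration, because the excerpt does not say which pool that split applies to. The momentum value is copied exactly as reported.

```python
import math
import random

# Hyperparameters as quoted in the Experiment Setup row.
# The momentum value is copied as reported, not corrected or interpreted.
CONFIG = {
    "batch_size": 128,
    "initial_lr": 0.1,
    "weight_decay": 0.0005,
    "momentum": 0.09,
    "epochs": 100,
    "alpha": 10,
    "label_budget": 1000,
}


def cosine_lr(epoch, total_epochs, initial_lr):
    """Cosine decay from initial_lr at epoch 0 to 0 at total_epochs.

    Assumed form: the source does not specify warmup, a minimum
    learning rate, or per-step versus per-epoch updates.
    """
    return 0.5 * initial_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))


def split_items(items, first_fraction, seed=0):
    """Shuffle a copy of items and split it into two disjoint lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * first_fraction))
    return shuffled[:cut], shuffled[cut:]


# 50,000 is the size of the standard CIFAR-10 training set.
id_indices = range(50_000)

# 50% labeled ID, 50% unlabeled ID (the unlabeled half goes into the wild mix).
labeled_id, unlabeled_id = split_items(id_indices, 0.5)

# Assumption for illustration only: the 70/30 train/validation split
# is applied to the unlabeled ID half.
train_ids, val_ids = split_items(unlabeled_id, 0.7, seed=1)

# Learning rate at the start of each of the 100 epochs.
schedule = [
    cosine_lr(epoch, CONFIG["epochs"], CONFIG["initial_lr"])
    for epoch in range(CONFIG["epochs"])
]
```

Under these assumptions the ID data splits into 25,000 labeled and 25,000 unlabeled examples, and the learning rate falls from 0.1 at epoch 0 to about 0.05 at epoch 50. In PyTorch, the same schedule would typically be expressed with `torch.optim.SGD(..., nesterov=True)` and `torch.optim.lr_scheduler.CosineAnnealingLR`, although the excerpt does not confirm which scheduler the authors used.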