GALAXY: Graph-based Active Learning at the Extreme

Authors: Jifan Zhang, Julian Katz-Samuels, Robert Nowak

ICML 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, we demonstrate GALAXY's superiority over existing state-of-the-art deep active learning algorithms in unbalanced vision classification settings generated from popular datasets. We conduct experiments under 8 different class imbalance settings. |
| Researcher Affiliation | Academia | University of Wisconsin, Madison, USA. Correspondence to: Jifan Zhang <jifan@cs.wisc.edu>. |
| Pseudocode | Yes | Algorithm 1: S2 (Shortest Shortest Path); Algorithm 2: Build Graph; Algorithm 3: Connect (build higher-order edges); Algorithm 4: GALAXY. A minimal sketch of the S2 query rule appears after the table. |
| Open Source Code | Yes | Code can be found at https://github.com/jifanz/GALAXY. |
| Open Datasets | Yes | We generate the extremely unbalanced settings for both binary and multi-class classification from popular vision datasets: CIFAR-10 (Krizhevsky et al., 2009), CIFAR-100 (Krizhevsky et al., 2009), PathMNIST (Yang et al., 2021) and SVHN (Netzer et al., 2011). A hedged pool-construction sketch follows the table. |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits. It mentions using "the pool" and evaluating "over the pool", but gives no split percentages or counts for training, validation, or test sets. |
| Hardware Specification | No | The paper does not provide hardware details such as GPU models, CPU types, or memory specifications used for the experiments; it only mentions "training deep learning systems". |
| Software Dependencies | No | The paper mentions the "ResNet-18 model in PyTorch" and the "Adam optimization algorithm" but does not specify version numbers for PyTorch or any other software dependencies, making it difficult to reproduce the software environment precisely. |
| Experiment Setup | Yes | We set B = 100 and T = 50. We use the ResNet-18 model in PyTorch, pretrained on ImageNet, for initialization and cold-start the training for every labeled set L. We use the Adam optimization algorithm with a learning rate of 10^-2 and a fixed 500 epochs for each L. We use a cross-entropy loss weighted by 1/N_k(L) for each class k. A training-loop sketch follows the table. |
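
The paper's Algorithm 1 is the S2 "shortest shortest path" rule: repeatedly find the shortest path connecting any pair of oppositely labeled vertices and query its midpoint. The sketch below is a minimal illustration of that rule, not the authors' implementation; the function name `s2_query`, the use of networkx, and the label encoding are all assumptions.

```python
# Minimal sketch of the S2 "shortest shortest path" query rule (hypothetical
# helper, not from the authors' repository). Assumes G is an unweighted
# networkx graph from which edges between known oppositely labeled vertices
# (cut edges) have already been removed, as S2 prescribes, and that `labels`
# maps already-queried nodes to {0, 1}.
import networkx as nx

def s2_query(G, labels):
    """Return the next node to label: the midpoint of the shortest path
    connecting any pair of oppositely labeled vertices, if one exists."""
    pos = [v for v, y in labels.items() if y == 1]
    neg = [v for v, y in labels.items() if y == 0]
    best_path = None
    for u in pos:
        # All shortest paths (by hop count) from this positive node.
        paths = nx.single_source_shortest_path(G, u)
        for v in neg:
            if v in paths and (best_path is None or len(paths[v]) < len(best_path)):
                best_path = paths[v]
    if best_path is None:
        return None  # no opposite-label pair is connected; fall back to random sampling
    return best_path[len(best_path) // 2]  # bisect the path: query its midpoint
```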
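The paper states that extremely unbalanced pools are generated from CIFAR-10, CIFAR-100, PathMNIST, and SVHN, but the report notes no split code is given. The sketch below shows one plausible way to build such a binary pool by keeping one class as rare positives; the function name, the imbalance count `n_rare`, and the CIFAR-10 choice are illustrative assumptions, not the paper's 8 settings.

```python
# Hedged sketch: construct an extremely unbalanced binary pool from CIFAR-10
# by subsampling one "rare" class and relabeling everything else as negative.
# All parameter values here are illustrative.
import numpy as np
from torchvision import datasets

def make_unbalanced_pool(rare_class=0, n_rare=250, seed=0):
    rng = np.random.default_rng(seed)
    ds = datasets.CIFAR10(root="./data", train=True, download=True)
    targets = np.array(ds.targets)
    rare_idx = np.where(targets == rare_class)[0]
    common_idx = np.where(targets != rare_class)[0]
    # Keep only a small random subset of the rare class.
    keep_rare = rng.choice(rare_idx, size=n_rare, replace=False)
    pool_idx = np.concatenate([keep_rare, common_idx])
    # Binary relabeling: 1 for the rare class, 0 for everything else.
    binary_labels = (targets[pool_idx] == rare_class).astype(int)
    return pool_idx, binary_labels
```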
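The Experiment Setup row specifies the training recipe: ImageNet-pretrained ResNet-18, cold-started for every labeled set L, Adam with learning rate 10^-2, 500 epochs, and cross-entropy weighted by 1/N_k(L) per class. The sketch below assembles those stated pieces into one loop; the data loader, batch size, transforms, and the function name are assumptions the paper does not pin down.

```python
# Sketch of the reported training configuration. Everything not stated in
# the paper (loader construction, device handling) is assumed.
import torch
import torch.nn as nn
from torchvision import models

def train_on_labeled_set(loader, class_counts, num_classes, device="cuda"):
    # Cold-start from ImageNet-pretrained weights for every new labeled set L.
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    model = model.to(device)

    # Per-class weights 1/N_k(L), as stated in the setup.
    weights = 1.0 / torch.tensor(class_counts, dtype=torch.float32, device=device)
    criterion = nn.CrossEntropyLoss(weight=weights)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

    model.train()
    for _ in range(500):  # fixed 500 epochs for each labeled set L
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
    return model
```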