reproducibilityindex.ai

GALAXY: Graph-based Active Learning at the Extreme

Authors: Jifan Zhang, Julian Katz-Samuels, Robert Nowak

ICML 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimentally, we demonstrate GALAXY's superiority over existing state-of-art deep active learning algorithms in unbalanced vision classification settings generated from popular datasets. We conduct experiments under 8 different class imbalance settings.
Researcher Affiliation	Academia	University of Wisconsin, Madison, USA. Correspondence to: Jifan Zhang <jifan@cs.wisc.edu>.
Pseudocode	Yes	Algorithm 1 S2: Shortest Shortest Path. Algorithm 2 Build Graph. Algorithm 3 Connect: build higher order edges. Algorithm 4 GALAXY.
Open Source Code	Yes	Code can be found in https://github.com/jifanz/GALAXY.
Open Datasets	Yes	We generate the extremely unbalanced settings for both binary and multi-class classification from popular vision datasets CIFAR-10 (Krizhevsky et al., 2009), CIFAR-100 (Krizhevsky et al., 2009), PathMNIST (Yang et al., 2021) and SVHN (Netzer et al., 2011).
Dataset Splits	No	The paper does not explicitly provide training/validation/test dataset splits. It mentions using 'the pool' and evaluating 'over the pool', but no specific split percentages or counts for training, validation, or test sets are detailed.
Hardware Specification	No	The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only mentions 'training deep learning systems'.
Software Dependencies	No	The paper mentions 'ResNet-18 model in PyTorch' and 'Adam optimization algorithm' but does not specify version numbers for PyTorch or any other software dependencies, making it difficult to precisely reproduce the software environment.
Experiment Setup	Yes	We set B = 100 and T = 50. We use the ResNet-18 model in PyTorch pretrained on ImageNet for initialization and cold-start the training for every labeled set L. We use the Adam optimization algorithm with learning rate of 10^-2 and a fixed 500 epochs for each L. We use a cross entropy loss weighted by 1/N_k(L) for each class k.
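
The class-weighted loss quoted in the Experiment Setup row can be illustrated with a minimal sketch. It assumes N_k(L) means the number of labeled examples of class k in the labeled set L, gives weight 0 to classes absent from L, and normalizes by the sum of the per-example weights; none of these details is stated in the quoted text, and the function names are hypothetical rather than taken from the GALAXY repository.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Return w_k = 1 / N_k(L) per class.

    Assumption: N_k(L) is the count of class k in the labeled set.
    Assumption: classes with no labeled examples get weight 0.0.
    """
    counts = Counter(labels)
    return [1.0 / counts[k] if counts[k] > 0 else 0.0 for k in range(num_classes)]


def weighted_cross_entropy(logits, labels, weights):
    """Weighted mean of -log softmax(logits_i)[y_i], weighted by w_{y_i}.

    Assumption: the sum is divided by the total weight of the examples.
    """
    total, norm = 0.0, 0.0
    for z, y in zip(logits, labels):
        m = max(z)  # subtract the max for a numerically stable log-sum-exp
        log_partition = m + math.log(sum(math.exp(v - m) for v in z))
        total += weights[y] * (log_partition - z[y])
        norm += weights[y]
    return total / norm


# Toy unbalanced labeled set: three examples of class 0, one of class 1.
toy_labels = [0, 0, 0, 1]
toy_weights = inverse_frequency_weights(toy_labels, num_classes=2)

# Uniform logits: every example has the same per-example loss.
toy_logits = [[0.0, 0.0] for _ in toy_labels]
toy_loss = weighted_cross_entropy(toy_logits, toy_labels, toy_weights)
```

With these weights each class contributes equally to the loss regardless of how many labeled examples it has. In PyTorch, a weight vector of this kind could be passed as the `weight` argument of `torch.nn.CrossEntropyLoss`.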