Low-Budget Active Learning via Wasserstein Distance: An Integer Programming Approach

Authors: Rafid Mahmood, Sanja Fidler, Marc T. Law

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical results on several data sets show that our optimization approach is competitive with baselines and particularly outperforms them in the low budget regime where less than one percent of the data set is labeled.
Researcher Affiliation | Collaboration | NVIDIA, University of Toronto, Vector Institute
Pseudocode | Yes | In Appendix D.2, we provide a full algorithm description and Python pseudo-code. Finally, note that the techniques proposed here are not exhaustive and more can be used to improve the algorithm.
Open Source Code | No | The paper uses and cites external open-source libraries and solvers (e.g., Python Optimal Transport, the COIN-OR CBC solver) and mentions using the available code of a baseline (WAAL), but it provides no statement or link releasing the authors' own implementation of the proposed method.
Open Datasets | Yes | We evaluate active learning in classification on three data sets: STL-10 (Coates et al., 2011), CIFAR10 (Krizhevsky, 2009), and SVHN (Netzer et al., 2011). Our domain adaptation experiments use the Office-31 data set (Saenko et al., 2010). (A loading sketch follows the table.)
Dataset Splits | No | For each experiment, we randomly partition the target data set into 70% for training and 30% for testing. (This applies only to the domain adaptation experiments; for image classification, the paper mentions "default test sets" and general training but gives no explicit training/validation split details in the main text or appendix. A partition sketch follows the table.)
Hardware Specification | Yes | All experiments are performed on computers with a single NVIDIA V100 GPU card and 50 GB CPU memory.
Software Dependencies | No | The paper mentions using the Python Optimal Transport library (Flamary et al., 2021), the COIN-OR CBC solver (Forrest & Lougee-Heimer, 2005), and Python-MIP, but it does not pin exact version numbers for these dependencies, which full reproducibility requires. (A version-recording sketch follows the table.)
Experiment Setup | Yes | We set the solver optimality gap tolerance to 10% and a time limit of 180 seconds to solve W-RMP(Λ) in each iteration. We also warm-start our algorithm with the solution to the k-centers core set... We run Algorithm 1 with ε = 10^-3 and T = 3 hours for STL-10, CIFAR-10, and Office-31, and T = 6 hours for SVHN... for Wass. + EOC + P, we set (β+, β−) = (0.6, 0.99). (A solver-configuration sketch follows the table.)
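
The three classification data sets quoted in the Open Datasets row are all distributed with torchvision; Office-31 is not and must be obtained separately. A minimal loading sketch, assuming torchvision's bundled dataset classes and a local "data" directory (both assumptions, not details from the paper):

# Minimal sketch; the root directory and transform are placeholders.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

stl10 = datasets.STL10("data", split="train", download=True, transform=to_tensor)
cifar10 = datasets.CIFAR10("data", train=True, download=True, transform=to_tensor)
svhn = datasets.SVHN("data", split="train", download=True, transform=to_tensor)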
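
The Dataset Splits row reports only a 70%/30% random partition of the target data set. A seeded sketch of such a partition follows; the seed, the index-based mechanics, and the sample count are assumptions, since the paper does not state them:

import numpy as np

def split_70_30(n_samples, seed=0):
    """Return (train_idx, test_idx) for a random 70%/30% partition."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    n_train = int(0.7 * n_samples)
    return perm[:n_train], perm[n_train:]

# Placeholder sample count; substitute the actual target data set size.
train_idx, test_idx = split_70_30(1000)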
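
The Software Dependencies row flags the missing version numbers. One way a reader could record the versions actually installed, assuming the standard PyPI distribution names POT (Python Optimal Transport) and mip (Python-MIP, which bundles the COIN-OR CBC solver):

from importlib.metadata import version

# Log installed versions for a reproducibility record.
print("POT:", version("POT"))
print("Python-MIP:", version("mip"))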
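
The Experiment Setup row fixes a 10% optimality gap, a 180-second per-iteration limit on W-RMP(Λ), a k-centers warm start, ε = 10^-3, and an overall budget T. The sketch below maps those settings onto the Python-MIP/CBC interface; build_rmp and update_rmp are hypothetical callbacks, and treating ε as an objective-improvement stopping tolerance is an assumption (the paper's Algorithm 1 may differ):

import time
import mip

def run_outer_loop(build_rmp, update_rmp, eps=1e-3, T=3 * 3600):
    start = time.time()
    model = build_rmp()                   # a mip.Model on the CBC backend
    model.max_mip_gap = 0.10              # 10% relative optimality gap
    # A k-centers warm start could be supplied as model.start = [(var, val), ...].
    prev_obj = float("inf")
    while time.time() - start < T:        # overall time budget (3 h or 6 h)
        model.optimize(max_seconds=180)   # per-iteration solver time limit
        if model.objective_value is None: # no feasible solution found in time
            break
        if prev_obj - model.objective_value < eps:
            break                         # improvement below tolerance
        prev_obj = model.objective_value
        update_rmp(model)                 # grow the restricted master problem
    return model

The 180-second cap and 10% gap keep each W-RMP solve cheap: CBC returns its incumbent solution when either limit is reached, so each iteration yields a usable (if suboptimal) selection.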