Multi-Objective Population Based Training
Authors: Arkadiy Dushatskiy, Alexander Chebykin, Tanja Alderliesten, Peter Bosman
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on diverse multi-objective hyperparameter optimization problems (Precision/Recall, Accuracy/Fairness, Accuracy/Adversarial Robustness) show that MO-PBT outperforms random search, single-objective PBT, and the state-of-the-art multi-objective hyperparameter optimization algorithm MO-ASHA. |
| Researcher Affiliation | Academia | 1 Centrum Wiskunde & Informatica, Amsterdam, the Netherlands; 2 Leiden University Medical Center, Leiden, the Netherlands; 3 Delft University of Technology, Delft, the Netherlands. |
| Pseudocode | Yes | Algorithm 1: Procedure to sort solutions in MO-PBT (sortPopulation); Algorithm 2: Exploit in MO-PBT (exploit); Algorithm 3: Explore in MO-PBT (explore). (An illustrative exploit-and-explore sketch based on non-dominated sorting is given below the table.) |
| Open Source Code | Yes | Further experimental setup details are provided in Appendix B. The code is available at https://github.com/ArkadiyD/MO-PBT. |
| Open Datasets | Yes | Adult (Dua & Graff, 2017), Higgs (Baldi et al., 2014), and Click prediction (Vanschoren et al., 2013); CelebA dataset (Liu et al., 2015); CIFAR-10/100 datasets |
| Dataset Splits | Yes | Datasets are split into train/validation/test subsets before experiments. In our main results, we report the above-described hypervolume metric on the validation subset to evaluate the search performance of the algorithms. (A minimal hypervolume computation sketch is given below the table.) |
| Hardware Specification | Yes | We used machines with 3 Nvidia A5000 GPUs and trained 4 networks on each GPU simultaneously, i.e., 12 networks could be trained in parallel. |
| Software Dependencies | No | We implemented all algorithms using the Ray Tune library (Liaw et al., 2018). Network training was performed using PyTorch (Paszke et al., 2019). (Specific versions of PyTorch and Ray Tune are not provided, only citations.) |
| Experiment Setup | Yes | We use a population of size 32 in our main experiments, with the exploit-and-explore procedure applied every 2 epochs of training. Batch size is set to 512 and training is performed for 100 epochs. On the image datasets, we use the standard WideResNet cosine learning rate schedule (used, for instance, in (Cubuk et al., 2020)) with an initial learning rate of 0.1, SGD with a momentum value of 0.9, and batch size 128; training is performed for 100 epochs. For all described optimization tasks, the hyperparameter search spaces are specified in Appendix H. (A minimal sketch of the quoted learning-rate schedule is given below the table.) |
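
To make the pseudocode row concrete, here is a minimal, illustrative Python sketch of a multi-objective exploit-and-explore step driven by non-dominated sorting. This is not the authors' implementation from the linked repository: the population representation, the truncation fraction, the perturbation factors, and the omission of any tie-breaking within fronts are all assumptions made for illustration.

```python
import copy
import random
from typing import Dict, List, Tuple

def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if objective vector `a` Pareto-dominates `b` (maximization)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def non_dominated_ranks(scores: List[Tuple[float, ...]]) -> List[int]:
    """Assign each solution the index of its non-dominated front (0 = best)."""
    remaining = set(range(len(scores)))
    ranks = [0] * len(scores)
    front = 0
    while remaining:
        current = {i for i in remaining
                   if not any(dominates(scores[j], scores[i])
                              for j in remaining if j != i)}
        for i in current:
            ranks[i] = front
        remaining -= current
        front += 1
    return ranks

def exploit_and_explore(population: List[Dict[str, float]],
                        scores: List[Tuple[float, ...]],
                        truncation: float = 0.25,
                        perturb: Tuple[float, float] = (0.8, 1.2)) -> List[Dict[str, float]]:
    """Workers in the worst fronts clone a configuration from the best fronts
    (exploit) and then perturb it multiplicatively (explore)."""
    ranks = non_dominated_ranks(scores)
    order = sorted(range(len(population)), key=lambda i: ranks[i])  # best first
    n_trunc = max(1, int(truncation * len(population)))
    top, bottom = order[:n_trunc], order[-n_trunc:]
    new_population = copy.deepcopy(population)
    for loser in bottom:
        winner = random.choice(top)
        cloned = copy.deepcopy(population[winner])   # exploit: copy the winner
        for key in cloned:                           # explore: perturb each value
            cloned[key] *= random.uniform(*perturb)
        new_population[loser] = cloned
    return new_population
```
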
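The hypervolume metric mentioned in the Dataset Splits row can be illustrated with a small 2D computation. The sketch below assumes both objectives are maximized and takes a caller-supplied reference point; the paper's exact reference points and any normalization are not reproduced here.

```python
from typing import List, Tuple

def hypervolume_2d(points: List[Tuple[float, float]],
                   reference: Tuple[float, float]) -> float:
    """Area dominated by `points` and bounded by `reference` (both objectives maximized)."""
    # Keep only points that strictly improve on the reference point in both objectives.
    pts = [p for p in points if p[0] > reference[0] and p[1] > reference[1]]
    # Sweep from the largest first objective and accumulate the new slab in the second.
    pts.sort(key=lambda p: p[0], reverse=True)
    area, best_f2 = 0.0, reference[1]
    for f1, f2 in pts:
        if f2 > best_f2:
            area += (f1 - reference[0]) * (f2 - best_f2)
            best_f2 = f2
    return area

# Example: a hypothetical precision/recall front against reference point (0, 0).
front = [(0.9, 0.4), (0.8, 0.6), (0.6, 0.8)]
print(hypervolume_2d(front, reference=(0.0, 0.0)))  # 0.64
```
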
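For the image-dataset training recipe quoted in the Experiment Setup row (SGD with momentum 0.9, initial learning rate 0.1, cosine learning-rate schedule, batch size 128, 100 epochs), a minimal PyTorch sketch is given below. The model is a placeholder rather than the WideResNet used in the paper, and details such as weight decay or data augmentation are omitted because they are not part of the quote.

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

EPOCHS = 100      # from the quoted setup
BATCH_SIZE = 128  # image datasets, per the quoted setup

model = torch.nn.Linear(10, 2)  # placeholder; the paper trains a WideResNet

# SGD with an initial learning rate of 0.1 and momentum 0.9, as quoted above.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Cosine annealing of the learning rate over the full training run.
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS)

for epoch in range(EPOCHS):
    # ... one epoch of training with batches of size BATCH_SIZE would go here ...
    optimizer.step()  # placeholder update so the sketch runs end to end
    scheduler.step()  # advance the cosine schedule once per epoch
```
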