Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
HyperJump: Accelerating HyperBand via Risk Modelling
Authors: Pedro Mendes, Maria Casimiro, Paolo Romano, David Garlan
AAAI 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate HyperJump on a suite of hyper-parameter optimization problems and show that it provides over one-order of magnitude speed-ups, both in sequential and parallel deployments, on a variety of deep-learning, kernel-based learning and neural architectural search problems when compared to HyperBand and to several state-of-the-art optimizers. |
| Researcher Affiliation | Academia | Pedro Mendes¹˒², Maria Casimiro¹˒², Paolo Romano¹, David Garlan². ¹ INESC-ID and Instituto Superior Técnico, Universidade de Lisboa. ² Software and Societal Systems Department, Carnegie Mellon University. EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Pseudo-code for a HJ bracket consisting of S stages, with budget b for the initial stage. [...] Algorithm 2: Pseudo-code of the logic used to determine the sets of configurations to consider when jumping from stage s to stage s + 1 (function GET_CANDIDATES_FOR_S()) [...] Algorithm 3: Pseudo-code for the EVALUATE_JUMP_RISK function. |
| Open Source Code | Yes | We have made available the implementation of HJ and the benchmarks used. (Footnote 2: https://github.com/pedrogbmendes/HyperJump) |
| Open Datasets | Yes | Our first benchmark, NATS-Bench (Dong et al. 2021)... We also use LIBSVM (Chang and Lin 2011) on the Covertype data set (Dua and Graff 2017)... |
| Dataset Splits | No | The paper mentions using NATS-Bench and LIBSVM datasets but does not provide specific details on the training/validation/test splits (e.g., percentages, sample counts, or methodology for splitting). |
| Hardware Specification | No | The paper discusses parallel deployments using '32 workers' but does not specify the type of hardware (e.g., GPU/CPU models, memory) used for the experiments. |
| Software Dependencies | No | The paper mentions software frameworks like Ray Tune and implementations of BOHB and HB, but it does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | We use the default parameters of BOHB and Fabolas. Similarly to HB, we set the parameter η to 3, and for fairness, when comparing HJ, HB, BOHB, and ASHA, we configure them to use the same η value. We use the default value of 10% for the threshold λ for HJ and include in the supplemental material a study on the sensitivity to the tuning of λ. |
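
To make the role of η in the Experiment Setup row concrete, the following is a minimal sketch of the bracket schedule of standard HyperBand (as defined by Li et al.), not of HJ's risk-based jumping logic. The function name `hyperband_schedule` and the maximum budget of 81 are illustrative assumptions; neither comes from the paper or its repository.

```python
def hyperband_schedule(max_budget: int, eta: int = 3):
    """Return one list of (num_configs, budget_per_config) stages per bracket.

    Sketch of the standard HyperBand schedule; max_budget is an assumed
    value chosen by the user, not a setting reported in the paper.
    """
    # s_max = floor(log_eta(max_budget)), computed with integers
    # to avoid floating-point rounding in the logarithm.
    s_max = 0
    while eta ** (s_max + 1) <= max_budget:
        s_max += 1

    brackets = []
    for s in range(s_max, -1, -1):
        # Initial number of configurations: ceil((s_max + 1) * eta^s / (s + 1)).
        n = -(-(s_max + 1) * eta ** s // (s + 1))
        stages = []
        for i in range(s + 1):
            n_i = n // eta ** i                 # configurations kept at stage i
            r_i = max_budget / eta ** (s - i)   # budget given to each of them
            stages.append((n_i, r_i))
        brackets.append(stages)
    return brackets


if __name__ == "__main__":
    for stages in hyperband_schedule(max_budget=81, eta=3):
        print(stages)
```

With η = 3, each stage keeps roughly one third of the configurations and triples the budget per configuration. HJ, as summarized in the table, aims to skip some of these stages when the estimated risk of doing so is below the threshold λ.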