Monte Carlo Tree Descent for Black-Box Optimization

Authors: Yaoguang Zhai, Sicun Gao

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show empirically that the proposed algorithms can outperform state-of-the-art methods on many challenging benchmark problems.
Researcher Affiliation | Academia | Yaoguang Zhai (UCSD); Sicun Gao (UCSD)
Pseudocode | Yes | Algorithm 1: Monte Carlo Tree Descent (MCTD)
Open Source Code | No | The paper states: "For La-MCTS [19], we use different settings and include them in the supplementary material, as well as our MCTD approach." This statement is ambiguous and neither provides a link nor states that the code for their MCTD approach is open source or publicly available.
Open Datasets | Yes | We use several standard benchmark sets for testing BBO algorithms, from three categories: synthetic functions for nonlinear optimization, reinforcement learning problems in MuJoCo locomotion environments, and optimization problems in Neural Architecture Search (NAS).
Dataset Splits | No | The paper mentions evaluating on benchmarks and recording "accuracy of training and evaluation" for NAS-Bench-201, but it does not explicitly specify the training, validation, or test dataset splits (e.g., percentages, sample counts, or specific split methodology).
Hardware Specification | Yes | Benchmarks are run mainly on Google Colab with a Tesla P100 graphics card.
Software Dependencies | No | The paper mentions using "fmin2 from the CMA-ES package" and implementing "our own version of the Nelder-Mead algorithm as in [13]", but it does not provide version numbers for these software components or for any other libraries or solvers. A hedged sketch of how that CMA-ES entry point is typically invoked appears after the table.
Experiment Setup | Yes | We implement our own version of the Nelder-Mead algorithm as in [13], and set its expansion coefficient, contraction inside the simplex, contraction outside the simplex, and shrink coefficient as 2.0, 0.5, 0.5, and 0.5, respectively. TuRBO [9] is initialized with 20 random samples selected using Latin Hypercube sampling, and its Automatic Relevance Determination (ARD) is set to True. Across all experiments, we set the number of evaluation calls to 3000. We set the noise scale to zero in all MuJoCo environments to avoid randomness in rewards. These reported settings are collected in a configuration sketch immediately after the table.
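
The Experiment Setup row reports concrete hyperparameters for the Nelder-Mead baseline, TuRBO initialization, the evaluation budget, and the MuJoCo noise setting. The sketch below gathers them into a single configuration dictionary for reference; the key names are hypothetical, and only the numeric values and the ARD flag come from the paper's description.

```python
# Hedged sketch: the reported experimental settings collected in one place.
# Dictionary key names are hypothetical; only the values are from the paper.
EXPERIMENT_CONFIG = {
    "nelder_mead": {
        "expansion": 2.0,            # expansion coefficient
        "contraction_inside": 0.5,   # contraction inside the simplex
        "contraction_outside": 0.5,  # contraction outside the simplex
        "shrink": 0.5,               # shrink coefficient
    },
    "turbo": {
        "n_init": 20,                # initial Latin Hypercube samples
        "use_ard": True,             # Automatic Relevance Determination
    },
    "budget": {"max_evaluations": 3000},  # evaluation calls per experiment
    "mujoco": {"noise_scale": 0.0},       # zero noise for deterministic rewards
}
```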
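
The Software Dependencies row notes that the CMA-ES baseline uses "fmin2 from the CMA-ES package" without pinning a version. The snippet below is a minimal sketch of how that entry point is typically called, assuming the package is the pycma Python library; the sphere objective, starting point, and step size are illustrative placeholders, and only the 3000-evaluation budget comes from the paper.

```python
# Minimal sketch (assumption: the "CMA-ES package" is the pycma library).
# The objective, start point, and initial step size are illustrative;
# only the 3000-evaluation budget is taken from the paper.
import cma
import numpy as np

def sphere(x):
    """Toy synthetic objective standing in for a BBO benchmark function."""
    return float(np.sum(np.asarray(x) ** 2))

# fmin2 returns the best solution found and the CMAEvolutionStrategy object.
xbest, es = cma.fmin2(
    sphere,
    10 * [0.0],           # hypothetical 10-dimensional starting point
    0.5,                  # hypothetical initial step size (sigma0)
    {"maxfevals": 3000},  # evaluation budget matching the paper's setting
)
print(xbest, es.result.fbest)
```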