Monte Carlo Tree Descent for Black-Box Optimization

Authors: Yaoguang Zhai, Sicun Gao

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show empirically that the proposed algorithms can outperform state-of-the-art methods on many challenging benchmark problems.
Researcher Affiliation | Academia | Yaoguang Zhai (UCSD); Sicun Gao (UCSD)
Pseudocode | Yes | Algorithm 1: Monte Carlo Tree Descent (MCTD)
Open Source Code | No | The paper states: "For La-MCTS [19], we use different settings and include them in the supplementary material, as well as our MCTD approach." This statement is ambiguous and neither provides a link nor states that the code for their MCTD approach is open source or publicly available.
Open Datasets | Yes | We use several standard benchmark sets for testing BBO algorithms, from three categories: synthetic functions for nonlinear optimization, reinforcement learning problems in MuJoCo locomotion environments, and optimization problems in Neural Architecture Search (NAS).
Dataset Splits | No | The paper mentions evaluating on benchmarks and recording "accuracy of training and evaluation" for NAS-Bench-201, but it does not explicitly specify the training, validation, or test dataset splits (e.g., percentages, sample counts, or specific split methodology).
Hardware Specification | Yes | Benchmarks are run mainly on Google Colab with a Tesla P100 graphics card.
Software Dependencies | No | The paper mentions using "fmin2 from the CMA-ES package" and implementing "our own version of the Nelder-Mead algorithm as in [13]", but it does not provide version numbers for these software components or for any other libraries or solvers. A hedged sketch of how that CMA-ES entry point is typically invoked appears after the table.
Experiment Setup | Yes | We implement our own version of the Nelder-Mead algorithm as in [13], and set its expansion coefficient, contraction inside the simplex, contraction outside the simplex, and shrink coefficient as 2.0, 0.5, 0.5, and 0.5, respectively. TuRBO [9] is initialized with 20 random samples selected using Latin Hypercube sampling, and its Automatic Relevance Determination (ARD) is set to True. Across all experiments, we set the number of evaluation calls to 3000. We set the noise scale to zero in all MuJoCo environments to avoid randomness in rewards. These reported settings are collected in a configuration sketch immediately after the table.
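
The Experiment Setup row reports concrete hyperparameters for the Nelder-Mead baseline, TuRBO initialization, the evaluation budget, and the MuJoCo noise setting. The sketch below gathers them into a single configuration dictionary for reference; the key names are hypothetical, and only the numeric values and the ARD flag come from the paper's description.

```python
# Hedged sketch: the reported experimental settings collected in one place.
# Dictionary key names are hypothetical; only the values are from the paper.
EXPERIMENT_CONFIG = {
    "nelder_mead": {
        "expansion": 2.0,            # expansion coefficient
        "contraction_inside": 0.5,   # contraction inside the simplex
        "contraction_outside": 0.5,  # contraction outside the simplex
        "shrink": 0.5,               # shrink coefficient
    },
    "turbo": {
        "n_init": 20,                # initial Latin Hypercube samples
        "use_ard": True,             # Automatic Relevance Determination
    },
    "budget": {"max_evaluations": 3000},  # evaluation calls per experiment
    "mujoco": {"noise_scale": 0.0},       # zero noise for deterministic rewards
}
```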
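
The Software Dependencies row notes that the CMA-ES baseline uses "fmin2 from the CMA-ES package" without pinning a version. The snippet below is a minimal sketch of how that entry point is typically called, assuming the package is the pycma Python library; the sphere objective, starting point, and step size are illustrative placeholders, and only the 3000-evaluation budget comes from the paper.

```python
# Minimal sketch (assumption: the "CMA-ES package" is the pycma library).
# The objective, start point, and initial step size are illustrative;
# only the 3000-evaluation budget is taken from the paper.
import cma
import numpy as np

def sphere(x):
    """Toy synthetic objective standing in for a BBO benchmark function."""
    return float(np.sum(np.asarray(x) ** 2))

# fmin2 returns the best solution found and the CMAEvolutionStrategy object.
xbest, es = cma.fmin2(
    sphere,
    10 * [0.0],           # hypothetical 10-dimensional starting point
    0.5,                  # hypothetical initial step size (sigma0)
    {"maxfevals": 3000},  # evaluation budget matching the paper's setting
)
print(xbest, es.result.fbest)
```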