Monte Carlo Tree Descent for Black-Box Optimization
Authors: Yaoguang Zhai, Sicun Gao
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show empirically that the proposed algorithms can outperform state-of-the-art methods on many challenging benchmark problems. |
| Researcher Affiliation | Academia | Yaoguang Zhai UCSD Sicun Gao UCSD |
| Pseudocode | Yes | Algorithm 1 Monte Carlo Tree Descent (MCTD) |
| Open Source Code | No | The paper states: "For La-MCTS [19], we use different settings and include them in the supplementary material, as well as our MCTD approach." This statement is ambiguous and does not explicitly provide a link or state that the code for their MCTD approach is open-source or publicly available. |
| Open Datasets | Yes | We use several standard benchmark sets for testing BBO algorithms, from three categories: synthetic functions for nonlinear optimization, reinforcement learning problems in MuJoCo locomotion environments, and optimization problems in Neural Architecture Search (NAS). |
| Dataset Splits | No | The paper mentions evaluating on benchmarks and recording "accuracy of training and evaluation" for NAS-Bench-201, but it does not explicitly specify the training, validation, or test dataset splits (e.g., percentages, sample counts, or specific split methodology). |
| Hardware Specification | Yes | Benchmarks are run mainly on Google Colab with a Tesla P100 graphics card. |
| Software Dependencies | No | The paper mentions using "fmin2 from the CMA-ES package" and implementing "our own version of the Nelder-Mead algorithm as in [13]", but it does not provide specific version numbers for these software components or any other libraries/solvers. |
| Experiment Setup | Yes | We implement our own version of the Nelder-Mead algorithm as in [13], and set its expansion coefficient, contraction inside the simplex, contraction outside the simplex, and shrink coefficient as 2.0, 0.5, 0.5, and 0.5, respectively. TuRBO [9] is initialized with 20 random samples selected using Latin Hypercube sampling, and its Automatic Relevance Determination (ARD) is set to True. Across all experiments, we set the number of evaluation calls to 3000. We set the noise scale to zero in all MuJoCo environments to avoid randomness in rewards. (Illustrative sketches of this setup appear below the table.) |
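
As context for the coefficients reported in the experiment-setup row, here is a minimal Python sketch of one Nelder-Mead iteration using those values (expansion 2.0, contraction inside and outside the simplex 0.5, shrink 0.5). The reflection coefficient of 1.0 and the acceptance logic follow the textbook method; the paper implements its own variant per [13], so treat this as an illustration rather than the authors' code.

```python
import numpy as np

# Coefficients reported in the paper's setup: expansion 2.0, contraction
# inside/outside the simplex 0.5, shrink 0.5. The reflection coefficient
# (1.0) is the conventional default and is an assumption here.
GAMMA, BETA_IN, BETA_OUT, SIGMA, ALPHA = 2.0, 0.5, 0.5, 0.5, 1.0


def nelder_mead_step(simplex, fvals, f):
    """One iteration of Nelder-Mead on an (n+1, n) simplex.

    `simplex` holds n+1 points in R^n, `fvals` their objective values,
    and `f` is the black-box objective being minimized.
    """
    simplex = np.asarray(simplex, dtype=float)
    fvals = np.asarray(fvals, dtype=float)
    order = np.argsort(fvals)
    simplex, fvals = simplex[order], fvals[order]
    best, worst = simplex[0], simplex[-1]
    centroid = simplex[:-1].mean(axis=0)  # centroid of all but the worst

    def shrink():
        # Pull every non-best vertex halfway toward the best (sigma = 0.5).
        simplex[1:] = best + SIGMA * (simplex[1:] - best)
        fvals[1:] = [f(x) for x in simplex[1:]]

    x_r = centroid + ALPHA * (centroid - worst)  # reflection
    f_r = f(x_r)
    if f_r < fvals[0]:
        x_e = centroid + GAMMA * (x_r - centroid)  # expansion, gamma = 2.0
        f_e = f(x_e)
        simplex[-1], fvals[-1] = (x_e, f_e) if f_e < f_r else (x_r, f_r)
    elif f_r < fvals[-2]:
        simplex[-1], fvals[-1] = x_r, f_r  # accept the reflected point
    elif f_r < fvals[-1]:
        x_c = centroid + BETA_OUT * (x_r - centroid)  # outside contraction
        f_c = f(x_c)
        if f_c <= f_r:
            simplex[-1], fvals[-1] = x_c, f_c
        else:
            shrink()
    else:
        x_c = centroid - BETA_IN * (centroid - worst)  # inside contraction
        f_c = f(x_c)
        if f_c < fvals[-1]:
            simplex[-1], fvals[-1] = x_c, f_c
        else:
            shrink()
    return simplex, fvals
```

Repeatedly applying `nelder_mead_step` to an initial simplex of n+1 points drives the simplex toward a local minimum of `f`.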
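
Similarly, the 20-sample Latin Hypercube initialization reported for TuRBO can be sketched with SciPy's `scipy.stats.qmc` module. The dimension and bounds below are hypothetical placeholders, and the ARD flag belongs to TuRBO's Gaussian-process model rather than to the sampler.

```python
import numpy as np
from scipy.stats import qmc

# Hypothetical problem: 10 dimensions with box bounds [-5, 5]^10.
# The paper's setup only specifies 20 initial Latin Hypercube samples.
dim = 10
lower, upper = np.full(dim, -5.0), np.full(dim, 5.0)

sampler = qmc.LatinHypercube(d=dim)
X_init = qmc.scale(sampler.random(n=20), lower, upper)  # shape (20, dim)
```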