Show Me the Way! Bilevel Search for Synthesizing Programmatic Strategies
Authors: David S. Aleixo, Levi H.S. Lelis
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated our bilevel algorithm in MicroRTS, a real-time strategy game. Our results show that the bilevel search synthesizes stronger strategies than methods that search only in the program space. Also, the strategies our method synthesizes obtained the highest winning rate in a simulated tournament with several baseline agents, including the best agents from the two latest MicroRTS competitions. |
| Researcher Affiliation | Academia | (1) Departamento de Informática, Universidade Federal de Viçosa, Brazil; (2) Department of Computing Science, Alberta Machine Intelligence Institute (Amii), University of Alberta, Canada |
| Pseudocode | Yes | Algorithm 1 shows Bi-S's pseudocode. |
| Open Source Code | Yes | https://github.com/dsaleixo/BilevelSearchforSynthesizing |
| Open Datasets | Yes | We use MicroRTS, a real-time strategy game widely used to evaluate intelligent systems (Ontañón 2020). It is a two-player zero-sum game where each player controls a set of units in real time... We use the maps basesWorkers24x24A, basesWorkers32x32A, and (4)BloodBath.scmB, whose sizes are 24 × 24, 32 × 32, and 64 × 64, respectively, from MicroRTS's official code base. |
| Dataset Splits | No | The paper does not provide specific details on traditional training, validation, and testing dataset splits, as it operates in a game-playing synthesis context rather than a fixed dataset learning setup. |
| Hardware Specification | Yes | All experiments were run on a single 2.4 GHz CPU with 8 GB of RAM and a time limit of 2 days. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | We use α = 0.9, β = 200, T_1 = 100, ϵ = 1 with our SA implementation as these are the values Medeiros, Aleixo, and Lelis (2022) used in their experiments. We use the following exploration rates for the first phase of NS: ϵ^1_0 = 0.7, ϵ^1_l = 0.7, and ϵ^1_g = 0.4; and the following for the second phase: ϵ^1_0 = 0.1, ϵ^1_l = 0.3, and ϵ^1_g = 0.1. We use γ = 0.95. |
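
For anyone re-running the experiments, the hyperparameters quoted in the Experiment Setup row could be collected in one place. The following is a minimal sketch only: the class and field names (`SAConfig`, `ExperimentConfig`, `ns_phase1`, and so on) are hypothetical and do not come from the authors' repository, and the values are copied from the table as reported. The role of γ is not stated in the quoted text, so it is kept as a standalone field.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SAConfig:
    """Simulated annealing (SA) values quoted in the table (names are assumed)."""
    alpha: float = 0.9
    beta: float = 200.0
    t1: float = 100.0      # T_1 in the table
    epsilon: float = 1.0


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical container for the reported experiment setup."""
    sa: SAConfig = field(default_factory=SAConfig)
    # NS exploration rates, keyed by the index printed in the table (0, l, g).
    ns_phase1: dict = field(default_factory=lambda: {"0": 0.7, "l": 0.7, "g": 0.4})
    ns_phase2: dict = field(default_factory=lambda: {"0": 0.1, "l": 0.3, "g": 0.1})
    gamma: float = 0.95    # purpose not specified in the quoted text
    # Resource limits from the Hardware Specification row.
    time_limit_days: int = 2
    ram_gb: int = 8


cfg = ExperimentConfig()
print(cfg.sa, cfg.ns_phase1, cfg.ns_phase2, cfg.gamma)
```

Freezing the dataclasses is a design choice of this sketch; it prevents rebinding a field after a run has started, although the rate dictionaries themselves remain mutable.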