Can You Improve My Code? Optimizing Programs with Local Search
Authors: Fatemeh Abdollahi, Saqib Ameen, Matthew E. Taylor, Levi H. S. Lelis
IJCAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | POLIS was evaluated with a 27-person user study, where participants wrote programs attempting to maximize the score of two single-agent games: Lunar Lander and Highway. POLIS was able to substantially improve the participants' programs with respect to the game scores. |
| Researcher Affiliation | Academia | Fatemeh Abdollahi, Saqib Ameen, Matthew E. Taylor and Levi H. S. Lelis; Department of Computing Science, University of Alberta, Canada; Alberta Machine Intelligence Institute (Amii). {fabdolla, saqib.ameen, matthew.e.taylor, levi.lelis}@ualberta.ca |
| Pseudocode | Yes | The pseudocode in Algorithm 1 shows the local search algorithm POLIS employs. It receives an existing program p and two time limits, t and tl, for the overall running time of the search and for the running time allowed to optimize each line of code, respectively, and an evaluation function F. (An illustrative sketch of this loop follows the table.) |
| Open Source Code | Yes | Our POLIS implementation and the data collected in our user study are available at https://github.com/FatemehAB/POLIS. |
| Open Datasets | Yes | Our POLIS implementation and the data collected in our user study are available at https://github.com/FatemehAB/POLIS. ... For the task of writing programmatic policies for playing games, we use the approach introduced by Verma et al. [2018b] to define a set of input-output examples. That is, we train a neural policy that generates a set of input-output pairs... |
| Dataset Splits | No | The paper describes the generation of input-output examples for training neural policies and the evaluation of programmatic policies in game environments, but it does not explicitly specify traditional train/validation/test dataset splits or a dedicated validation set for hyperparameter tuning. |
| Hardware Specification | No | The paper mentions 'computational resources from Compute Canada' and the 'Intelligent Robot Learning (IRL) Lab at the University of Alberta' but does not provide specific hardware details like CPU or GPU models, memory, or other specifications. |
| Software Dependencies | No | The paper mentions software components like 'OpenAI Gym' and 'DQN' but does not specify exact version numbers for these or any other software libraries or programming languages used in the experiments. |
| Experiment Setup | Yes | We use DQN [Mnih et al., 2015] to train a neural policy π for 2000 episodes. We let the agent follow π in the environment for 2000 steps... We use k = 20 in our experiments. We performed 5 restarts for each run of the system; the result of a run is the best program encountered across the 5 restarts. The game score of both the participants' and POLIS's programs is an average of the score the program obtained in 100 episodes of Lunar Lander and 25 episodes of Highway. (A sketch of this evaluation protocol also follows the table.) |
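
The local-search loop quoted under *Pseudocode* can be illustrated with a minimal Python sketch. This is not the authors' implementation: the candidate generator `synthesize_candidates` and the greedy acceptance rule are assumptions made here for illustration; only the interface (an existing program, time limits `t` and `tl`, and an evaluation function `F`) comes from the paper's description of Algorithm 1.

```python
import time


def polis_style_local_search(program_lines, F, t, tl, synthesize_candidates):
    """Hedged sketch of a line-based local search in the spirit of Algorithm 1.

    program_lines: the existing program as a list of source lines.
    F: evaluation function mapping a candidate program to a score (higher is better).
    t: overall time budget in seconds; tl: per-line time budget in seconds.
    synthesize_candidates: hypothetical helper that proposes replacement lines
    for a given position (an assumption, not part of the paper).
    """
    best = list(program_lines)
    best_score = F(best)
    start = time.time()
    improved = True
    while improved and time.time() - start < t:
        improved = False
        for i in range(len(best)):
            line_start = time.time()
            # Try replacement lines for position i within the per-line budget tl.
            for candidate_line in synthesize_candidates(best, i):
                if time.time() - line_start > tl or time.time() - start > t:
                    break
                candidate = best[:i] + [candidate_line] + best[i + 1:]
                score = F(candidate)
                if score > best_score:
                    # Greedy acceptance: keep the first strictly better program found.
                    best, best_score = candidate, score
                    improved = True
    return best, best_score
```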
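Similarly, the evaluation protocol quoted under *Experiment Setup* (5 restarts per run, scores averaged over 100 Lunar Lander or 25 Highway episodes) could be mirrored by helpers like the following sketch; `run_episode` and `optimize` are hypothetical stand-ins for the actual game rollout and the POLIS search, not the authors' code.

```python
import statistics


def average_game_score(program, run_episode, n_episodes):
    """Average score over n_episodes, mirroring the paper's protocol
    (100 episodes for Lunar Lander, 25 for Highway). run_episode is a
    hypothetical helper that plays one episode and returns its score."""
    return statistics.mean(run_episode(program) for _ in range(n_episodes))


def best_of_restarts(initial_program, optimize, n_restarts=5):
    """Run the optimizer n_restarts times and keep the best program found,
    matching the paper's 5-restart setup. optimize returns (program, score)."""
    results = [optimize(initial_program) for _ in range(n_restarts)]
    return max(results, key=lambda r: r[1])
```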