Algorithms or Actions? A Study in Large-Scale Reinforcement Learning
Authors: Anderson Rocha Tavares, Sivasubramanian Anbalagan, Leandro Soriano Marcolino, Luiz Chaimowicz
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present synthetic experiments to further study such systems. Finally, we propose a function approximation approach, demonstrating the effectiveness of learning over algorithms in real-time strategy games. |
| Researcher Affiliation | Academia | Anderson Rocha Tavares¹, Sivasubramanian Anbalagan², Leandro Soriano Marcolino², Luiz Chaimowicz¹; ¹Computer Science Department, Universidade Federal de Minas Gerais; ²School of Computing and Communications, Lancaster University |
| Pseudocode | No | The paper describes algorithms textually and mathematically but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code of the synthetic and µRTS experiments is available at https://github.com/andertavares/syntheticmdps and https://github.com/SivaAnbalagan1/microrts-FA, respectively. |
| Open Datasets | No | In this paper we use µRTS, a simplified RTS game developed for AI research. We used the map basesWorkers24x24. No specific dataset is cited or linked for training; instead, the agent learns through interaction within the game environment. |
| Dataset Splits | No | No explicit mention of training/validation/test splits, percentages, or sample counts for reproduction. The training is described in terms of games played against opponents rather than specific dataset splits. |
| Hardware Specification | No | No specific hardware details (like CPU/GPU models, memory) are provided for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | We used the map basesWorkers24x24, and the best parametrization we found: α = 10⁻⁴, γ = 0.9, ϵ exponentially decaying from 0.2 against Puppet AB, Puppet MCTS and AHTN, and from 0.1 against Naive MCTS and StrategyTactics, decayed after every game (decay rate 0.9984). All games last at most 3000 cycles and are declared a draw on timeout. Rewards are -1, 0 and 1 for defeat, draw and victory, respectively. (A sketch of this schedule appears after the table.) |
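To make the reported setup concrete, here is a minimal Python sketch of the training schedule those hyperparameters describe: a per-opponent initial ϵ, exponential decay after every game, and terminal rewards of -1/0/1. Everything beyond the quoted numbers is an assumption: `run_episode` is a hypothetical stub standing in for one µRTS game, and the mapping of opponent names to initial ϵ is read directly from the quoted text, not from the authors' released code.

```python
import random

# Hyperparameters quoted in the Experiment Setup row.
ALPHA = 1e-4          # learning rate alpha = 10^-4 (not used by this stub)
GAMMA = 0.9           # discount factor (not used by this stub)
DECAY_RATE = 0.9984   # epsilon is multiplied by this after every game
MAX_CYCLES = 3000     # games time out and are declared a draw after 3000 cycles

# Initial exploration rate per opponent, as quoted from the paper.
INITIAL_EPSILON = {
    "Puppet AB": 0.2,
    "Puppet MCTS": 0.2,
    "AHTN": 0.2,
    "Naive MCTS": 0.1,
    "StrategyTactics": 0.1,
}

def reward(outcome: str) -> int:
    """Terminal reward: -1 for defeat, 0 for draw, 1 for victory."""
    return {"defeat": -1, "draw": 0, "victory": 1}[outcome]

def run_episode(opponent: str, epsilon: float, max_cycles: int) -> str:
    # Hypothetical placeholder: in the real experiments this plays one
    # full µRTS game against the named opponent.
    return random.choice(["defeat", "draw", "victory"])

def train(opponent: str, num_games: int) -> list[int]:
    """Play num_games against one opponent, decaying epsilon after each game."""
    epsilon = INITIAL_EPSILON[opponent]
    rewards = []
    for _ in range(num_games):
        outcome = run_episode(opponent, epsilon, MAX_CYCLES)
        rewards.append(reward(outcome))
        # Exponential decay applied once per game, per the paper.
        epsilon *= DECAY_RATE
    return rewards

if __name__ == "__main__":
    print(sum(train("Puppet AB", 100)))
```

Note the decay is per game, not per step: after 1000 games against Puppet AB, ϵ falls from 0.2 to roughly 0.2 × 0.9984¹⁰⁰⁰ ≈ 0.04.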