Newton Optimization on Helmholtz Decomposition for Continuous Games

Authors: Giorgia Ramponi, Marcello Restelli

AAAI 2021, pages 11325-11333

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we empirically compare NOHD's performance with state-of-the-art algorithms on some bimatrix games and in a continuous Gridworld environment. Finally, in Section 6, we analyze the empirical performance of NOHD when agents optimize a Boltzmann policy in three bimatrix games: Prisoner's Dilemma, Matching Pennies, and Rock-Paper-Scissors. In the last experiment, we study the learning performance of NOHD in two continuous gridworld environments. In all experiments, NOHD achieves great results confirming the quadratic nature of the update. (A Boltzmann-policy setup for these bimatrix games is sketched after the table.)
Researcher Affiliation | Academia | Giorgia Ramponi, Marcello Restelli, Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133 Milano, Italy; {giorgia.ramponi, marcello.restelli}@polimi.it
Pseudocode | Yes | Algorithm 1: NOHD (a hedged sketch of a Newton-style step on the Helmholtz decomposition appears after the table).
Open Source Code | No | The paper does not provide a specific link or explicit statement about the release of its source code for the described methodology.
Open Datasets | Yes | The first gridworld is the continuous version of the second gridworld proposed in (Hu and Wellman 2003): the two agents are initialized in the two opposite lower corners and have to reach the same goal; when one of the two agents reaches the goal, the game ends and this agent gets a positive reward. (A toy sketch of this environment appears after the table.)
Dataset Splits | No | The paper mentions experimental settings such as initializations, number of runs, and sampled trajectories, but does not provide specific details on dataset splits (e.g., percentages or counts) for training, validation, or test sets.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory, or cloud instances) used for running its experiments.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or solvers used in the experiments.
Experiment Setup | Yes | For each game, we perform experiments with learning rates 0.1, 0.5, 1.0. In Matching Pennies we initialize probabilities to [0.86, 0.14] for the first agent and to [0.14, 0.86] for the second agent; in Rock-Paper-Scissors, to [0.66, 0.24, 0.1]. We performed 20 runs for each setting. In each iteration, we sampled 300 trajectories of length 1. The agents' policies are Gaussian policies, linear in a set of respectively 72 and 68 radial basis functions, which generate the ν angle for the step's direction. (A minimal sketch of such an RBF Gaussian policy appears after the table.)
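
The bimatrix experiments quoted above have each agent optimize a Boltzmann (softmax) policy over its actions. Below is a minimal sketch of such a setup for Matching Pennies; the payoff matrices, the softmax parameterization, and the logits chosen to recover the reported [0.86, 0.14] / [0.14, 0.86] initial probabilities are standard illustrative choices, not code from the paper.

```python
import numpy as np

# Matching Pennies payoff matrices (A for the row player, B for the column player).
A = np.array([[ 1., -1.],
              [-1.,  1.]])
B = -A  # zero-sum game

def boltzmann(theta):
    """Softmax (Boltzmann) distribution over actions."""
    z = np.exp(theta - theta.max())
    return z / z.sum()

def expected_payoffs(theta1, theta2):
    """Expected payoff of each player under the two Boltzmann policies."""
    p, q = boltzmann(theta1), boltzmann(theta2)
    return p @ A @ q, p @ B @ q

# Logits whose softmax matches the reported initial probabilities.
theta1 = np.log(np.array([0.86, 0.14]))
theta2 = np.log(np.array([0.14, 0.86]))
print(expected_payoffs(theta1, theta2))
```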
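
Algorithm 1 (NOHD) is a Newton-style update built on the Helmholtz decomposition of the Jacobian of the game's simultaneous gradient, which in the differentiable-games literature is the split into a symmetric (potential) part and an antisymmetric (Hamiltonian) part. The sketch below shows that split plus a generic damped Newton step on a toy zero-sum quadratic game; the damping constant and the particular solve are illustrative assumptions, and the exact way NOHD combines the two components is given only in the paper's Algorithm 1.

```python
import numpy as np

def helmholtz_split(J):
    """Helmholtz decomposition of the Jacobian of the simultaneous
    gradient xi: J = S + A with S symmetric (potential part) and
    A antisymmetric (Hamiltonian part)."""
    return 0.5 * (J + J.T), 0.5 * (J - J.T)

def damped_newton_step(xi, J, lam=1e-2):
    """Generic damped Newton step towards a zero of xi, solving
    (J + lam * I) d = xi. NOHD's actual step is built from the S and A
    components (Algorithm 1); lam and this solve are illustrative."""
    return np.linalg.solve(J + lam * np.eye(len(xi)), xi)

# Tiny two-player example: f1(x, y) = x * y, f2(x, y) = -x * y,
# i.e. Matching-Pennies-like dynamics with a purely rotational field.
x, y = 0.5, -0.3
xi = np.array([y, -x])            # (d f1 / d x, d f2 / d y)
J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])       # Jacobian of xi w.r.t. (x, y)
S, A = helmholtz_split(J)          # here S == 0 and A == J
theta_new = np.array([x, y]) - damped_newton_step(xi, J)
print(S, A, theta_new)
```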
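
The Open Datasets evidence describes a two-agent continuous variant of the second grid game in (Hu and Wellman 2003): both agents start in opposite lower corners, the episode ends as soon as one agent reaches the shared goal, and that agent receives a positive reward. A toy environment along those lines is sketched below; the arena size, step length, goal position, goal radius, and reward value are assumptions for illustration only.

```python
import numpy as np

class ContinuousGridworld:
    """Toy two-agent continuous gridworld in the spirit of the
    description above; the numeric constants are not from the paper."""

    def __init__(self, size=1.0, step=0.1, goal_radius=0.05):
        self.size, self.step, self.goal_radius = size, step, goal_radius
        self.goal = np.array([size / 2.0, size])         # shared goal (assumed position)
        self.pos = [np.array([0.0, 0.0]),                # opposite lower corners
                    np.array([size, 0.0])]

    def step_env(self, angles):
        """Each agent moves a fixed-length step in direction nu; the
        episode ends as soon as one agent reaches the goal, and that
        agent receives a positive reward."""
        rewards, done = [0.0, 0.0], False
        for i, nu in enumerate(angles):
            move = self.step * np.array([np.cos(nu), np.sin(nu)])
            self.pos[i] = np.clip(self.pos[i] + move, 0.0, self.size)
            if np.linalg.norm(self.pos[i] - self.goal) < self.goal_radius:
                rewards[i], done = 1.0, True
        return self.pos, rewards, done

env = ContinuousGridworld()
print(env.step_env([np.pi / 2, np.pi / 2]))
```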
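
The gridworld agents use Gaussian policies that are linear in 72 and 68 radial basis features and output the angle ν of the step direction. The sketch below shows one plausible parameterization; the bandwidth, the center layout (a 9 x 8 grid giving 72 features), and the fixed standard deviation are illustrative assumptions, not details from the paper.

```python
import numpy as np

def rbf_features(state, centers, bandwidth=0.5):
    """Radial basis features of a 2-D state; `centers` has shape (k, 2)."""
    d2 = np.sum((centers - state) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def sample_angle(state, weights, centers, sigma=0.1):
    """Gaussian policy, linear in RBF features, sampling the angle nu
    that sets the direction of the agent's step."""
    mean = weights @ rbf_features(state, centers)
    return np.random.normal(mean, sigma)

# Hypothetical 9 x 8 grid of centers giving the 72 features reported
# for the first agent; the paper does not state how centers are laid out.
centers = np.array([(cx, cy) for cx in np.linspace(0.0, 1.0, 9)
                              for cy in np.linspace(0.0, 1.0, 8)])
weights = np.zeros(len(centers))
print(sample_angle(np.array([0.1, 0.1]), weights, centers))
```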