Evolutionary Dynamics of Q-Learning over the Sequence Form
Authors: Fabio Panozzo, Nicola Gatti, Marcello Restelli
AAAI 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Building on previous works on evolutionary game theory models for multi-agent learning, we produce an experimental evaluation to show the accuracy of the model. |
| Researcher Affiliation | Academia | Fabio Panozzo, Nicola Gatti, and Marcello Restelli; Department of Electronics, Information and Bioengineering, Politecnico di Milano; Piazza Leonardo da Vinci, 32, I-20133 Milan, Italy; {panozzo.fabio, nicola.gatti, marcello.restelli}@polimi.it |
| Pseudocode | Yes | Algorithm 2 (SF pure strategy(x_i)) and Algorithm 3 (SF Q-learning) are present in the paper. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | No | The paper discusses extensive form games and uses an example game (Figure 1), but does not provide concrete access information (link, DOI, repository, or formal citation) for a publicly available or open dataset used for training. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or citations to predefined splits) needed to reproduce the data partitioning for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | In our experimental setting τ is a linearly increasing function of time starting from 0.0001 and ending at 0.5, and α is exponentially decreasing, starting from 1 and ending at 0.2. The algorithm stops when the difference in expected utility between iterations n and n−1 of both agents is smaller than 0.001 for 1000 consecutive iterations. See the sketch after this table for one way to express these schedules and the stopping rule. |
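
The schedules and stopping rule reported in the Experiment Setup row lend themselves to a short sketch. The following is a minimal illustration, assuming a fixed interpolation horizon `horizon` and a geometric form for the exponential decay of α (the paper does not state either), and the function names `tau`, `alpha`, and `has_converged` are hypothetical, not taken from the paper.

```python
# Hedged sketch of the reported hyperparameter schedules and stopping rule.
# All names, the horizon, and the exact decay form are illustrative assumptions.

TAU_START, TAU_END = 1e-4, 0.5     # Boltzmann temperature: linear increase
ALPHA_START, ALPHA_END = 1.0, 0.2  # learning rate: exponential decrease
EPS, PATIENCE = 1e-3, 1000         # stopping threshold and window


def tau(t: int, horizon: int) -> float:
    """Linearly increasing temperature from TAU_START to TAU_END."""
    frac = min(t / horizon, 1.0)
    return TAU_START + frac * (TAU_END - TAU_START)


def alpha(t: int, horizon: int) -> float:
    """Exponentially decreasing learning rate from ALPHA_START to ALPHA_END.

    Geometric interpolation so alpha(0) = 1 and alpha(horizon) = 0.2;
    the paper does not specify the decay base, so this is an assumption.
    """
    frac = min(t / horizon, 1.0)
    return ALPHA_START * (ALPHA_END / ALPHA_START) ** frac


def has_converged(utility_history: list) -> bool:
    """True when both agents' expected utilities changed by less than EPS
    between consecutive iterations for PATIENCE consecutive iterations.

    `utility_history` is assumed to hold one (u1, u2) pair per iteration.
    """
    if len(utility_history) < PATIENCE + 1:
        return False
    recent = utility_history[-(PATIENCE + 1):]
    return all(
        abs(curr[a] - prev[a]) < EPS
        for prev, curr in zip(recent, recent[1:])
        for a in (0, 1)
    )
```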