Quantum Multi-Agent Meta Reinforcement Learning
Authors: Won Joon Yun, Jihong Park, Joongheon Kim
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical Experiments: "The numerical experiments are conducted to investigate the following four aspects." |
| Researcher Affiliation | Academia | (1) School of Electrical Engineering, Korea University, Seoul, Republic of Korea; (2) School of Information Technology, Deakin University, Geelong, VIC, Australia |
| Pseudocode | Yes | Algorithm 1: Training Procedure; Algorithm 2: Learning Procedure for Fast Remembering |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | The experiment is conducted with the two-step game environment (Son et al. 2019), which is composed of discrete state spaces and discrete action spaces. ... The two-step game scenario under two different environments, Env A and Env B, as shown in Fig. 3(c)/(d) ... a single-hop environment. |
| Dataset Splits | No | The paper mentions running simulations for a certain number of iterations and epochs (e.g., "3,000 and 20,000 iterations"), but it does not specify any training/validation/test dataset splits or their percentages/counts for reproducibility. |
| Hardware Specification | No | The paper describes the numerical experiments and simulations conducted but does not provide any specific details regarding the hardware used (e.g., CPU, GPU, memory, or specific computing infrastructure). |
| Software Dependencies | No | The paper does not specify any software dependencies or their version numbers that would be necessary to replicate the experiments (e.g., programming languages, libraries, or frameworks with versions). |
| Experiment Setup | Yes | The experiment is conducted with the two-step game environment... We conduct meta-QNN angle training and local-QNN pole training with 3,000 and 20,000 iterations for the simulation, respectively. The two agents' pole parameters (i.e., θ1 and θ2) are trained in local-QNN pole training. We test under the angle noise bound α ∈ {0°, 30°, 45°, 60°, 90°}. We set the criterion of numerical convergence as the point when the action-values given s1 and s3 stop increasing/decreasing. |
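Since no code is released, the setup above is the main handle for reproduction. The sketch below illustrates the two-step game environment and the quoted training configuration. It is a hedged, minimal reconstruction: the reward tables follow the commonly cited two-step game variant (Rashid et al. 2018; Son et al. 2019), and the `TwoStepGame` class, its state/action encoding, and the `CONFIG` dictionary keys are our own naming, not the authors' implementation. Env A / Env B in the paper may use different payoffs.

```python
# Minimal sketch of the two-step cooperative game and the quoted training
# configuration. Reward values are assumed from the standard two-step game;
# the paper's Env A / Env B tables may differ.
import numpy as np


class TwoStepGame:
    """Two cooperative agents, discrete states s1..s3, binary actions per agent."""

    # Hypothetical reward tables: the s2 branch pays a flat reward,
    # the s3 branch pays a coordination-dependent reward.
    REWARD_S2 = np.full((2, 2), 7.0)
    REWARD_S3 = np.array([[0.0, 1.0],
                          [1.0, 8.0]])

    def reset(self):
        self.state = 1  # start in s1
        return self.state

    def step(self, a1, a2):
        if self.state == 1:
            # Agent 1's first action selects which matrix game is played next.
            self.state = 2 if a1 == 0 else 3
            return self.state, 0.0, False
        table = self.REWARD_S2 if self.state == 2 else self.REWARD_S3
        return self.state, float(table[a1, a2]), True


# Training configuration as quoted in the Experiment Setup row
# (angle noise bounds in degrees); key names are illustrative.
CONFIG = {
    "meta_qnn_angle_training_iters": 3_000,
    "local_qnn_pole_training_iters": 20_000,
    "angle_noise_bounds_deg": [0, 30, 45, 60, 90],
}
```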