Quantum Multi-Agent Meta Reinforcement Learning

Authors: Won Joon Yun, Jihong Park, Joongheon Kim

Venue: AAAI 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Numerical Experiments: "The numerical experiments are conducted to investigate the following four aspects." |
| Researcher Affiliation | Academia | 1. School of Electrical Engineering, Korea University, Seoul, Republic of Korea; 2. School of Information Technology, Deakin University, Geelong, VIC, Australia |
| Pseudocode | Yes | Algorithm 1: Training Procedure; Algorithm 2: Learning Procedure for Fast Remembering |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | "The experiment is conducted with the two-step game environment (Son et al. 2019), which is composed of discrete state spaces and discrete action spaces. ... The two-step game scenario under two different environments, Env A and Env B, as shown in Fig. 3(c)/(d) ... a single-hop environment." A minimal environment sketch is given after the table. |
| Dataset Splits | No | The paper mentions running simulations for a certain number of iterations and epochs (e.g., "3,000 and 20,000 iterations"), but it does not specify any training/validation/test dataset splits or their percentages/counts for reproducibility. |
| Hardware Specification | No | The paper describes the numerical experiments and simulations conducted but does not provide any specific details regarding the hardware used (e.g., CPU, GPU, memory, or specific computing infrastructure). |
| Software Dependencies | No | The paper does not specify any software dependencies or their version numbers (e.g., programming languages, libraries, or frameworks with versions) that would be necessary to replicate the experiments. |
| Experiment Setup | Yes | "The experiment is conducted with the two-step game environment ... We conduct meta-QNN angle training and local-QNN pole training with 3,000 and 20,000 iterations for the simulation, respectively. The two agents' pole parameters (i.e., θ1 and θ2) are trained in local-QNN pole training. We test under the angle noise bound α ∈ {0°, 30°, 45°, 60°, 90°}. We set the criterion of numeric convergence as the point at which the action-values given s1 and s3 stop increasing/decreasing." A sketch of this setup is given after the table. |
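To make the Open Datasets row concrete, here is a minimal Python sketch of a two-step game in the style of Son et al. (2019). The payoff matrices (a constant-reward game in s2 and a [[0, 1], [1, 8]] coordination game in s3) and the state labels are assumptions drawn from the commonly cited version of this benchmark, not from the paper; its Env A and Env B variants may assign different rewards.

```python
import numpy as np

class TwoStepGame:
    """Two-agent, two-step cooperative matrix game with discrete states
    and discrete actions. Agent 1's first action selects which matrix
    game (s2 or s3) both agents play at the second step."""

    PAYOFF_S2 = np.full((2, 2), 7.0)                # every joint action pays 7
    PAYOFF_S3 = np.array([[0.0, 1.0], [1.0, 8.0]])  # coordination pays 8

    def reset(self):
        self.state = "s1"
        return self.state

    def step(self, actions):
        a1, a2 = actions
        if self.state == "s1":
            # Agent 1 alone chooses the branch; the first step pays nothing.
            self.state = "s2" if a1 == 0 else "s3"
            return self.state, 0.0, False
        payoff = self.PAYOFF_S2 if self.state == "s2" else self.PAYOFF_S3
        self.state = "done"
        return self.state, float(payoff[a1, a2]), True

# One episode: agent 1 branches to s3, then both agents pick action 1
# (the optimal joint policy under these assumed payoffs, reward 8).
env = TwoStepGame()
env.reset()
env.step((1, 0))
_, reward, done = env.step((1, 1))
assert done and reward == 8.0
```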
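For the Experiment Setup row, the sketch below shows one way the described test protocol could look: perturb the trained meta-QNN angles with noise bounded by α, then check the numeric-convergence criterion on the action-values for s1 and s3. The uniform noise model, the windowed tolerance used to operationalize "stop increasing/decreasing", and every function name here are assumptions; the paper's Algorithms 1 and 2 define the actual procedures.

```python
import numpy as np

ANGLE_ITERS = 3_000                     # meta-QNN angle training iterations (paper)
POLE_ITERS = 20_000                     # local-QNN pole training iterations (paper)
NOISE_BOUNDS_DEG = [0, 30, 45, 60, 90]  # angle noise bounds alpha (paper)

def perturb_angles(angles, alpha_deg, rng):
    """Apply uniform angle noise bounded by +/- alpha (an assumed noise model)."""
    alpha = np.deg2rad(alpha_deg)
    return angles + rng.uniform(-alpha, alpha, size=angles.shape)

def has_converged(q_history, tol=1e-3, window=100):
    """One concrete reading of the paper's criterion: the action-values for
    s1 and s3 have stopped increasing/decreasing, i.e. their windowed means
    change by less than tol (tolerance and window are assumptions)."""
    if len(q_history) < 2 * window:
        return False
    recent = np.asarray(q_history[-window:])
    earlier = np.asarray(q_history[-2 * window:-window])
    return bool(np.all(np.abs(recent.mean(axis=0) - earlier.mean(axis=0)) < tol))

# Hypothetical test loop: evaluate the two agents' pole parameters
# (theta1, theta2) under each noise bound applied to the trained angles.
rng = np.random.default_rng(0)
trained_angles = np.zeros(8)            # placeholder for trained meta-QNN angles
for alpha in NOISE_BOUNDS_DEG:
    noisy_angles = perturb_angles(trained_angles, alpha, rng)
    # ... run local-QNN pole training/evaluation with noisy_angles here
```

A windowed-mean comparison is only one way to detect that the action-values have plateaued; the paper does not state a tolerance, so the numbers above are illustrative.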