MAPS: Multi-Agent Reinforcement Learning-based Portfolio Management System

Authors: Jinho Lee, Raehyun Kim, Seok-Won Yi, Jaewoo Kang

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiment results with 12 years of US market data show that MAPS outperforms most of the baselines in terms of Sharpe ratio." (A hedged Sharpe-ratio sketch follows this table.)
Researcher Affiliation | Academia | "Jinho Lee, Raehyun Kim, Seok-Won Yi and Jaewoo Kang, Department of Computer Science and Engineering, Korea University, {jinholee, raehyun, seanswyi, kangj}@korea.ac.kr"
Pseudocode | Yes | "Algorithm 1 Training algorithm"
Open Source Code | No | No explicit statement or link to open-source code for the methodology is provided in the paper.
Open Datasets | Yes | "We divided our dataset into training set, validation set, and test set. Detailed statistics of our dataset are summarized in Table 1. The validation set is used to optimize the hyperparameters." Table 1: Training, 2000-2004, N=1534, #Data=1,876,082; Validation, 2004-2006, N=1651, #Data=779,272; Test, 2006-2018, N=2061, #Data=6,019,248.
Dataset Splits | Yes | Same evidence as the Open Datasets row above: the paper splits the data chronologically per Table 1. (A hedged split sketch follows this table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments; it only mentions general setup such as "All asset values are set to 100 at the beginning of the test period."
Software Dependencies | No | The paper mentions using the "Adam optimizer [Kingma and Ba, 2014]" and "Batch normalization [Ioffe and Szegedy, 2015]" but does not provide specific version numbers for these or other software libraries/dependencies.
Experiment Setup | Yes | "The values of maxiter, β, and C are 400,000, 128, and 1000, respectively. Batch normalization [Ioffe and Szegedy, 2015] is used after every layer except the final layer, and the Adam optimizer [Kingma and Ba, 2014] was used with a learning rate of 0.00001 to train our models. The value of λ was empirically chosen as 0.8 based on the validation set." (A hedged configuration sketch follows this table.)
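
The headline evidence above is stated in terms of the Sharpe ratio. As a reference point only, here is a minimal sketch of the conventional annualized Sharpe ratio on daily returns; the 252-day annualization factor and zero risk-free rate are common conventions assumed here, not details taken from the paper.

```python
import numpy as np

def sharpe_ratio(daily_returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return over its volatility."""
    excess = np.asarray(daily_returns) - risk_free_rate / periods_per_year
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

# Example: simulated returns with 0.05% daily mean and 1% daily volatility.
rng = np.random.default_rng(0)
print(sharpe_ratio(rng.normal(0.0005, 0.01, size=3000)))
```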
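
The Dataset Splits row reports a chronological (date-based) split rather than a random shuffle. Below is a minimal pandas sketch of reproducing such a split; the file name, column names, and the exact boundary days (the paper's year ranges overlap at 2004 and 2006) are assumptions, while the date ranges themselves come from Table 1.

```python
import pandas as pd

# Hypothetical long-format file with one row per (stock, day) record;
# the file and column names are assumptions. Only the date ranges are
# taken from Table 1 of the paper.
df = pd.read_csv("us_stock_features.csv", parse_dates=["date"])

def split_by_date(df: pd.DataFrame):
    """Chronological split matching Table 1 (boundary days assumed)."""
    train = df[(df["date"] >= "2000-01-01") & (df["date"] < "2004-01-01")]
    valid = df[(df["date"] >= "2004-01-01") & (df["date"] < "2006-01-01")]
    test  = df[(df["date"] >= "2006-01-01") & (df["date"] < "2019-01-01")]
    return train, valid, test

train, valid, test = split_by_date(df)
# Sanity check against Table 1's counts (1,876,082 / 779,272 / 6,019,248).
print(len(train), len(valid), len(test))
```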
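
The Experiment Setup row fixes the optimizer, learning rate, iteration budget (maxiter = 400,000), what is most plausibly the batch size (β = 128), and what is most plausibly a target-network update period (C = 1000), but it does not spell out Algorithm 1. The skeleton below is a generic DQN-style loop wiring those numbers together in PyTorch; the network widths, replay sampling, discount factor, and loss are placeholders rather than the authors' architecture, and the paper's λ-weighted diversification term is noted but not implemented.

```python
import copy
import torch
import torch.nn as nn

MAX_ITER = 400_000      # maxiter (paper)
BATCH_SIZE = 128        # beta (paper)
TARGET_UPDATE = 1_000   # C, assumed to be a target-network sync period
LR = 1e-5               # Adam learning rate (paper)
GAMMA = 0.99            # discount factor: an assumption, not from the quote
# The paper also weights a diversification term by lambda = 0.8; that term
# belongs to MAPS's multi-agent loss and is not reproduced in this sketch.

class QNet(nn.Module):
    """Placeholder network: batch norm after every layer except the final
    one, as the paper states. Layer widths are assumptions."""
    def __init__(self, in_dim: int, n_actions: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(in_dim, 128), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Linear(128, 128), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Linear(128, n_actions),  # final layer: no batch norm
        )

    def forward(self, x):
        return self.body(x)

def sample_batch(batch: int, in_dim: int = 20, n_actions: int = 3):
    """Hypothetical replay-buffer sample; random tensors as stand-ins."""
    return (torch.randn(batch, in_dim),
            torch.randint(0, n_actions, (batch, 1)),
            torch.randn(batch, 1),
            torch.randn(batch, in_dim))

net = QNet(in_dim=20, n_actions=3)          # dimensions are illustrative
target = copy.deepcopy(net)
opt = torch.optim.Adam(net.parameters(), lr=LR)

for it in range(MAX_ITER):
    states, actions, rewards, next_states = sample_batch(BATCH_SIZE)
    q = net(states).gather(1, actions)      # Q-value of the taken action
    with torch.no_grad():
        best_next = target(next_states).max(dim=1, keepdim=True).values
        td_target = rewards + GAMMA * best_next
    loss = nn.functional.smooth_l1_loss(q, td_target)
    opt.zero_grad(); loss.backward(); opt.step()
    if it % TARGET_UPDATE == 0:             # periodic target-network sync
        target.load_state_dict(net.state_dict())
```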