Contrastive Learning and Reward Smoothing for Deep Portfolio Management
Authors: Yun-Hsuan Lien, Yuan-Kui Li, Yu-Shuen Wang
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We tested our method against various traditional financial techniques and other deep RL methods, and found it to be effective in both the U.S. stock market and the cryptocurrency market. |
| Researcher Affiliation | Academia | National Yang Ming Chiao Tung University, Taiwan |
| Pseudocode | No | The paper describes the policy gradient method and implementation details but does not include any pseudocode blocks or algorithms. |
| Open Source Code | Yes | Our source code is available at https://github.com/sophialien/FinTechDPM. |
| Open Datasets | No | The data was obtained from the Yahoo Finance Application Programming Interface (API) and Poloniex's official API, respectively. The paper describes the source of the data but does not provide a direct link, DOI, or formal citation for accessing a publicly available dataset. |
| Dataset Splits | No | The paper explicitly describes the test set split ('the last year's data were used for testing', 'the last two months of data were used for testing') but does not explicitly mention a separate validation dataset split. |
| Hardware Specification | No | The paper does not mention any specific hardware used for running the experiments (e.g., GPU models, CPU types, or cloud instance specifications). |
| Software Dependencies | No | The paper mentions the 'AdamW optimizer' but does not provide specific version numbers for it or for other software libraries/frameworks such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | The discount factor γ is set to 1, the length of reward smoothing F (Equation 12) is set to 5, and the temperature τ is set to 0.05 (Equation 6). The learning rate is set to 0.0001/0.00015 and the batch size is set to 300/500 for the U.S. stock/cryptocurrency market, respectively. The value of γ is set to 5e-5 for the cryptocurrency market and 5e-4 for the U.S. stock market. We set the NRI graph to contain 25 nodes, and divide the asset states into 20n groups. The learning rate is set to 0.00015 for the cryptocurrency market and 0.0001 for the U.S. stock market, and the temperature ν is set to 0.5. |
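
The hyperparameters in the Experiment Setup row can be gathered into one configuration object per market, which may help when attempting a reproduction. The following is a minimal sketch, not the authors' code: the class name `ExperimentConfig`, all field names, and the market keys are hypothetical. The excerpt reports two different values under the symbol γ (the discount factor, and a per-market 5e-5/5e-4 value whose role is not stated here), so the second one is stored under the placeholder name `second_gamma`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical container for the values quoted in the Experiment Setup row."""

    market: str
    learning_rate: float
    batch_size: int
    # Placeholder name: the excerpt calls this value "γ" as well, without
    # stating what it controls, so no role is assumed here.
    second_gamma: float
    # Values shared by both markets according to the excerpt.
    discount_factor: float = 1.0           # γ
    reward_smoothing_length: int = 5       # F (Equation 12)
    contrastive_temperature: float = 0.05  # τ (Equation 6)
    nu_temperature: float = 0.5            # ν
    nri_graph_nodes: int = 25


US_STOCK = ExperimentConfig(
    market="us_stock", learning_rate=1e-4, batch_size=300, second_gamma=5e-4
)
CRYPTO = ExperimentConfig(
    market="crypto", learning_rate=1.5e-4, batch_size=500, second_gamma=5e-5
)

# Look up a configuration by (hypothetical) market key.
CONFIGS = {cfg.market: cfg for cfg in (US_STOCK, CRYPTO)}
```

The "20n groups" setting is left out because the excerpt does not define n.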