Towards Generalizable Reinforcement Learning for Trade Execution

Authors: Chuheng Zhang, Yitong Duan, Xiaoyu Chen, Jianyu Chen, Jian Li, Li Zhao

IJCAI 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Our experiments on the high-fidelity simulator demonstrate that our algorithms can effectively alleviate overfitting and achieve better performance." |
| Researcher Affiliation | Collaboration | Microsoft Research; IIIS, Tsinghua University |
| Pseudocode | No | The paper includes architectural diagrams (e.g., Figure 2) but no structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | "We provide the source code in https://github.com/zhangchuheng123/RL4Execution." |
| Open Datasets | No | The simulator is based on LOB data for the 100 most liquid stocks in the China A-share market. "The data collected from April 2022 to June 2022 is used as the training set, and the data collected during July 2022 and August 2022 are used as the validation and testing set respectively." No link, DOI, or citation for public access to this dataset is provided, indicating it was collected by the authors. |
| Dataset Splits | Yes | "The data collected from April 2022 to June 2022 is used as the training set, and the data collected during July 2022 and August 2022 are used as the validation and testing set respectively." |
| Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., GPU models, CPU types, or memory). |
| Software Dependencies | No | The paper mentions DDPG, DQN, and PPO as base RL algorithms but provides no version numbers for these implementations or any other software dependencies. |
| Experiment Setup | Yes | In the simplified task, "we use DDPG [Silver et al., 2014] as the base RL algorithm". For the high-fidelity simulation, the "task is to sell 0.5% of the total trading volume... in a 30-minute period randomly selected from a trading day. The agent makes a decision... at the start of each minute." The paper also describes the loss functions for CASH and CATE: "The loss is a mean-squared error between the context representation and the generated statistics." For CATE, the loss function L(θ, ϑ, w) is given explicitly. |
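The CASH loss quoted above (a mean-squared error between the context representation and generated statistics) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the use of NumPy, and the assumption that both inputs are fixed-length vectors are all assumptions made here.

```python
import numpy as np

def cash_style_loss(context_repr, target_stats):
    """Illustrative MSE between a context representation vector and
    a vector of generated market statistics (CASH-style objective).

    Names and shapes are hypothetical; the paper's actual loss is
    computed inside its training pipeline.
    """
    context_repr = np.asarray(context_repr, dtype=float)
    target_stats = np.asarray(target_stats, dtype=float)
    # Element-wise squared error, averaged over all dimensions.
    return float(np.mean((context_repr - target_stats) ** 2))

# Example: a 4-dimensional context vector vs. 4 generated statistics.
loss = cash_style_loss([0.1, 0.2, 0.3, 0.4], [0.0, 0.2, 0.3, 0.8])
```

A perfect match yields a loss of zero, so minimizing this objective pushes the context representation toward the generated statistics.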