reproducibilityindex.ai

Towards Generalizable Reinforcement Learning for Trade Execution

Authors: Chuheng Zhang, Yitong Duan, Xiaoyu Chen, Jianyu Chen, Jian Li, Li Zhao

IJCAI 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments on the high-fidelity simulator demonstrate that our algorithms can effectively alleviate overfitting and achieve better performance.
Researcher Affiliation	Collaboration	(1) Microsoft Research; (2) IIIS, Tsinghua University
Pseudocode	No	The paper includes architectural diagrams (e.g., Figure 2) but no structured pseudocode or algorithm blocks.
Open Source Code	Yes	We provide the source code in https://github.com/zhangchuheng123/RL4Execution.
Open Datasets	No	'Our simulator is based on the LOB data of 100 most liquid stocks in China A-share market. The data collected from April 2022 to June 2022 is used as the training set, and the data collected during July 2022 and August 2022 are used as the validation and testing set respectively.' The paper provides no link, DOI, or citation for public access to this dataset, which suggests it was collected by the authors.
Dataset Splits	Yes	The data collected from April 2022 to June 2022 is used as the training set, and the data collected during July 2022 and August 2022 are used as the validation and testing set respectively.
Hardware Specification	No	The paper does not provide specific details about the hardware used for the experiments, such as GPU models, CPU types, or memory.
Software Dependencies	No	The paper names DDPG, DQN, and PPO as base RL algorithms, but it does not list the software libraries used to implement them or version numbers for any software dependency.
Experiment Setup	Yes	In the simplified task, 'we use DDPG [Silver et al., 2014] as the base RL algorithm'. For the high-fidelity simulation, the 'task is to sell 0.5% of the total trading volume... in a 30-minute period randomly selected from a trading day. The agent makes a decision... at the start of each minute.' The paper also describes the loss functions for CASH and CATE. For CASH, 'The loss is a mean-squared error between the context representation and the generated statistics.' For CATE, the paper gives the loss function L(θ, ϑ, w).
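
The Dataset Splits and Experiment Setup rows above can be read as a small configuration. The following is a minimal sketch of that reading, not the authors' code: the function names (split_for, decision_times, target_quantity) are hypothetical, the single 09:30 to 11:30 session is an assumption made only to keep the example short, and the reference volume behind the 0.5% target is left to the caller because the quoted text elides it.

```python
import random
from datetime import date, datetime, time, timedelta

# Date ranges quoted in the Dataset Splits row (both ends inclusive).
SPLITS = {
    "train": (date(2022, 4, 1), date(2022, 6, 30)),
    "validation": (date(2022, 7, 1), date(2022, 7, 31)),
    "test": (date(2022, 8, 1), date(2022, 8, 31)),
}

HORIZON_MINUTES = 30   # one decision at the start of each minute
TARGET_BPS = 50        # 0.5% expressed in basis points


def split_for(day):
    """Return the split a calendar day falls into, or None if outside all ranges."""
    for name, (start, end) in SPLITS.items():
        if start <= day <= end:
            return name
    return None


def decision_times(day, session_open=time(9, 30), session_close=time(11, 30),
                   rng=random):
    """Pick a random 30-minute window inside one session and list its decision times.

    The session bounds are an assumption of this sketch, not a detail from the paper.
    """
    open_dt = datetime.combine(day, session_open)
    close_dt = datetime.combine(day, session_close)
    session_minutes = int((close_dt - open_dt).total_seconds() // 60)
    # Latest start that still leaves room for a full 30-minute window.
    offset = rng.randint(0, session_minutes - HORIZON_MINUTES)
    start = open_dt + timedelta(minutes=offset)
    return [start + timedelta(minutes=i) for i in range(HORIZON_MINUTES)]


def target_quantity(reference_volume):
    """0.5% of a caller-supplied reference volume (which volume is not specified here)."""
    return reference_volume * TARGET_BPS / 10_000


if __name__ == "__main__":
    day = date(2022, 7, 1)
    steps = decision_times(day, rng=random.Random(0))
    print(split_for(day), steps[0].time(), steps[-1].time(), target_quantity(1_000_000))
```

Passing a seeded random.Random instance as rng makes the sampled window repeatable, which is convenient when comparing runs on the validation month.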