Generating Realistic Stock Market Order Streams
Authors: Junyi Li, Xintong Wang, Yaoyang Lin, Arunesh Sinha, Michael P. Wellman. Pages 727–734.
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We experiment with synthetic and real market data. The synthetic data is produced using a stock market simulator... The real market data was obtained from One Market Data... We compare against other baseline generative models such as recurrent conditional variational auto-encoder (VAE) and DCGAN instead of WGAN within Stock-GAN. We perform an ablation study showing the usefulness of our generator structure design as elaborated above. Overall, Stock-GAN is able to best generate realistic data compared to the alternatives." The paper also has a dedicated section, "4 Experimental Results", where it presents evaluations on datasets, compares against baselines, and performs ablation studies. |
| Researcher Affiliation | Academia | Junyi Li (University of Pittsburgh); Xintong Wang (University of Michigan); Yaoyang Lin (Harvard University); Arunesh Sinha (Singapore Management University); Michael P. Wellman (University of Michigan) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | An appendix in the full version provides all additional results and code for our work. |
| Open Datasets | No | The synthetic data is produced using a stock market simulator that has been used in several agent-based financial studies (Wellman and Wah 2017), but is far from real market data. The real market data was obtained from One Market Data, a financial data provider. The paper mentions the source of the data but does not provide concrete access information (link, DOI, repository) for the processed datasets used in the experiments. |
| Dataset Splits | No | The paper mentions training on real data and using mini-batches, but does not provide specific details on training, validation, or test dataset splits (e.g., percentages or exact counts) for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions using WGAN, LSTM, and convolutional layers, but does not specify any software dependencies with version numbers (e.g., Python, PyTorch/TensorFlow versions, CUDA versions). |
| Experiment Setup | Yes | "We choose k = 20. The history is condensed to one vector using a single LSTM layer. This vector and uniform noise of dimension 100 are fed to a fully connected layer followed by 4 convolution layers... The critic is trained 100 times in each iteration. The notable part in constructing the training data is that for each of 64 data points in a mini-batch..." |
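The Experiment Setup row quotes the paper's WGAN training schedule: a history length of k = 20, 100-dimensional uniform noise, mini-batches of 64, and 100 critic updates per generator iteration. A minimal sketch of that schedule is below, with NumPy placeholder functions standing in for the actual LSTM/convolutional networks; all function names and the 4-feature order representation here are illustrative assumptions, not the paper's code.

```python
import numpy as np

N_CRITIC = 100   # critic updates per generator iteration (per the paper)
BATCH = 64       # mini-batch size (per the paper)
K = 20           # history length k (per the paper)
NOISE_DIM = 100  # dimension of the uniform noise input (per the paper)
ORDER_DIM = 4    # assumed per-order feature count (illustrative)

rng = np.random.default_rng(0)

def sample_real_batch():
    # Placeholder: each data point is a length-k order history plus the
    # next order; here just random arrays of the right shape.
    return rng.standard_normal((BATCH, K + 1, ORDER_DIM))

def generator(history, noise):
    # Placeholder for: LSTM condenses history to one vector, which is
    # concatenated with noise and passed through an FC layer and
    # 4 convolution layers. Here it just returns fake "next orders".
    return rng.standard_normal((history.shape[0], ORDER_DIM))

def critic_update(real_next, fake_next):
    # Placeholder Wasserstein critic step; returns a scalar estimate.
    return float(np.mean(real_next) - np.mean(fake_next))

def generator_update(history):
    noise = rng.uniform(size=(history.shape[0], NOISE_DIM))
    return float(np.mean(generator(history, noise)))

def train_iteration():
    # WGAN schedule: many critic steps, then one generator step.
    for _ in range(N_CRITIC):
        batch = sample_real_batch()
        history, real_next = batch[:, :K, :], batch[:, K, :]
        noise = rng.uniform(size=(BATCH, NOISE_DIM))
        fake_next = generator(history, noise)
        critic_update(real_next, fake_next)
    return generator_update(history)

loss = train_iteration()
```

The point of the sketch is the control flow the paper describes: the inner loop performs the 100 critic updates on mini-batches of 64 (each pairing a length-20 history with the real next order), and only then does a single generator update occur.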