Generating Realistic Stock Market Order Streams

Authors: Junyi Li, Xintong Wang, Yaoyang Lin, Arunesh Sinha, Michael Wellman (pp. 727-734)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We experiment with synthetic and real market data. The synthetic data is produced using a stock market simulator... The real market data was obtained from One Market Data... We compare against other baseline generative models such as recurrent conditional variational auto-encoder (VAE) and DCGAN instead of WGAN within Stock-GAN. We perform an ablation study showing the usefulness of our generator structure design as elaborated above. Overall, Stock-GAN is able to best generate realistic data compared to the alternatives." The paper also has a dedicated section, "4 Experimental Results", where it presents evaluations on datasets, compares to baselines, and performs ablation studies.
Researcher Affiliation | Academia | Junyi Li (University of Pittsburgh), Xintong Wang (University of Michigan), Yaoyang Lin (Harvard University), Arunesh Sinha (Singapore Management University), Michael P. Wellman (University of Michigan)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "An appendix in the full version provides all additional results and code for our work."
Open Datasets | No | "The synthetic data is produced using a stock market simulator that has been used in several agent-based financial studies (Wellman and Wah 2017), but is far from real market data. The real market data was obtained from One Market Data, a financial data provider." The paper mentions the source of the data but does not provide concrete access information (link, DOI, repository) for the processed datasets used in the experiments.
Dataset Splits | No | The paper mentions training on real data and using mini-batches, but does not provide specific details on training, validation, or test dataset splits (e.g., percentages or exact counts) for reproducibility.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions using WGAN, LSTM, and convolutional layers, but does not specify any software dependencies with version numbers (e.g., Python, PyTorch/TensorFlow versions, CUDA versions).
Experiment Setup | Yes | "We choose k = 20. The history is condensed to one vector using a single LSTM layer. This vector and uniform noise of dimension 100 is fed to a fully connected layer followed by 4 convolution layers... The critic is trained 100 times in each iteration. The notable part in constructing the training data is that for each of 64 data points in a mini-batch..."
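The generator input described in the Experiment Setup row (an order-book history condensed by an LSTM into one vector, concatenated with uniform noise of dimension 100, processed in mini-batches of 64) can be sketched as below. This is a minimal stdlib-Python illustration, not the authors' implementation: the history-vector size `HIST_DIM` and the uniform noise range [-1, 1] are assumptions not stated in the quoted excerpt, and the LSTM output is stubbed with zeros.

```python
import random

BATCH = 64       # mini-batch size mentioned in the paper
NOISE_DIM = 100  # uniform noise dimension mentioned in the paper
HIST_DIM = 128   # hypothetical size of the LSTM-condensed history vector

def generator_input(history_vec):
    """Concatenate the condensed history vector with uniform noise,
    forming one input vector for the fully connected + conv generator."""
    noise = [random.uniform(-1.0, 1.0) for _ in range(NOISE_DIM)]
    return history_vec + noise

# Stand-in for one mini-batch of LSTM-condensed history vectors.
batch = [[0.0] * HIST_DIM for _ in range(BATCH)]
inputs = [generator_input(h) for h in batch]
print(len(inputs), len(inputs[0]))  # 64 228
```

The point of the sketch is only the data flow: conditioning information (history) and randomness (noise) are joined into a single vector per sample before entering the generator network.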