A Neural Stochastic Volatility Model

Authors: Rui Luo, Weinan Zhang, Xiaojun Xu, Jun Wang

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on real-world stock price datasets demonstrate that the proposed model generates better volatility estimation and prediction, outperforming mainstream methods on average negative log-likelihood, e.g., deterministic models such as GARCH and its variants, and stochastic models, namely the MCMC-based stochvol as well as the Gaussian-process-based model.
Researcher Affiliation | Academia | University College London; Shanghai Jiao Tong University. {r.luo,j.wang}@cs.ucl.ac.uk, {wnzhang,xuxj}@apex.sjtu.edu.cn
Pseudocode | Yes | Algorithm 1 (Recursive Forecasting). A hedged sketch of generic recursive forecasting is given after the table.
Open Source Code | No | The paper refers to author implementations and tools for the baselines (stochvol, GP-Vol, and other packages for the GARCH variants) but provides no link to, or explicit statement about, the open-source availability of the code for the proposed NSVM model.
Open Datasets | No | The raw dataset comprises 162 univariate time series of daily closing stock prices, chosen from China's A-shares and collected from 3 institutions. No specific link, DOI, repository name, or formal citation is provided for public access to this dataset.
Dataset Splits | Yes | We divide the whole dataset into two subsets for training and testing along the time axis: the first 2000 time steps of each series have been used as training samples whereas the remaining 570 steps of each series serve as the test samples. A minimal split sketch is given after the table.
Hardware Specification | Yes | We train the model on a single-GPU (Titan X Pascal) server for roughly two hours before it converges to a certain degree of accuracy on the training samples.
Software Dependencies | Yes | The authors state that they implement the models, such as GARCH, EGARCH, GJR-GARCH, etc., based on several widely-used packages for time series analysis (footnote 3: https://pypi.python.org/pypi/arch/4.0).
Experiment Setup | Yes | The NSVM implementation in our experiments is composed of two neural networks, namely the generative network (see Eq. (16)-(21)) and the inference network (see Eq. (23)-(27)). Each RNN module contains one hidden layer of size 10 with GRU cells; MLP modules are 2-layered fully-connected feedforward networks, where the hidden layer is also of size 10 whereas the output layer splits into two equal-sized sublayers with different activation functions: one applies the exponential function to ensure non-negativity of the variance while the other uses a linear function to calculate the mean estimates. Thus MLP_z^I's output layer is of size 4 + 4 for {μ_z, Σ_z} whereas MLP_x^G's output layer is of size 6 + 6 for {μ_x, Σ_x}. During the training phase, the inference network is connected with the conditional generative network (see Eq. (16)-(18)) to establish a bottleneck structure; the latent variable z_t, inferred by variational inference (Kingma and Welling 2013; Rezende, Mohamed, and Wierstra 2014), follows a Gaussian approximate posterior; the number of sample paths is set to S = 100. The parameters of both networks are jointly learned, including those for the prior. We introduce Dropout (Srivastava et al. 2014) into each RNN module and impose an L2-norm penalty on the weights of the MLP modules as regularisation to prevent overfitting; the Adam optimiser (Kingma and Ba 2014) is exploited for fast convergence; exponential learning rate decay is adopted to anneal the variations of convergence over time. A hedged architecture sketch is given after the table.
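
The Pseudocode row above refers to the paper's Algorithm 1 (Recursive Forecasting) without reproducing it. The snippet below is a minimal, generic sketch of recursive (rolling) one-step-ahead forecasting, not the authors' algorithm: the model object and its `sample_next` method are hypothetical stand-ins, and the real Algorithm 1 additionally propagates the latent volatility variable z_t through the generative network.

```python
import numpy as np

def recursive_forecast(model, history, horizon, n_paths=100):
    """Generic recursive forecasting: repeatedly sample a one-step-ahead
    prediction and feed it back as the next input. `model.sample_next`
    is a hypothetical one-step sampler, not the paper's API."""
    paths = np.repeat(np.asarray(history)[None, :], n_paths, axis=0)
    for _ in range(horizon):
        next_obs = np.array([model.sample_next(p) for p in paths])
        paths = np.concatenate([paths, next_obs[:, None]], axis=1)
    # Return only the forecast portion, one row per sampled path.
    return paths[:, len(history):]
```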
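
The Dataset Splits row describes a purely chronological split. A minimal sketch, assuming each of the 162 series is stored as a 1-D array of 2570 daily closes (the synthetic `prices` array below is only a placeholder):

```python
import numpy as np

TRAIN_STEPS = 2000  # per the paper: first 2000 steps train, remaining 570 steps test

def split_series(series: np.ndarray):
    """Chronological (unshuffled) train/test split of one price series."""
    return series[:TRAIN_STEPS], series[TRAIN_STEPS:]

# Placeholder series standing in for one of the 162 A-share stocks.
prices = 100.0 + np.cumsum(np.random.randn(2570))
train, test = split_series(prices)
assert train.shape == (2000,) and test.shape == (570,)
```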
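
The Experiment Setup row pins down most of the architecture: GRU modules with hidden size 10, and 2-layer MLP heads whose output splits into a linear mean sublayer and an exponential variance sublayer (4 + 4 units for {μ_z, Σ_z}, 6 + 6 for {μ_x, Σ_x}). Below is a hedged PyTorch sketch of that building block; it is not the authors' code, the class names, input dimensions, dropout placement, and the Tanh hidden activation are assumptions, and the wiring of the generative vs. inference networks (Eq. (16)-(27)) is omitted.

```python
import torch
import torch.nn as nn

class GaussianHead(nn.Module):
    """2-layer MLP whose output splits into equal-sized mean and variance
    sublayers: linear activation for the mean, exp(.) to keep the diagonal
    variance positive (as described in the Experiment Setup row)."""
    def __init__(self, in_dim: int, out_dim: int, hidden: int = 10):
        super().__init__()
        # Hidden nonlinearity is an assumption; the paper does not state it.
        self.hidden = nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh())
        self.mean = nn.Linear(hidden, out_dim)     # mu sublayer (linear)
        self.pre_var = nn.Linear(hidden, out_dim)  # exp(.) -> Sigma diagonal

    def forward(self, h):
        h = self.hidden(h)
        return self.mean(h), torch.exp(self.pre_var(h))

class RNNGaussianBlock(nn.Module):
    """One GRU layer of hidden size 10 feeding a Gaussian head -- the
    building block shared by the generative and inference networks."""
    def __init__(self, in_dim: int, out_dim: int, hidden: int = 10,
                 dropout: float = 0.1):
        super().__init__()
        self.rnn = nn.GRU(in_dim, hidden, batch_first=True)
        self.drop = nn.Dropout(dropout)  # Dropout placement is assumed
        self.head = GaussianHead(hidden, out_dim, hidden)

    def forward(self, x):                 # x: (batch, time, in_dim)
        h, _ = self.rnn(x)
        return self.head(self.drop(h))    # per-step (mu, sigma^2)

# Example dimensions per the row: 4-dim latent z, 6-dim observation head.
z_head = RNNGaussianBlock(in_dim=1, out_dim=4)   # in_dim=1 is an assumption
x_head = RNNGaussianBlock(in_dim=4, out_dim=6)   # fed by sampled z (assumed)
```

For the training recipe described in the row, the L2 penalty maps onto Adam's `weight_decay` argument and the exponential decay onto `torch.optim.lr_scheduler.ExponentialLR`, e.g. `torch.optim.Adam(params, lr=1e-3, weight_decay=1e-4)` with `ExponentialLR(optimizer, gamma=0.99)`; the hyperparameter values here are placeholders, not the paper's.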