Heterogeneous Interactive Snapshot Network for Review-Enhanced Stock Profiling and Recommendation

Authors: Heyuan Wang, Tengjiao Wang, Shun Li, Shijie Guan, Jiayi Zheng, Wei Chen

IJCAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments on English and Chinese datasets spanning various stock exchange markets to verify HISN’s applicability. The cumulative and risk-adjusted returns outperform state-of-the-art methods by over 7.6% and 10.2%, respectively.
Researcher Affiliation | Academia | (1) School of Computer Science, National Engineering Laboratory for Big Data Analysis and Applications, Peking University, China; (2) University of International Relations; (3) Institute of Computational Social Science, Peking University (Qingdao)
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We validate HISN on two real-world stock forecast datasets: US S&P 500 [Xu and Cohen, 2018] contains 109,915 English tweets posted between Jan. 2014 and Jan. 2016... Ashare&HK [Huang et al., 2018] collects 90,361 news headlines from major financial websites in Chinese between Jan. and Dec. 2015.
Dataset Splits | Yes | Samples of the first 19 months are split for training, those of the last 3 months for testing, and the rest for validation in chronological order... Samples are chronologically divided, leaving us with a date range of the first 8 months for training, the last 3 months for testing, and the others for validation. (A chronological-split sketch follows the table.)
Hardware Specification | Yes | Parameters are tuned using the Adam optimizer [Kingma and Ba, 2015] on a GeForce RTX 3090 GPU for 50 epochs; the batch size is 16 and the learning rate is 1e-3.
Software Dependencies | No | The paper mentions software components such as the Adam optimizer, BERT, LDA, and word2vec, but does not provide version numbers for any of them, nor does it identify the programming language or the versions of other key libraries.
Experiment Setup | Yes | We keep the same setting as in previous works [Sawhney et al., 2021b] and leverage a consecutive 5-day lookback trading window (i.e., 5 daily snapshots) to generate each sample. To build heterogeneous document graphs, we set the number of topic nodes |O|, the per-text related topics o, and the similarity threshold between entities δ to {15, 2, 0.5}... The word embeddings used to initialize graph nodes are 300-dimensional. We set L = 2 graph layers and M = 4 attention heads in HAP. The hidden state size of the twins-GRU is 64. The loss weighting factor λ = 4. We apply dropout [Srivastava et al., 2014] with a ratio of 0.3 at the end of each layer to mitigate overfitting. Parameters are tuned using the Adam optimizer [Kingma and Ba, 2015]... for 50 epochs, the batch size is 16 and learning rate is 1e-3. Each experiment is repeated 5 times. (These values are collected in the configuration sketch below.)
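
The chronological splits quoted in the Dataset Splits row amount to partitioning date-sorted samples at two cutoff dates. Below is a minimal Python sketch; the S&P 500 cutoffs shown (19 months training, 3 months validation, 3 months testing over Jan. 2014 to Jan. 2016) are inferred from the quoted proportions, not taken from any released code, and the toy sample list stands in for the real dataset.

    from datetime import date, timedelta

    def chronological_split(samples, train_end, val_end):
        # Partition date-sorted (sample_date, payload) pairs into
        # train/validation/test sets using exclusive cutoff dates.
        train = [s for s in samples if s[0] < train_end]
        val = [s for s in samples if train_end <= s[0] < val_end]
        test = [s for s in samples if s[0] >= val_end]
        return train, val, test

    # Toy stand-in for the real samples: one sample per day over the
    # S&P 500 date range (Jan. 2014 - Jan. 2016), sorted chronologically.
    start = date(2014, 1, 1)
    samples = [(start + timedelta(days=i), None) for i in range(25 * 30)]

    # First 19 months for training, last 3 months for testing, and the
    # 3 months in between for validation (cutoffs inferred from the quote).
    train, val, test = chronological_split(samples,
                                           train_end=date(2015, 8, 1),
                                           val_end=date(2015, 11, 1))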
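
The hyperparameters quoted in the Experiment Setup row can likewise be collected into a single configuration block. This is a sketch for reference only; since the paper releases no code, all variable names are illustrative, and only the numeric values come from the paper.

    # Hyperparameters as reported in the paper; all names are illustrative.
    HISN_CONFIG = {
        "lookback_window": 5,         # consecutive trading days (daily snapshots) per sample
        "num_topic_nodes": 15,        # |O|: topic nodes per heterogeneous document graph
        "topics_per_text": 2,         # o: related topics linked to each text
        "entity_sim_threshold": 0.5,  # delta: similarity threshold between entities
        "word_embedding_dim": 300,    # word embeddings initializing graph nodes
        "num_graph_layers": 2,        # L: graph layers in HAP
        "num_attention_heads": 4,     # M: attention heads in HAP
        "gru_hidden_size": 64,        # hidden state size of the twins-GRU
        "loss_lambda": 4,             # loss weighting factor
        "dropout": 0.3,               # applied at the end of each layer
        "optimizer": "Adam",
        "learning_rate": 1e-3,
        "batch_size": 16,
        "epochs": 50,
        "num_repeats": 5,             # each experiment is repeated 5 times
    }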