Adaptive Long-Short Pattern Transformer for Stock Investment Selection

Authors: Heyuan Wang, Tengjiao Wang, Shun Li, Jiayi Zheng, Shijie Guan, Wei Chen

IJCAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on three exchange market datasets show ALSP-TF's superiority over state-of-the-art stock forecast methods.
Researcher Affiliation | Academia | Heyuan Wang (1,3), Tengjiao Wang (1,3), Shun Li (2), Jiayi Zheng (1,3), Shijie Guan (1,3) and Wei Chen (1,3). 1: School of Computer Science, National Engineering Laboratory for Big Data Analysis and Applications, Peking University, China; 2: University of International Relations; 3: Institute of Computational Social Science, Peking University (Qingdao).
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not mention providing open-source code for the methodology described.
Open Datasets | Yes | We examine ALSP-TF on three real-world datasets from US and Japanese exchange markets. The first dataset [Feng et al., 2019b] comprises 1,026 shares from the fairly volatile US S&P 500 and NASDAQ Composite Indexes; the second dataset [Feng et al., 2019b] targets 1,737 stocks from the NYSE, which is by far the world's largest stock exchange w.r.t. market capitalization of listed companies and is relatively stable compared to NASDAQ; the third dataset [Li et al., 2021] corresponds to the popular TOPIX-100 Index, which includes the 95 stocks with the largest market capitalization on the Tokyo Stock Exchange.
Dataset Splits | Yes | Table 2: Statistics of datasets, reporting the Train Period (Days), Val Period (Days), and Test Period (Days) for each dataset.
Hardware Specification | Yes | We tune the model and ablation variants on a GeForce RTX 3090 GPU.
Software Dependencies | No | The paper mentions PyTorch but does not provide specific version numbers for software dependencies.
Experiment Setup | Yes | We keep ρ = 0.85, 0.85, 0.90 for NASDAQ, NYSE and TSE respectively, and set the hop of the graph convolutional operation to 2. For temporal modeling, we test stacking 1-5 Lit layers with varied skipping rates. The reported results use a 3-layer hierarchy, assigning δ[1:3] to 1, 2, 3 and the number of attention heads H to 6, according to scores on the validation set. The dimension of the hidden feature space df is 16. The loss factors are set to α = 4 and η = 0.5. We tune the model and ablation variants on a GeForce RTX 3090 GPU with the Adam optimizer [Kingma and Ba, 2015] for 100 epochs; the learning rate is 1e-3 and the batch size is 16.
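Since no official code is released, the sketch below shows one way the quoted training configuration could be wired up in PyTorch (Adam, learning rate 1e-3, batch size 16, 100 epochs, hidden dimension df = 16, 3 layers). The placeholder MLP, input shapes, synthetic data, and MSE loss are assumptions for illustration only and do not implement ALSP-TF or its composite loss.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hyperparameter values quoted from the setup above; the model, data shapes,
# and loss below are illustrative placeholders, not the authors' ALSP-TF.
D_F = 16            # hidden feature dimension df
NUM_LAYERS = 3      # 3-layer hierarchy (the paper also uses H = 6 attention heads,
                    # skipping rates delta = 1, 2, 3, and GCN hop = 2, omitted here)
LR = 1e-3           # Adam learning rate
BATCH_SIZE = 16
EPOCHS = 100

LOOKBACK, N_FEATURES = 20, 8   # assumed input window length and per-day feature count

# Placeholder network: a small MLP over the flattened lookback window.
layers, in_dim = [], LOOKBACK * N_FEATURES
for _ in range(NUM_LAYERS):
    layers += [nn.Linear(in_dim, D_F), nn.ReLU()]
    in_dim = D_F
layers.append(nn.Linear(D_F, 1))
model = nn.Sequential(*layers)

# Synthetic stand-in data: 256 samples with scalar return targets.
x = torch.randn(256, LOOKBACK * N_FEATURES)
y = torch.randn(256, 1)
loader = DataLoader(TensorDataset(x, y), batch_size=BATCH_SIZE, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=LR)
loss_fn = nn.MSELoss()  # the paper's composite loss (alpha = 4, eta = 0.5) is not reproduced

for epoch in range(EPOCHS):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
```

Under these assumptions the loop only mirrors the reported optimizer, learning rate, batch size, and epoch count; reproducing the paper's results would additionally require the ALSP-TF architecture, its ranking-aware loss, and the graph construction over stocks.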