Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
CausalStock: Deep End-to-end Causal Discovery for News-driven Multi-stock Movement Prediction
Authors: Shuqi Li, Yuebo Sun, Yuxin Lin, Xin Gao, Shuo Shang, Rui Yan
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiment results show that CausalStock outperforms the strong baselines for both news-driven multi-stock movement prediction and multi-stock movement prediction tasks on six real-world datasets collected from the US, China, Japan, and UK markets. |
| Researcher Affiliation | Academia | 1 Gaoling School of Artificial Intelligence, Renmin University of China 2 Peking University 3 King Abdullah University of Science and Technology 4 University of Electronic Science and Technology of China |
| Pseudocode | No | The paper describes the model and its components using mathematical equations and text, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Answer: [Yes] Justification: We release code and data in GitHub. |
| Open Datasets | Yes | Dataset (Appendix C.1): We train and evaluate our model and baselines on six datasets: ACL18 [42], CMIN-US [23], CMIN-CN [23], KDD17 [45], NI225 [44], and FTSE100 [44]. |
| Dataset Splits | Yes | Table 3: Dataset Description ... ACL18 ... 2014/01/02-2015/08/02 (Train) 2015/08/03-2015/09/30 (Valid) 2015/10/01-2016/01/01 (Test) |
| Hardware Specification | Yes | Our model is implemented with Pytorch on 4 NVIDIA Tesla V100 and optimized by Adam [20]. |
| Software Dependencies | No | The paper mentions "Pytorch" as the implementation framework but does not specify a version number or other software dependencies with version numbers. |
| Experiment Setup | Yes | The learning rate is set as 1e-5 selected from [1e-3, 1e-4, 1e-5, 1e-6]. The time lag L is set as 5 selected from [3, 5, 7, 9]. We select the price encoder hidden size from [4, 8, 16] and get the best performance with size 4. The batch size is set as 32. The scalar weight λ is set to 0.01. For the traditional news encoder, the maximum word number in one piece of news and news number in one day are set to w = 20, l = 10, respectively. The embedding size of word and news are set to dw = 50, dm = 64, respectively. For the Lag-dependent temporal causal discovery module, λs = 1, hv and hu are all 1-layer MLPs. For the FCM part, the neural modules ζi, ℓ and ψ are all 3-layer MLPs with hidden size 332. |
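
The Dataset Splits and Experiment Setup rows above can be gathered into a small configuration sketch for anyone attempting a rerun. This is only an illustration, not the authors' released code: the class name `CausalStockSetup`, the field names, `SEARCH_GRID`, `ACL18_SPLITS`, and the helper `split_for` are all hypothetical. The values are copied from the table, with the learning-rate exponents read as negative (for example, 1e-5).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CausalStockSetup:
    """Hypothetical container; names are illustrative, values come from the table."""
    learning_rate: float = 1e-5          # selected from SEARCH_GRID["learning_rate"]
    time_lag: int = 5                    # L, selected from SEARCH_GRID["time_lag"]
    price_encoder_hidden_size: int = 4   # best of [4, 8, 16]
    batch_size: int = 32
    scalar_weight: float = 0.01          # the scalar weight lambda
    max_words_per_news: int = 20         # w
    max_news_per_day: int = 10           # l
    word_embedding_size: int = 50        # dw
    news_embedding_size: int = 64        # dm


# Candidate values the table reports for the tuned hyperparameters.
SEARCH_GRID = {
    "learning_rate": [1e-3, 1e-4, 1e-5, 1e-6],
    "time_lag": [3, 5, 7, 9],
    "price_encoder_hidden_size": [4, 8, 16],
}

# ACL18 date ranges (inclusive) as listed in the Dataset Splits row.
ACL18_SPLITS = {
    "train": (date(2014, 1, 2), date(2015, 8, 2)),
    "valid": (date(2015, 8, 3), date(2015, 9, 30)),
    "test": (date(2015, 10, 1), date(2016, 1, 1)),
}


def split_for(day: date) -> Optional[str]:
    """Return the ACL18 split containing `day`, or None if it is out of range."""
    for name, (start, end) in ACL18_SPLITS.items():
        if start <= day <= end:
            return name
    return None


setup = CausalStockSetup()
# Every selected value should be one of the reported candidates.
for key, candidates in SEARCH_GRID.items():
    assert getattr(setup, key) in candidates
```

Only the ACL18 ranges are encoded here because the table excerpt elides the other five datasets. The PyTorch version is not reported in the paper, so it is deliberately left out of the sketch.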