PRoFET: Predicting the Risk of Firms from Event Transcripts
Authors: Christoph Kilian Theil, Samuel Broscheit, Heiner Stuckenschmidt
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that our proposed architecture, which models verbal context with an attention mechanism, significantly outperforms the previous state-of-the-art and other strong baselines. Finally, we visualize this attention mechanism on the token-level, thus aiding interpretability and providing a use case of PRoFET as a tool for investment decision support. Model. Our neural architecture, which jointly learns from semantic text representations and a comprehensive set of financial features, significantly outperforms the previous state-of-the-art and other baselines. In an ablation study, we further show that the joint model significantly outperforms models using either of both feature types alone and inspect the performance impact of different document sections. Data. We present a new dataset of 90K earnings call transcripts and address the task of text-based risk prediction at a large scale. Interpretability. The performance increases provided by neural models often come at the cost of interpretability. We address this issue by visualizing the predictive power of contextualized tokens with a heatmap. This demonstrates a use case of PRoFET as a tool for investment decision support. (See the attention-pooling sketch below.) |
| Researcher Affiliation | Academia | Christoph Kilian Theil, Samuel Broscheit and Heiner Stuckenschmidt, Data and Web Science Group, University of Mannheim, Germany {christoph, broscheit, heiner}@informatik.uni-mannheim.de |
| Pseudocode | No | The paper describes the architecture and optimization steps in detail but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our implementation (as elaborated below) can be found online.7 [Footnote 7: https://github.com/samuelbroscheit/neural-profet] |
| Open Datasets | No | We collect 90K earnings call transcripts from the database Thomson Reuters Eikon.² ... Since the transcribed text data is intellectual property of Thomson Reuters, we are legally not allowed to share it in its raw form. However, our word embedding models and the financial data (as defined in Section 4) can be found online.³ |
| Dataset Splits | Yes | To prevent look-ahead bias, we use a temporal 80/10/10 percentage split to divide the 90K instances into separate training, validation, and test sets. The training data spans from Jan. 2002 to Aug. 2015, validation from Aug. 2015 to Nov. 2016, and test from Nov. 2016 to Dec. 2017. (See the temporal-split sketch below.) |
| Hardware Specification | Yes | This research was supported by the NVIDIA Corporation, who donated a Titan X GPU. |
| Software Dependencies | Yes | The Bayesian optimization is implemented with sklearn 0.20.1's Gaussian Process Regressor with RBF kernel and 20 restarts. (See the surrogate-model sketch below.) |
| Experiment Setup | Yes | Optimization: The performance of neural architectures is influenced by a range of hyperparameters. To choose a set of hyperparameters for our FNN, we explore: the number of hidden layers k ∈ {1, 2, 3}, hidden layer sizes n ∈ {128, 256, 512, 1024, 2048}, and whether to use batch normalization for layers l_in = 0, 1 ≤ l_hid < k, and l_out = k. For the BiLSTM, we consider: the number of hidden layers k ∈ {1, 2, 3}, hidden layer sizes n ∈ {50, 100}, learning rate λ ∈ {10⁻¹, 10⁻²}, dropout δ ∈ {0.0, 0.1, ..., 0.5}, weight decay ω ∈ {10⁻⁴, 10⁻⁵, 10⁻⁶}, embedding size d ∈ {100, 200}, and whether the embeddings are adjusted. We train a model for up to 20 epochs with Adagrad [Duchi et al., 2010] and a batch size of 112. (See the search-space and training-loop sketch below.) |
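The Research Type row quotes an attention mechanism over verbal context and a model that jointly learns from text and financial features. The authors' actual architecture lives in the linked repository; the following is only a minimal PyTorch sketch of additive attention pooling over BiLSTM token states, where all names (`AttentionPool`, `JointRiskModel`, `n_fin_features`) are invented here for illustration.

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Additive attention over token states; the weights double as a
    token-level heatmap for interpretability."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, token_states):
        # token_states: (batch, seq_len, hidden_dim)
        weights = torch.softmax(self.score(token_states).squeeze(-1), dim=1)
        # Weighted sum of token states -> one document vector per batch item.
        doc_vector = torch.bmm(weights.unsqueeze(1), token_states).squeeze(1)
        return doc_vector, weights


class JointRiskModel(nn.Module):
    """Concatenates the attended text vector with financial features."""

    def __init__(self, vocab_size, emb_dim, hidden, n_fin_features):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.pool = AttentionPool(2 * hidden)  # BiLSTM doubles the hidden size
        self.out = nn.Linear(2 * hidden + n_fin_features, 1)  # risk score

    def forward(self, tokens, fin_features):
        states, _ = self.lstm(self.emb(tokens))
        doc, weights = self.pool(states)
        return self.out(torch.cat([doc, fin_features], dim=-1)), weights
```

Returning the attention weights alongside the prediction is what makes the token-level heatmap visualization possible.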
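The Dataset Splits row quotes a temporal 80/10/10 split to prevent look-ahead bias. A minimal pandas sketch of such a split, with a synthetic `calls` DataFrame standing in for the 90K-transcript table:

```python
import pandas as pd

# Synthetic stand-in for the transcript table; only `date` matters here.
calls = pd.DataFrame({
    "date": pd.date_range("2002-01-01", "2017-12-31", periods=1000),
    "transcript_id": range(1000),
}).sort_values("date").reset_index(drop=True)

n = len(calls)
train = calls.iloc[: int(0.80 * n)]               # oldest 80% (Jan. 2002 - Aug. 2015)
valid = calls.iloc[int(0.80 * n): int(0.90 * n)]  # next 10% (Aug. 2015 - Nov. 2016)
test  = calls.iloc[int(0.90 * n):]                # newest 10% (Nov. 2016 - Dec. 2017)
```

Sorting by date before slicing is the step that keeps future information out of the training set.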
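The Software Dependencies row names sklearn 0.20.1's Gaussian Process Regressor with an RBF kernel and 20 restarts as the surrogate for Bayesian hyperparameter optimization. A sketch of that surrogate; the observed configurations and the lower-confidence-bound pick are illustrative simplifications, not the paper's exact acquisition strategy:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Surrogate matching the quoted setup: GP, RBF kernel, 20 optimizer restarts.
gp = GaussianProcessRegressor(kernel=RBF(), n_restarts_optimizer=20)

# Hypothetical history of (hyperparameter vector, validation loss) pairs.
X_observed = np.array([[1, 128], [2, 256], [3, 512]], dtype=float)
y_observed = np.array([0.42, 0.38, 0.40])
gp.fit(X_observed, y_observed)

# Score unseen candidate configurations; mean and std feed an acquisition rule.
X_candidates = np.array([[1, 1024], [2, 2048]], dtype=float)
mean, std = gp.predict(X_candidates, return_std=True)
next_config = X_candidates[np.argmin(mean - std)]  # simple lower-confidence bound
```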
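Finally, the Experiment Setup row lists the BiLSTM search space and the optimizer settings (Adagrad, batch size 112, up to 20 epochs). A sketch transcribing that search space and a matching training loop, reusing the hypothetical `JointRiskModel` from the first sketch; the MSE loss is an assumption about the regression objective, not confirmed by the quoted text:

```python
import torch

# BiLSTM search space transcribed from the quoted setup.
space = {
    "hidden_layers": [1, 2, 3],
    "hidden_size":   [50, 100],
    "lr":            [1e-1, 1e-2],
    "dropout":       [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
    "weight_decay":  [1e-4, 1e-5, 1e-6],
    "emb_dim":       [100, 200],
    "finetune_emb":  [True, False],
}

def train(model, loader, cfg, epochs=20):
    """Train with Adagrad as quoted; `loader` yields (tokens, fin, target)
    batches of size 112."""
    opt = torch.optim.Adagrad(model.parameters(), lr=cfg["lr"],
                              weight_decay=cfg["weight_decay"])
    loss_fn = torch.nn.MSELoss()  # assumed regression loss on volatility
    for _ in range(epochs):
        for tokens, fin, target in loader:
            opt.zero_grad()
            pred, _ = model(tokens, fin)
            loss = loss_fn(pred.squeeze(-1), target)
            loss.backward()
            opt.step()
```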