PRoFET: Predicting the Risk of Firms from Event Transcripts

Authors: Christoph Kilian Theil, Samuel Broscheit, Heiner Stuckenschmidt

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that our proposed architecture, which models verbal context with an attention mechanism, significantly outperforms the previous state-of-the-art and other strong baselines. Finally, we visualize this attention mechanism on the token level, thus aiding interpretability and providing a use case of PRoFET as a tool for investment decision support. Model: Our neural architecture, which jointly learns from semantic text representations and a comprehensive set of financial features, significantly outperforms the previous state-of-the-art and other baselines. In an ablation study, we further show that the joint model significantly outperforms models using either of the two feature types alone and inspect the performance impact of different document sections. Data: We present a new dataset of 90K earnings call transcripts and address the task of text-based risk prediction at a large scale. Interpretability: The performance increases provided by neural models often come at the cost of interpretability. We address this issue by visualizing the predictive power of contextualized tokens with a heatmap. This demonstrates a use case of PRoFET as a tool for investment decision support. (A hypothetical sketch of this joint architecture follows the table.)
Researcher Affiliation | Academia | Christoph Kilian Theil, Samuel Broscheit and Heiner Stuckenschmidt, Data and Web Science Group, University of Mannheim, Germany; {christoph, broscheit, heiner}@informatik.uni-mannheim.de
Pseudocode | No | The paper describes the architecture and optimization steps in detail but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our implementation (as elaborated below) can be found online.7 [Footnote 7: https://github.com/samuelbroscheit/neural-profet]
Open Datasets | No | We collect 90K earnings call transcripts from the database Thomson Reuters Eikon.2 ... Since the transcribed text data is intellectual property of Thomson Reuters, we are legally not allowed to share it in its raw form. However, our word embedding models and the financial data (as defined in Section 4) can be found online.3
Dataset Splits | Yes | To prevent look-ahead bias, we use a temporal 80/10/10 percentage split to divide the 90K instances into separate training, validation, and test sets. The training data spans from Jan. 2002 to Aug. 2015, validation from Aug. 2015 to Nov. 2016, and test from Nov. 2016 to Dec. 2017. (A minimal split sketch follows the table.)
Hardware Specification | Yes | This research was supported by the NVIDIA Corporation, who donated a Titan X GPU.
Software Dependencies | Yes | The Bayesian optimization is implemented with sklearn 0.20.1's GaussianProcessRegressor with an RBF kernel and 20 restarts. (The surrogate call is illustrated after the table.)
Experiment Setup | Yes | Optimization: The performance of neural architectures is influenced by a range of hyperparameters. To choose a set of hyperparameters for our FNN, we explore: the number of hidden layers k ∈ {1, 2, 3}, hidden layer sizes n ∈ {128, 256, 512, 1024, 2048}, and whether to use batch normalization for the layers l_in = 0, 1 ≤ l_hid < k, and l_out = k. For the BiLSTM, we consider: the number of hidden layers k ∈ {1, 2, 3}, hidden layer sizes n ∈ {50, 100}, learning rate λ ∈ {10^-1, 10^-2}, dropout δ ∈ {0.0, 0.1, ..., 0.5}, weight decay ω ∈ {10^-4, 10^-5, 10^-6}, embedding size d ∈ {100, 200}, and whether the embeddings are adjusted. We train a model for up to 20 epochs with Adagrad [Duchi et al., 2010] and a batch size of 112. (The search spaces are restated in a sketch after the table.)
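
To make the joint architecture quoted under Research Type concrete, here is a minimal PyTorch sketch of a BiLSTM encoder with attention pooling whose text vector is concatenated with financial features before a feed-forward head. The module names, layer sizes, and pooling details are assumptions for illustration, not the authors' released implementation; the linked repository contains the actual code.

```python
# Hypothetical sketch of a PRoFET-style joint model (not the authors' code):
# a BiLSTM encodes transcript tokens, an attention layer pools them, and the
# pooled text vector is concatenated with financial features before an FNN head.
import torch
import torch.nn as nn


class JointRiskModel(nn.Module):
    def __init__(self, vocab_size, emb_dim=200, hidden=100, n_financial=20):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.attn_score = nn.Linear(2 * hidden, 1)      # one scalar score per token
        self.head = nn.Sequential(                      # FNN on [text ; financial]
            nn.Linear(2 * hidden + n_financial, 256),
            nn.ReLU(),
            nn.Linear(256, 1),                          # regression target: risk proxy
        )

    def forward(self, token_ids, financial):
        states, _ = self.bilstm(self.embed(token_ids))            # (B, T, 2*hidden)
        weights = torch.softmax(self.attn_score(states), dim=1)   # (B, T, 1)
        text_vec = (weights * states).sum(dim=1)                  # attention-pooled text
        pred = self.head(torch.cat([text_vec, financial], dim=-1)).squeeze(-1)
        return pred, weights
```

The attention weights returned alongside the prediction can be mapped back to tokens to produce the kind of heatmap described under Research Type.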
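
The temporal 80/10/10 split quoted under Dataset Splits amounts to sorting by call date and slicing. A minimal sketch, assuming a pandas dataframe with a hypothetical call_date column:

```python
# Minimal sketch of a look-ahead-free temporal split (column name is an assumption).
import pandas as pd

def temporal_split(df: pd.DataFrame, date_col: str = "call_date"):
    df = df.sort_values(date_col).reset_index(drop=True)
    n = len(df)
    train = df.iloc[: int(0.8 * n)]               # oldest 80% of calls
    valid = df.iloc[int(0.8 * n): int(0.9 * n)]   # next 10%
    test = df.iloc[int(0.9 * n):]                 # most recent 10%
    return train, valid, test
```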
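
The surrogate named under Software Dependencies is sklearn's GaussianProcessRegressor with an RBF kernel and 20 optimizer restarts. The sketch below shows that call; the hyperparameter encoding and the candidate-scoring step are illustrative assumptions.

```python
# Gaussian-process surrogate as named in the paper (sklearn 0.20.1 API).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

gp = GaussianProcessRegressor(kernel=RBF(), n_restarts_optimizer=20)

# X: already-evaluated hyperparameter configurations encoded as vectors,
# y: their validation losses; the fitted GP then scores unseen candidates.
X = np.array([[1, 256, 0.1], [2, 512, 0.3]], dtype=float)
y = np.array([0.42, 0.38])
gp.fit(X, y)
mean, std = gp.predict(np.array([[3, 1024, 0.2]]), return_std=True)
```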
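
The hyperparameter search spaces and training settings from the Experiment Setup row can be written down as plain dictionaries. Only the value ranges, Adagrad, the 20-epoch budget, and the batch size of 112 come from the quoted text; the dictionary layout, key names, and optimizer helper are illustrative assumptions.

```python
# Search spaces transcribed from the Experiment Setup row (names are assumptions).
import torch

FNN_SPACE = {
    "hidden_layers": [1, 2, 3],
    "hidden_size": [128, 256, 512, 1024, 2048],
    "batch_norm_in": [True, False],    # layer l_in = 0
    "batch_norm_hid": [True, False],   # layers 1 <= l_hid < k
    "batch_norm_out": [True, False],   # layer l_out = k
}

BILSTM_SPACE = {
    "hidden_layers": [1, 2, 3],
    "hidden_size": [50, 100],
    "learning_rate": [1e-1, 1e-2],
    "dropout": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
    "weight_decay": [1e-4, 1e-5, 1e-6],
    "embedding_size": [100, 200],
    "finetune_embeddings": [True, False],   # "whether the embeddings are adjusted"
}

BATCH_SIZE = 112
MAX_EPOCHS = 20

def make_optimizer(model, lr, weight_decay):
    # Adagrad, as stated in the paper; lr and weight_decay come from the search space.
    return torch.optim.Adagrad(model.parameters(), lr=lr, weight_decay=weight_decay)
```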