Beyond Polarity: Interpretable Financial Sentiment Analysis with Hierarchical Query-driven Attention

Authors: Ling Luo, Xiang Ao, Feiyang Pan, Jin Wang, Tong Zhao, Ningzi Yu, Qing He

IJCAI 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We conduct extensive experiments on a real-world dataset. The results demonstrate that our framework can learn better representations of the document and unearth meaningful clues on replying to different users' preferences. It also outperforms the state-of-the-art methods on sentiment prediction of financial documents." |
| Researcher Affiliation | Collaboration | Ling Luo (1,4), Xiang Ao (1,4), Feiyang Pan (1,4), Jin Wang (2), Tong Zhao (3), Ningzi Yu (3), Qing He (1,4). Affiliations: 1: Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; 2: Computer Science Department, UCLA; 3: Deloitte China; 4: University of Chinese Academy of Sciences |
| Pseudocode | No | The paper describes the model architecture with diagrams and mathematical equations but does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide access to the source code for its methodology. A GitHub link is provided, but it pertains to an existing baseline method (CNN-word), not the authors' own FISHQA model. |
| Open Datasets | No | "Our dataset combines a collection of 30,000 documents, which were extracted from various Chinese mainstream financial websites over 30 days, ranging from May 26 to June 25, 2017. Among them, 7,648 documents were annotated by three domain experts from the perspective of whether the corresponding bonds of the companies mentioned in the document will encounter the risk of default in the future. If the judgment is yes, the document is labeled as negative, otherwise as non-negative. Such manually labeled documents form our experimental set." The paper describes the dataset but provides no access information (link, DOI, specific repository, or formal citation) for it to be publicly available. |
| Dataset Splits | Yes | "We perform 5-fold cross-validation on the experimental dataset for all the methods." |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running the experiments. |
| Software Dependencies | No | The paper mentions using "public NLP tools" and the Adam optimizer, but does not provide specific names or version numbers for any software libraries, programming languages, or other ancillary software dependencies required for replication. |
| Experiment Setup | Yes | "We set the dimension of word embedding as 200 and hidden size of GRU as 100. We optimize the training process using Adam [Kingma and Ba, 2015] with a mini-batch of size 64 and a learning rate 0.001. The number of words for each sentence and sentences for each document are set to be 45 and 30 by grid search, respectively." |
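The Experiment Setup and Dataset Splits rows can be sketched together in code. This is a minimal illustration, not the authors' implementation: the `CONFIG` dict mirrors the hyperparameters quoted from the paper, and `five_fold_splits` is a hypothetical helper showing one way to realize the reported 5-fold cross-validation over the 7,648 labeled documents.

```python
import random

# Hyperparameters as reported in the paper's Experiment Setup.
CONFIG = {
    "word_embedding_dim": 200,
    "gru_hidden_size": 100,
    "optimizer": "Adam",
    "batch_size": 64,
    "learning_rate": 0.001,
    "max_words_per_sentence": 45,      # chosen by grid search
    "max_sentences_per_document": 30,  # chosen by grid search
}

def five_fold_splits(n_docs, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold CV.

    Illustrative helper only; the paper does not publish its
    splitting code or random seed.
    """
    indices = list(range(n_docs))
    random.Random(seed).shuffle(indices)
    fold_size, remainder = divmod(n_docs, n_folds)
    start = 0
    for fold in range(n_folds):
        size = fold_size + (1 if fold < remainder else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        start += size
        yield train, test

# The paper's labeled set contains 7,648 expert-annotated documents:
# every document lands in exactly one test fold across the 5 splits.
for train_idx, test_idx in five_fold_splits(7648):
    assert len(train_idx) + len(test_idx) == 7648
    assert not set(train_idx) & set(test_idx)
```

Because 7,648 is not divisible by 5, the helper distributes the remainder so fold sizes differ by at most one (1,530 or 1,529 test documents per fold).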