Modeling the Intensity Function of Point Process Via Recurrent Neural Networks

Authors: Shuai Xiao, Junchi Yan, Xiaokang Yang, Hongyuan Zha, Stephen Chu

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply our model to the predictive maintenance problem using a log dataset from more than 1,000 ATMs owned by a global bank headquartered in North America. ... We use failure prediction for predictive ATM maintenance as a typical example of event-based point process modeling. ... Table 3 shows the averaged performance among various types of events. ... The confusion matrices for the six subtypes under the error event, as well as for the two main types (ticket and error), are shown in Fig. 4 for the various methods. (A hedged sketch of this evaluation step appears after the table.)
Researcher Affiliation | Collaboration | (1) Shanghai Jiao Tong University; (2) East China Normal University; (3) IBM Research China; (4) Georgia Tech
Pseudocode | No | No pseudocode or algorithm blocks are explicitly labeled or formatted as such in the paper.
Open Source Code | No | The paper mentions 'The code is based on Theano running on a Linux server...', but it does not provide any concrete access (link or explicit release statement) to the source code for the methodology described in the paper.
Open Datasets | No | The studied dataset comprises event logs of error reports and failure tickets, originally collected from a large number of ATMs owned by an anonymous global bank headquartered in North America.
Dataset Splits | No | The training data consists of 1085 ATMs and the testing data of 469 ATMs, in total 1554 Wincor ATMs covering 5 ATM machine models: ProCash 2100 RL (980, 430), 1500 RL (19, 5), 2100 FL (53, 21), 1500 FL (26, 10), and 2250XE RL (7, 3), where the numbers in brackets give the number of machines for training and testing. There is no explicit mention of a validation set or its split. (The per-model counts are tallied in a sketch after the table.)
Hardware Specification | Yes | The code is based on Theano running on a Linux server with 32 GB of memory and 2 CPUs with 6 cores each (Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz). Four GeForce GTX TITAN X GPUs are also used for acceleration.
Software Dependencies | No | The paper states 'The code is based on Theano' but does not provide specific version numbers for Theano or any other software dependencies.
Experiment Setup | Yes | We use a single-layer LSTM of size 32 with sigmoid gate activations and tanh activation for the hidden representation. The embedding layer is fully connected, uses tanh activation, and outputs a 16-dimensional vector. ... For the time series RNN, we set the length of each sub-window (i.e., the evenly spaced time interval) to 7 days and the number of sub-windows to 5, so the observation length is 35 days for the time series. For event dependency, the event sequence can be arbitrarily long; here we take it to be 7. ... We adopt RMSprop gradients (Dauphin et al. 2015), which have been shown to work well for training deep networks, to learn these parameters. (A minimal architecture sketch follows the table.)
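
For the confusion-matrix evaluation referenced in the Research Type row, the sketch below shows how per-subtype confusion matrices like those in Fig. 4 could be computed. The six subtype labels, the scikit-learn dependency, and the placeholder predictions are assumptions made for illustration, not the authors' pipeline.

```python
# A hedged sketch of producing Fig. 4-style confusion matrices;
# labels and predictions here are placeholders, not the paper's data.
import numpy as np
from sklearn.metrics import confusion_matrix

SUBTYPES = [f"error_{i}" for i in range(6)]   # assumption: six error subtypes
MAIN_TYPES = ["ticket", "error"]              # two main event types

rng = np.random.default_rng(0)
y_true = rng.choice(SUBTYPES, size=200)       # placeholder ground truth
y_pred = rng.choice(SUBTYPES, size=200)       # placeholder model output

# Rows index the true subtype, columns the predicted subtype.
cm = confusion_matrix(y_true, y_pred, labels=SUBTYPES)
print(cm)
```

The same call with `labels=MAIN_TYPES` would yield the 2x2 matrix for the ticket/error main types.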
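
The Dataset Splits row quotes per-model train/test counts; the minimal sketch below simply tallies them, confirming 1085 training and 469 testing machines (1554 in total). The dictionary structure and key spellings are illustrative only.

```python
# Per-model (train, test) machine counts as quoted in the paper.
splits = {
    "ProCash 2100 RL": (980, 430),
    "ProCash 1500 RL": (19, 5),
    "ProCash 2100 FL": (53, 21),
    "ProCash 1500 FL": (26, 10),
    "ProCash 2250XE RL": (7, 3),
}

train = sum(n_train for n_train, _ in splits.values())
test = sum(n_test for _, n_test in splits.values())
assert (train, test) == (1085, 469)
print(f"train={train}, test={test}, total={train + test}")  # total = 1554
```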
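
The Experiment Setup row pins down layer sizes and activations. Below is a minimal sketch of that configuration in Keras; the paper's code was written directly in Theano, so everything here (the two-branch wiring, the feature dimension, the event vocabulary size, and the softmax output head) is an assumption used only to make the reported hyperparameters concrete, not the authors' implementation.

```python
# A minimal sketch of the reported setup, assuming a Keras-style API.
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_EVENT_TYPES = 7   # assumption: placeholder event vocabulary size
FEATURE_DIM = 4       # assumption: per-sub-window feature dimension

# Time-series branch: 5 sub-windows of 7 days each (35-day observation).
ts_in = layers.Input(shape=(5, FEATURE_DIM), name="time_series")
ts_h = layers.LSTM(32, activation="tanh",
                   recurrent_activation="sigmoid")(ts_in)

# Event branch: sequences of length 7, each event embedded to a
# 16-dimensional vector via a fully connected tanh layer, as reported.
ev_in = layers.Input(shape=(7, NUM_EVENT_TYPES), name="event_sequence")
ev_emb = layers.TimeDistributed(
    layers.Dense(16, activation="tanh"))(ev_in)
ev_h = layers.LSTM(32, activation="tanh",
                   recurrent_activation="sigmoid")(ev_emb)

# Merge the two hidden representations; the output head is an
# assumption (the paper predicts event type and timing).
merged = layers.Concatenate()([ts_h, ev_h])
out = layers.Dense(NUM_EVENT_TYPES, activation="softmax")(merged)

model = Model([ts_in, ev_in], out)
model.compile(optimizer=tf.keras.optimizers.RMSprop(),
              loss="categorical_crossentropy")
model.summary()
```

RMSprop matches the optimizer named in the paper; the loss and merge strategy are illustrative choices.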