Continuous-Time Attention for Sequential Learning
Authors: Jen-Tzung Chien, Yi-Hsiang Chen
AAAI 2021, pp. 7116-7124
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiments on irregular sequence samples from human activities, dialogue sentences and medical features show the merits of the proposed continuous-time attention for activity recognition, sentiment classification and mortality prediction, respectively. |
| Researcher Affiliation | Academia | Jen-Tzung Chien, Yi-Hsiang Chen, Department of Electrical and Computer Engineering, National Chiao Tung University, Hsinchu, Taiwan, {jtchien, ethernet420.eed08g}@nctu.edu.tw |
| Pseudocode | Yes | Algorithm 1: Attentive neural differential equation |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | The Human Activity dataset (Kaluza et al. 2010) was used for an action recognition task; the Multimodal EmotionLines Dataset (MELD) (Poria et al. 2019) contained dialogue instances collected from the Friends TV series; PhysioNet (Silva et al. 2012) was collected from the intensive care unit (ICU). |
| Dataset Splits | No | The paper does not explicitly state specific training, validation, and test dataset splits or percentages, nor does it refer to predefined splits with citations for reproducibility. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments, such as GPU models, CPU types, or cloud instance specifications. |
| Software Dependencies | No | The paper mentions software components like 'Adamax' and 'GloVe embeddings' but does not provide specific version numbers for these or other key software dependencies. |
| Experiment Setup | Yes | The number of training epochs was 200. The learning rate was initialized to 0.01 and decayed after each iteration by multiplying by 0.999. Adamax (Kingma and Ba 2014) was used. The hidden state size was 15. Relative and absolute tolerances for the solver were 1e-3 and 1e-4, respectively. A six-layer fully-connected network was configured as the ODE function. A one-layer GRU was used as the RNN cell. The classifier was built from a three-layer fully-connected network. |
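
The Experiment Setup row above is concrete enough to sketch in code. Below is a minimal, hedged reconstruction of that setup, assuming PyTorch and the `torchdiffeq` package (neither is named in the paper), and using a generic ODE-RNN skeleton rather than the paper's Algorithm 1, whose continuous-time attention mechanism is not detailed in this report. The hidden size (15), solver tolerances (1e-3/1e-4), Adamax with learning rate 0.01 and 0.999 multiplicative decay, six-layer ODE function, one-layer GRU cell, and three-layer classifier come from the table; the layer widths, input dimension, and class count are illustrative placeholders.

```python
# Sketch of the reported setup (not the authors' code); assumes PyTorch + torchdiffeq.
import torch
import torch.nn as nn
from torchdiffeq import odeint

HIDDEN = 15  # hidden state size reported in the paper


class ODEFunc(nn.Module):
    """Six-layer fully-connected network used as the ODE dynamics f(t, h).
    The intermediate width (50) is an assumption, not stated in the paper."""
    def __init__(self, dim=HIDDEN, width=50):
        super().__init__()
        layers = [nn.Linear(dim, width), nn.Tanh()]
        for _ in range(4):
            layers += [nn.Linear(width, width), nn.Tanh()]
        layers += [nn.Linear(width, dim)]  # six Linear layers in total
        self.net = nn.Sequential(*layers)

    def forward(self, t, h):
        return self.net(h)


class ODERNNClassifier(nn.Module):
    """ODE-RNN skeleton: evolve h between observations with the ODE solver,
    update h at each observation with a one-layer GRU cell, then classify."""
    def __init__(self, input_dim, n_classes, dim=HIDDEN):
        super().__init__()
        self.odefunc = ODEFunc(dim)
        self.gru = nn.GRUCell(input_dim, dim)
        self.classifier = nn.Sequential(  # three-layer fully-connected classifier
            nn.Linear(dim, 50), nn.ReLU(),
            nn.Linear(50, 50), nn.ReLU(),
            nn.Linear(50, n_classes))

    def forward(self, x, times):
        # x: (batch, seq_len, input_dim); times: 1-D tensor of (possibly
        # irregular) increasing timestamps, one per observation.
        h = x.new_zeros(x.size(0), HIDDEN)
        t_prev = times[0]
        for i in range(x.size(1)):
            if times[i] > t_prev:
                # Solver tolerances as reported: rtol=1e-3, atol=1e-4.
                span = torch.stack([t_prev, times[i]])
                h = odeint(self.odefunc, h, span, rtol=1e-3, atol=1e-4)[-1]
            h = self.gru(x[:, i], h)  # observation update via the GRU cell
            t_prev = times[i]
        return self.classifier(h)


model = ODERNNClassifier(input_dim=12, n_classes=7)  # dims are placeholders
optimizer = torch.optim.Adamax(model.parameters(), lr=0.01)
# "decayed after each iteration by multiplying by 0.999":
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.999)

# Example usage with dummy data (shapes are arbitrary):
x = torch.randn(4, 10, 12)            # 4 sequences, 10 steps, 12 features
times = torch.linspace(0.0, 1.0, 10)  # irregular timestamps would also work
logits = model(x, times)
```

Calling `scheduler.step()` once per training iteration reproduces the stated per-iteration 0.999 decay; stepping it per epoch instead would be a different (slower) schedule.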