Message Passing Attention Networks for Document Understanding

Authors: Giannis Nikolentzos, Antoine Tixier, Michalis Vazirgiannis8544-8551

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments conducted on 10 standard text classification datasets show that our architectures are competitive with the state-of-the-art. Ablation studies reveal further insights about the impact of the different components on performance.
Researcher Affiliation Academia Giannis Nikolentzos,1 Antoine J.-P. Tixier,1 Michalis Vazirgiannis1,2 1 Ecole Polytechnique 2Athens University of Economics and Business
Pseudocode No The paper describes methods using mathematical equations (e.g., Eq. 1, 2, 4, 5, 7, 8) and prose, but does not include structured pseudocode or algorithm blocks.
Open Source Code Yes Code is publicly available at: https://github.com/giannisnik/mpad.
Open Datasets Yes We evaluate the quality of the document embeddings learned by MPAD on 10 document classification datasets... (1) Reuters contains stories from the Reuters news agency... (2) BBCSport (Greene and Cunningham 2006) contains sports news articles...
Dataset Splits Yes When cross-validation is used (see 3rd column of Table 1), we construct a validation set by randomly sampling 10% of the training set of each fold.
Hardware Specification Yes All experiments were run on a single machine consisting of a 3.4 GHz Intel Core i7 CPU with 16 GB of RAM and an NVidia Ge Force Titan Xp GPU.
Software Dependencies No MPAD was implemented in Python 3.6 using the Py Torch library.
Experiment Setup Yes We use two MP iterations (T=2) for the basic MPAD... We set d to 64, except on IMDB and Yelp on which d = 128, and use a two-layer MLP... minimizing the crossentropy loss function with the Adam optimizer... initial learning rate of 0.001... batch normalization... dropout (Srivastava et al. 2014) with a rate of 0.5. We select the best epoch, capped at 200...