Explainable and Discourse Topic-aware Neural Language Understanding

Authors: Yatin Chaudhary, Hinrich Schuetze, Pankaj Gupta

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments over a range of tasks such as language modeling, word sense disambiguation, document classification, retrieval and text generation demonstrate the ability of the proposed model to improve language understanding.
Researcher Affiliation | Collaboration | 1 Corporate Technology, Machine Intelligence (MIC-DE), Siemens AG, Munich, Germany; 2 CIS, University of Munich (LMU), Munich, Germany.
Pseudocode | Yes | Algorithm 1: Computation of combined loss L; Algorithm 2: Utility functions.
Open Source Code | Yes | Implementation of NCLM is available at: https://github.com/YatinChaudhary/NCLM
Open Datasets | Yes | We present experimental results of language modeling using our proposed models on the APNEWS, IMDB and BNC datasets (Lau et al., 2017). We use three labeled datasets: 20Newsgroups (20NS), Reuters (R21578) and IMDB movie reviews (IMDB) (see supplementary for data statistics).
Dataset Splits | No | For data statistics and the time complexity of experiments, refer to the supplementary. Experimental setup: we follow Wang et al. (2018) for our experimental setup. See supplementary for detailed hyperparameter settings.
Hardware Specification | No | No explicit hardware specifications (e.g., specific GPU/CPU models, memory details) used for running the experiments were provided in the main text.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., libraries, frameworks, programming language versions) were provided in the main text.
Experiment Setup | Yes | We fix the NLM sequence length to 30; longer sentences are split into multiple sequences of length less than 30. We initialize the input word embeddings for NLM with 300-dimensional pretrained embeddings extracted from a word2vec (Mikolov et al., 2013) model trained on Google News. Models are trained using a learning rate of 1e-3 and a batch size of 32.
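The quoted setup can be condensed into a small configuration sketch. The Python snippet below records only the hyperparameters stated in the Experiment Setup row (sequence length, embedding dimensionality and source, learning rate, batch size); the variable and key names (e.g., nclm_config, max_seq_len) are illustrative assumptions and are not taken from the NCLM repository.

```python
# Hypothetical configuration sketch for the reported NCLM training setup.
# Only values quoted in the "Experiment Setup" row are included; all key
# names are illustrative and not taken from the NCLM code base.
nclm_config = {
    "max_seq_len": 30,       # NLM sequence length; longer sentences are split
    "embedding_dim": 300,    # pretrained word2vec embeddings (Google News)
    "embedding_source": "word2vec (Mikolov et al., 2013), Google News",
    "learning_rate": 1e-3,
    "batch_size": 32,
}

if __name__ == "__main__":
    for key, value in nclm_config.items():
        print(f"{key}: {value}")
```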