Learning Topic Models by Neighborhood Aggregation

Authors: Ryohei Hisano

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments, we show that our approach outperforms the state-of-the-art supervised Latent Dirichlet Allocation implementation in terms of held-out document classification tasks. We conduct experiments showing the validity of our approach. We use three datasets in our experiments.
Researcher Affiliation | Academia | Ryohei Hisano, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan, em072010@yahoo.co.jp
Pseudocode | No | The paper describes equations and procedures in text but does not include any formal pseudocode blocks or algorithms.
Open Source Code | No | The paper does not provide an explicit statement or link to its own open-source code for the described methodology.
Open Datasets | Yes | The economic watcher survey, in the table abbreviated as EWS, is a dataset provided by the Cabinet Office of Japan. The whole dataset is available at http://www5.cao.go.jp/keizai3/watcher_index.html. Amazon review data are a dataset of gathered ratings and review information [McAuley et al., 2015]... The whole dataset is available at http://jmcauley.ucsd.edu/data/amazon/. Subjectivity data are a dataset provided by [Pang and Lee, 2004]... The whole dataset is available at http://ws.cs.cornell.edu/people/pabo/movie-review-data.
Dataset Splits | Yes | We randomly sample 5000 records for training, development, and testing. Parameters (e.g., the number of hidden units) of these models was found by utilizing the development dataset. We focus on snippets that have more than nine words and sample 1000 snippets each for training, development and testing.
Hardware Specification | No | The paper does not specify any hardware details such as GPU/CPU models, memory, or specific computing environments used for the experiments.
Software Dependencies | No | The paper mentions software like 'word2vec vectors' and 'standard morphological analysis software' but does not provide specific version numbers for these or any other key software components.
Experiment Setup | Yes | For the regularization parameter governing WS and WC, we set it to 0.001, and for the output function, we set a dropout probability of 0.5 for regularization. We also set η in our model to be 0.2 for the economic watcher survey and 0.05 for the rest, and set the number of hidden units in Eq.(7) to be H1 = 50 and H2 = 50. We also fix the number of topics to 20 for all experiments performed in this section.
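
Since the paper releases no code, the quoted setup and split procedure can only be sketched. The Python below is an illustrative reading of the Experiment Setup and Dataset Splits rows, not the author's implementation. All names (`ExperimentConfig`, `config_for`, `three_way_split`, the dataset key `"ews"`) are hypothetical. It also assumes that the three splits are disjoint, that the quoted sample size applies to each split, and that words are counted by whitespace tokenization; the source confirms none of these.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    # Values quoted in the Experiment Setup row; field names are hypothetical.
    weight_regularization: float = 0.001  # regularization on WS and WC
    output_dropout: float = 0.5           # dropout probability on the output function
    num_hidden_1: int = 50                # H1 in Eq.(7)
    num_hidden_2: int = 50                # H2 in Eq.(7)
    num_topics: int = 20                  # fixed for all experiments
    eta: float = 0.05                     # 0.05 unless overridden per dataset


def config_for(dataset: str) -> ExperimentConfig:
    """Return the quoted settings; 'ews' is an assumed key for the economic watcher survey."""
    eta = 0.2 if dataset == "ews" else 0.05
    return ExperimentConfig(eta=eta)


def three_way_split(records, n_per_split, seed=0, more_than_words=None):
    """Draw disjoint train/dev/test samples of n_per_split records each.

    If more_than_words is given, keep only records with strictly more
    whitespace-separated tokens than that threshold before sampling.
    """
    pool = list(records)
    if more_than_words is not None:
        pool = [r for r in pool if len(r.split()) > more_than_words]
    if len(pool) < 3 * n_per_split:
        raise ValueError("not enough records for three disjoint splits")
    rng = random.Random(seed)
    # Sampling without replacement keeps the three splits disjoint by position.
    sample = rng.sample(pool, 3 * n_per_split)
    train = sample[:n_per_split]
    dev = sample[n_per_split:2 * n_per_split]
    test = sample[2 * n_per_split:]
    return train, dev, test
```

Under these assumptions, the subjectivity setting would correspond to `three_way_split(snippets, 1000, more_than_words=9)`, where `snippets` stands for the loaded text, and the other quoted sampling step to `three_way_split(records, 5000)`.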