Neural Topic Model via Optimal Transport
Authors: He Zhao, Dinh Phung, Viet Huynh, Trung Le, Wray Buntine
ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that our framework significantly outperforms the state-of-the-art NTMs on discovering more coherent and diverse topics and deriving better document representations for both regular and short texts. |
| Researcher Affiliation | Academia | He Zhao, Dinh Phung, Viet Huynh, Trung Le, Wray Buntine; Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Monash University, Australia; {ethan.zhao,dinh.phung,viet.huynh,trunglm,wray.buntine}@monash.edu |
| Pseudocode | Yes | Algorithm 1: Training algorithm for NSTM. (A generic Sinkhorn illustration is sketched below the table.) |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code, nor does it include a link to a code repository or mention code in supplementary materials. |
| Open Datasets | Yes | Our experiments are conducted on five widely-used benchmark text datasets of varying size, including 20 News Groups (20NG), Web Snippets (WS) (Phan et al., 2008), Tag My News (TMN) (Vitale et al., 2012), Reuters (extracted from the Reuters-21578 dataset), and Reuters Corpus Volume 2 (RCV2) (Lewis et al., 2004). |
| Dataset Splits | No | The paper states 'With the default training/testing splits of the datasets', but it does not provide specific percentages, sample counts, or explicit details for training, validation, or test sets. |
| Hardware Specification | Yes | The three models run on a Titan RTX GPU with batch size 1,000. |
| Software Dependencies | No | The paper states 'NSTM is implemented on TensorFlow' but does not provide a specific version number for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | For the encoder θ, to keep simplicity, we use a fully-connected neural network with one hidden layer of 200 units and ReLU as the activation function, followed by a dropout layer (rate=0.75) and a batch norm layer... In all the experiments, we fix α = 20 and ϵ = 0.07... The optimisation of NSTM is done by Adam (Kingma & Ba, 2015) with learning rate 0.001 and batch size 200 for maximally 50 iterations. (A hedged configuration sketch based on this description follows the table.) |
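As a point of reference for the Experiment Setup row, the following is a minimal sketch of the described encoder configuration in tf.keras. Only the hidden layer size, activation, dropout rate, batch norm, and Adam learning rate are taken from the quoted text; the `num_topics` output dimension and the final softmax over topic proportions are illustrative assumptions, not details confirmed by the paper.

```python
import tensorflow as tf

# Hedged sketch of the encoder described in the Experiment Setup row:
# one hidden layer of 200 units with ReLU, dropout (rate=0.75), batch norm.
# `num_topics` and the final softmax layer are illustrative assumptions.
def build_encoder(vocab_size, num_topics):
    return tf.keras.Sequential([
        tf.keras.Input(shape=(vocab_size,)),
        tf.keras.layers.Dense(200, activation="relu"),
        tf.keras.layers.Dropout(rate=0.75),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Dense(num_topics, activation="softmax"),
    ])

# Optimiser setting quoted in the row: Adam with learning rate 0.001.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
```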
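The Pseudocode row points to Algorithm 1 (the training algorithm for NSTM), which is not reproduced here. As a rough, generic illustration only: entropically regularised optimal transport of the kind the paper's title refers to is commonly solved with Sinkhorn iterations, sketched below in NumPy. The marginals, cost matrix, and iteration count are placeholders, and this is not the authors' implementation; only the regularisation value 0.07 echoes the ϵ quoted in the Experiment Setup row.

```python
import numpy as np

def sinkhorn_plan(a, b, M, eps=0.07, n_iters=50):
    """Generic entropic-OT Sinkhorn iterations (not the paper's code).

    a: source marginal, e.g. a document's normalised word counts, shape (n,)
    b: target marginal, e.g. a topic-mixture reconstruction, shape (m,)
    M: cost matrix between the two supports, shape (n, m)
    eps: entropic regularisation weight (0.07 matches the value quoted above)
    """
    K = np.exp(-M / eps)              # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u + 1e-12)     # scale columns to match marginal b
        u = a / (K @ v + 1e-12)       # scale rows to match marginal a
    P = u[:, None] * K * v[None, :]   # transport plan diag(u) K diag(v)
    return P, np.sum(P * M)           # plan and its transport cost <P, M>
```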