InfoCTM: A Mutual Information Maximization Perspective of Cross-Lingual Topic Modeling
Authors: Xiaobao Wu, Xinshuai Dong, Thong Nguyen, Chaoqun Liu, Liang-Ming Pan, Anh Tuan Luu
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on English, Chinese, and Japanese datasets demonstrate that our method outperforms state-of-the-art baselines, producing more coherent, diverse, and well-aligned topics and showing better transferability for cross-lingual classification tasks. |
| Researcher Affiliation | Collaboration | Xiaobao Wu (1), Xinshuai Dong (2), Thong Nguyen (3), Chaoqun Liu (1,4), Liang-Ming Pan (3), Anh Tuan Luu (1). (1) Nanyang Technological University, Singapore; (2) Carnegie Mellon University, USA; (3) National University of Singapore, Singapore; (4) DAMO Academy, Alibaba Group, Singapore |
| Pseudocode | No | The paper describes its methods in text and uses mathematical formulations but does not include any explicit pseudocode blocks or algorithms. |
| Open Source Code | Yes | Our code is available at https://github.com/bobxwu/InfoCTM. |
| Open Datasets | Yes | We use the following benchmark datasets in our experiments: EC News is a collection of English and Chinese news (Wu et al. 2020a)... Amazon Review includes English and Chinese reviews from the Amazon website... Rakuten Amazon contains Japanese reviews from Rakuten (a Japanese online shopping website, Zhang and LeCun 2017), and English reviews from Amazon (Yuan, Van Durme, and Ying 2018). |
| Dataset Splits | No | The paper mentions training and testing classifiers ('we train and test the classifier on the same language', 'train the classifier on one language and test it on another') but does not give explicit train/validation/test splits (e.g., percentages, sample counts, or specific split files); a minimal sketch of this transfer protocol appears after the table. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU/GPU models, memory, or cloud service specifications). |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | No | The paper mentions hyperparameters such as τ (the temperature) and λ_TAMI (the weight of the mutual-information term) but does not report their values or other concrete setup details such as learning rate, batch size, number of epochs, or optimizer settings; an illustrative sketch of how these two hyperparameters typically enter the training objective follows the table. |
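To make the two reported hyperparameters concrete, below is a minimal PyTorch sketch of an InfoNCE-style contrastive mutual-information objective with temperature τ, added to a base topic-model loss with weight λ_TAMI. This is a generic illustration, not the authors' released implementation; the function names, default values, and the choice of InfoNCE estimator are all assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(anchor, positive, temperature=0.2):
    """Temperature-scaled InfoNCE lower bound on mutual information
    between paired embeddings (e.g., topic representations of linked
    cross-lingual words). Illustrative, not the paper's exact TAMI loss."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    # Score every anchor against every candidate; the matching index is
    # the positive pair, and all other rows act as in-batch negatives.
    logits = anchor @ positive.t() / temperature
    targets = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, targets)

def total_loss(topic_model_loss, anchor, positive, lambda_tami=1.0, tau=0.2):
    # Overall objective: base topic-model loss plus the weighted
    # mutual-information term, mirroring the lambda_TAMI weight that the
    # paper leaves unspecified.
    return topic_model_loss + lambda_tami * info_nce_loss(anchor, positive, tau)
```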
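The cross-lingual classification evaluation quoted in the Dataset Splits row can likewise be sketched: train a classifier on one language's document-topic distributions and test it on another's. The logistic-regression classifier and the function below are hypothetical; the paper does not specify the classifier or its settings.

```python
from sklearn.linear_model import LogisticRegression

def transfer_accuracy(theta_src, y_src, theta_tgt, y_tgt):
    """Train on source-language doc-topic distributions (n_docs x n_topics
    arrays from the fitted topic model), evaluate on the target language.
    Classifier choice is an assumption, not taken from the paper."""
    clf = LogisticRegression(max_iter=1000)
    clf.fit(theta_src, y_src)
    return clf.score(theta_tgt, y_tgt)
```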