InfoCTM: A Mutual Information Maximization Perspective of Cross-Lingual Topic Modeling

Authors: Xiaobao Wu, Xinshuai Dong, Thong Nguyen, Chaoqun Liu, Liang-Ming Pan, Anh Tuan Luu

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on English, Chinese, and Japanese datasets demonstrate that our method outperforms state-of-the-art baselines, producing more coherent, diverse, and well-aligned topics and showing better transferability for cross-lingual classification tasks.
Researcher Affiliation | Collaboration | Xiaobao Wu (1), Xinshuai Dong (2), Thong Nguyen (3), Chaoqun Liu (1,4), Liang-Ming Pan (3), Anh Tuan Luu (1). (1) Nanyang Technological University, Singapore; (2) Carnegie Mellon University, USA; (3) National University of Singapore, Singapore; (4) DAMO Academy, Alibaba Group, Singapore.
Pseudocode | No | The paper describes its methods in text and uses mathematical formulations but does not include any explicit pseudocode blocks or algorithms.
Open Source Code | Yes | Our code is available at https://github.com/bobxwu/InfoCTM.
Open Datasets | Yes | We use the following benchmark datasets in our experiments: EC News is a collection of English and Chinese news (Wu et al. 2020a)... Amazon Review includes English and Chinese reviews from the Amazon website... Rakuten Amazon contains Japanese reviews from Rakuten (a Japanese online shopping website, Zhang and LeCun 2017), and English reviews from Amazon (Yuan, Van Durme, and Ying 2018).
Dataset Splits | No | The paper mentions training and testing classifiers ('we train and test the classifier on the same language', 'train the classifier on one language and test it on another') but does not provide explicit details on the train, validation, and test splits (e.g., percentages, sample counts, or specific split files); a hedged sketch of this evaluation protocol appears after the table.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU/GPU models, memory, or cloud service specifications).
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | No | The paper mentions hyperparameters such as τ (a temperature hyperparameter) and λ_TAMI (a weight on the topic-alignment mutual information term) but does not report their values or other concrete setup details such as learning rate, batch size, number of epochs, or optimizer settings; an illustrative sketch of where these hyperparameters would enter the objective follows the table.
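
To make the Experiment Setup row concrete, the following is a minimal sketch of an InfoNCE-style mutual-information lower bound with a temperature τ and a weight λ_TAMI, written in PyTorch. It is not the authors' released implementation; the function name, the use of cosine similarity, and the default values shown are assumptions introduced purely for illustration, since the paper does not report the actual values.

```python
import torch
import torch.nn.functional as F

def infonce_alignment(anchor, positive, temperature=0.07):
    """Illustrative InfoNCE-style lower bound on the mutual information
    between paired cross-lingual topic/word representations.
    anchor, positive: (N, d) tensors whose i-th rows are translation pairs.
    NOTE: temperature value is an assumption, not taken from the paper."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.t() / temperature        # (N, N) cosine similarities
    targets = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, targets)             # -E[log p(positive | anchor)]

# Hypothetical overall objective (names and weighting are assumptions):
# loss = reconstruction_loss + lambda_tami * infonce_alignment(topic_emb_en, topic_emb_zh, temperature=tau)
```

In such a setup, τ controls how sharply the softmax concentrates on the true translation pair and λ_TAMI balances the alignment term against the topic model's reconstruction objective, which is why the report flags their missing values as a reproducibility gap.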
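
Likewise, for the Dataset Splits row, below is a hedged sketch of the intra-lingual versus cross-lingual classification protocol quoted from the paper ("train the classifier on one language and test it on another"). The choice of logistic regression, macro-F1, and the variable names are assumptions for illustration; the paper does not specify the classifier, metric, or split sizes.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

def transfer_eval(theta_train, y_train, theta_test, y_test):
    """Train a classifier on doc-topic distributions of one language and
    evaluate it on another language (cross-lingual) or the same language
    (intra-lingual). Classifier and metric choices are illustrative only."""
    clf = LogisticRegression(max_iter=1000).fit(theta_train, y_train)
    return f1_score(y_test, clf.predict(theta_test), average="macro")

# Intra-lingual: transfer_eval(theta_en_train, y_en_train, theta_en_test, y_en_test)
# Cross-lingual: transfer_eval(theta_en_train, y_en_train, theta_zh_test, y_zh_test)
```

Without published split percentages or sample counts, results from such a protocol cannot be reproduced exactly, which is the basis for the "No" verdict in that row.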