Manifold Learning for Jointly Modeling Topic and Visualization
Authors: Tuan Le, Hady Lauw
AAAI 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments on several real-life text datasets of news articles and web pages show that SEMAFORE significantly outperforms the state-of-the-art baselines on objective evaluation metrics. |
| Researcher Affiliation | Academia | Tuan M. V. Le and Hady W. Lauw, School of Information Systems, Singapore Management University, 80 Stamford Road, Singapore 178902; {vmtle.2012@phdis.smu.edu.sg, hadywlauw@smu.edu.sg} |
| Pseudocode | No | The paper describes the generative process and model fitting steps using textual descriptions and mathematical equations, but it does not include a structured pseudocode block or algorithm. |
| Open Source Code | No | The paper does not contain any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We use three real-life, publicly available datasets¹ for evaluation. 20News contains newsgroup articles (in English) from 20 classes. Reuters8 contains newswire articles (in English) from 8 classes. Cade12 contains web pages (in Brazilian Portuguese) classified into 12 classes. These are benchmark datasets frequently used for document classification. ¹ http://web.ist.utl.pt/acardoso/datasets/ |
| Dataset Splits | No | The paper mentions generating five samples for each dataset and using a sixth sample as a test set, but it does not provide specific details on training/validation splits within these samples (e.g., percentages or counts for a dedicated validation set). |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware used to conduct the experiments. |
| Software Dependencies | No | The paper mentions using a 'quasi-Newton' method for optimization and setting hyperparameters, but it does not list any specific software libraries, frameworks, or their version numbers. |
| Experiment Setup | Yes | We set the hyper-parameters to α = 0.01, β = 0.1N and γ = 0.1Z following (Iwata, Yamada, and Ueda 2008). When unvaried, the defaults are number of topics Z = 20, neighborhood size k = 10, and regularization R with λ = 1. |
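
The default settings quoted in the Experiment Setup row can be collected into a small configuration object. The following is a minimal sketch, not code from the paper: the class name `SemaforeConfig` and its field names are hypothetical, N is assumed here to be the number of documents (the row above does not define it), and the value `num_docs=1000` is purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemaforeConfig:
    """Hypothetical container for the defaults quoted in the table."""

    num_docs: int                 # N; assumed to be the number of documents
    num_topics: int = 20          # Z, default number of topics
    neighborhood_size: int = 10   # k, default neighborhood size
    reg_lambda: float = 1.0       # lambda, weight of the regularization R
    alpha: float = 0.01           # alpha, fixed hyper-parameter

    @property
    def beta(self) -> float:
        # beta = 0.1 * N, as quoted in the Experiment Setup row
        return 0.1 * self.num_docs

    @property
    def gamma(self) -> float:
        # gamma = 0.1 * Z, as quoted in the Experiment Setup row
        return 0.1 * self.num_topics


# Illustrative corpus size only; not a figure from the paper.
config = SemaforeConfig(num_docs=1000)
```

Deriving `beta` and `gamma` as properties keeps them consistent if `num_docs` or `num_topics` is varied, which matches the row's note that the defaults apply "when unvaried".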