Robust Bayesian Max-Margin Clustering

Authors: Changyou Chen, Jun Zhu, Xinhua Zhang

NeurIPS 2014

Reproducibility assessment (variable, result, and LLM response):
Research Type: Experimental. "Extensive experiments are performed on a number of real datasets, and the results indicate superior clustering performance of our methods compared to related baselines."
Researcher Affiliation: Collaboration. Dept. of Electrical and Computer Engineering, Duke University, Durham, NC, USA; State Key Lab of Intelligent Technology & Systems, Tsinghua National TNList Lab, and Dept. of Computer Science & Tech., Tsinghua University, Beijing 100084, China; Australian National University (ANU) and National ICT Australia (NICTA), Canberra, Australia.
Pseudocode: No. No explicit pseudocode or algorithm blocks are provided in the main text.
Open Source Code: No. The paper does not provide any link to, or explicit statement about, the availability of open-source code for the methodology.
Open Datasets: Yes. "We test the MMCTM model on two document datasets: 20NEWS and Reuters-R8. For the 20NEWS dataset, we combine the training and test datasets used in [16], which ends up with 20 categories/clusters with roughly balanced cluster sizes. It contains 18,772 documents in total with a vocabulary size of 61,188. The Reuters-R8 dataset is a subset of the Reuters-21578 dataset, with 8 categories and 7,674 documents in total. The size of different categories is biased, with the lowest number of documents in a category being 51 while the highest being 2,292."
Dataset Splits: Yes. "We choose L ∈ {5, 10, 15, 20, 25} documents randomly from each category as the landmarks, use 80% documents for training and the rest for testing."
Hardware Specification: No. The paper does not specify any hardware used for running the experiments.
Software Dependencies: No. The paper mentions methods such as KMeans, NCut, DPGMM, SVM, and S3VM, but does not provide specific version numbers for any software dependencies.
Experiment Setup: Yes. "We set v = 0.01, c = 0.1, ℓ = 5 in this experiment. Note that the clustering structure is sensitive to the values of c and ℓ, which will be studied below. ... We set the number of topics (i.e., T) to 50, and set the Dirichlet prior in Section 5 to ω = 0.1, β = 0.01, α = α0 = α1 = 10, as clustering quality is not sensitive to them. For the other hyperparameters related to the max-margin constraints, e.g., v in the Gaussian prior for η, the balance parameter c, and the cost parameter ℓ, instead of doing cross validation, which is computationally expensive and not helpful for our scenario with few labeled data, we simply set v = 0.1, c = 9, ℓ = 0.1."
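The split protocol quoted under "Dataset Splits" (L random landmark documents per category, then an 80/20 train/test split) could be reproduced along the following lines. This is a minimal sketch under assumptions: the function name, the per-category document-ID layout, and the choice to draw landmarks before splitting are illustrative, not taken from the paper.

```python
import random

def split_dataset(doc_ids_by_category, num_landmarks, train_frac=0.8, seed=0):
    """Pick `num_landmarks` random landmark documents per category,
    then split the remaining documents into train/test by `train_frac`.
    (Hypothetical helper; the paper does not publish its split code.)"""
    rng = random.Random(seed)
    landmarks, train, test = [], [], []
    for cat, doc_ids in doc_ids_by_category.items():
        ids = list(doc_ids)
        rng.shuffle(ids)
        landmarks.extend(ids[:num_landmarks])     # L landmarks per category
        rest = ids[num_landmarks:]
        cut = int(train_frac * len(rest))
        train.extend(rest[:cut])                  # 80% of the remainder
        test.extend(rest[cut:])                   # the rest for testing
    return landmarks, train, test
```

For example, with two categories of 100 documents each and L = 5, this yields 10 landmarks and a 152/38 train/test split of the remaining 190 documents. Whether landmarks are drawn before or within the training portion is not specified in the quoted text; the sketch draws them first.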