Bayesian Nonparametric Multilevel Clustering with Group-Level Contexts

Authors: Vu Nguyen, Dinh Phung, XuanLong Nguyen, Svetha Venkatesh, Hung Hai Bui

ICML 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on real-world datasets demonstrate the advantage of utilizing context information via our model in both text and image domains." (Abstract)
Researcher Affiliation | Collaboration | Vu Nguyen (TVNGUYE@DEAKIN.EDU.AU), Center for Pattern Recognition and Data Analytics (PRaDA), Deakin University, Australia; Dinh Phung (DINH.PHUNG@DEAKIN.EDU.AU), PRaDA, Deakin University, Australia; XuanLong Nguyen (XUANLONG@UMICH.EDU), Department of Statistics, University of Michigan, Ann Arbor, USA; Svetha Venkatesh (SVETHA.VENKATESH@DEAKIN.EDU.AU), PRaDA, Deakin University, Australia; Hung Hai Bui (BUI.H.HUNG@GMAIL.COM), Laboratory for Natural Language Understanding, Nuance Communications, Sunnyvale, USA
Pseudocode | No | The paper describes the generative process and inference steps (collapsed Gibbs sampling) in paragraph form, but it does not include a distinct "Pseudocode" or "Algorithm" block or figure.
Open Source Code | No | The paper does not provide a direct link to its source code or explicitly state that the code for the described methodology is publicly released.
Open Datasets | Yes | "For NUS-WIDE we use a subset of the 13-class animals comprising of 3,411 images (2,054 images for training and 1357 images for testing) with off-the-shelf features including 500-dim bag-of-word SIFT vector and 1000-dim bag-of-tag annotation vector," downloaded from http://www.ml-thu.net/jun/data/
Dataset Splits | Yes | "For NUS-WIDE... (2,054 images for training and 1357 images for testing)... We use NIPS and PNAS datasets with 90% for training and 10% for held-out perplexity evaluation."
Hardware Specification | No | The paper does not mention any specific hardware (e.g., CPU or GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions statistical models ('Multinomial with Dirichlet prior', 'Gaussian') and the collapsed Gibbs sampling procedure, but it does not specify any software packages or libraries with version numbers.
Experiment Setup | Yes | "We ran collapsed Gibbs for 500 iterations after 100 burn-in samples." (Section 4.2)
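The 90/10 split and held-out perplexity evaluation noted in the Dataset Splits row are standard practice. As a minimal, hypothetical sketch (not the paper's evaluation code), the split and the perplexity computation might look like:

```python
import math
import random

def train_test_split(items, train_frac=0.9, seed=0):
    """Shuffle and split a list into training and held-out parts
    (hypothetical helper, mirroring the 90%/10% split in the paper)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_frac * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def perplexity(token_log_probs):
    """Held-out perplexity from per-token predictive log-probabilities
    (natural log): exp of the negative mean log-likelihood. Lower is better."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))
```

For intuition, a model that assigns probability 1/2 to every held-out token has perplexity exactly 2.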
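The reported schedule (100 burn-in sweeps, then 500 kept iterations) follows the usual collapsed-Gibbs pattern: discard early samples while the chain mixes, then average over the rest. As an illustration only, here is a self-contained collapsed Gibbs sampler for a toy Dirichlet-multinomial mixture; it is a generic stand-in for the technique, not the paper's multilevel clustering model:

```python
import math
import random

def collapsed_gibbs_mixture(docs, n_clusters, vocab_size,
                            alpha=1.0, beta=0.5,
                            burn_in=100, n_iter=500, seed=0):
    """Collapsed Gibbs for a toy Dirichlet-multinomial mixture: each
    document d gets one cluster label z[d]; mixture weights and per-cluster
    word distributions are integrated out.  Generic sketch only."""
    rng = random.Random(seed)
    D, V = len(docs), vocab_size
    z = [rng.randrange(n_clusters) for _ in range(D)]
    n_k = [0] * n_clusters                       # documents per cluster
    n_kw = [[0] * V for _ in range(n_clusters)]  # word counts per cluster
    n_kt = [0] * n_clusters                      # total tokens per cluster
    for d, doc in enumerate(docs):
        n_k[z[d]] += 1
        for w in doc:
            n_kw[z[d]][w] += 1
            n_kt[z[d]] += 1

    def log_pred(doc, k):
        # Sequential Dirichlet-multinomial predictive log-probability
        # of the whole document under cluster k.
        lp, seen, t = 0.0, {}, 0
        for w in doc:
            lp += math.log((n_kw[k][w] + seen.get(w, 0) + beta)
                           / (n_kt[k] + t + V * beta))
            seen[w] = seen.get(w, 0) + 1
            t += 1
        return lp

    samples = []
    for it in range(burn_in + n_iter):
        for d, doc in enumerate(docs):
            # Remove document d's counts from its current cluster.
            k_old = z[d]
            n_k[k_old] -= 1
            for w in doc:
                n_kw[k_old][w] -= 1
                n_kt[k_old] -= 1
            # Resample z[d] from the collapsed conditional.
            logp = [math.log(n_k[k] + alpha) + log_pred(doc, k)
                    for k in range(n_clusters)]
            m = max(logp)
            probs = [math.exp(l - m) for l in logp]
            r = rng.random() * sum(probs)
            k_new = n_clusters - 1
            for k, p in enumerate(probs):
                r -= p
                if r <= 0:
                    k_new = k
                    break
            z[d] = k_new
            n_k[k_new] += 1
            for w in doc:
                n_kw[k_new][w] += 1
                n_kt[k_new] += 1
        if it >= burn_in:            # keep only post-burn-in samples
            samples.append(list(z))
    return samples
```

Quantities such as held-out likelihood are then typically estimated by averaging over the kept samples, which is presumably how the paper's 500 post-burn-in iterations are used.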