Integrating Image Clustering and Codebook Learning
Authors: Pengtao Xie, Eric P. Xing
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on two datasets demonstrate the effectiveness of the two models. The paper evaluates DLGMM and SC-DLGMM by comparing them with four baseline methods on the image clustering task; the experiments are conducted on the 15-Scenes (Lazebnik, Schmid, and Ponce 2007) and Caltech-101 (Fei-Fei, Fergus, and Perona 2004) datasets. |
| Researcher Affiliation | Academia | Pengtao Xie and Eric Xing ({pengtaox,epxing}@cs.cmu.edu), School of Computer Science, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213 |
| Pseudocode | No | The paper describes processes using textual steps and mathematical equations, but does not provide a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide any explicit statement about open-source code availability or links to a code repository. |
| Open Datasets | Yes | The experiments are conducted on the 15-Scenes (Lazebnik, Schmid, and Ponce 2007) and Caltech-101 (Fei-Fei, Fergus, and Perona 2004) datasets, both publicly available benchmarks. |
| Dataset Splits | No | The paper mentions using subsets of the data (e.g., randomly choosing half of the images in Caltech-101) and varying the codebook size, but does not explicitly specify train/validation/test splits with percentages or counts, nor does it reference predefined splits with citations (see the split sketch after the table). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions methods such as SIFT, K-means, Normalized Cut, JSOM, and LDA, but does not name any software package with a version number or a programming-language version needed to replicate the experiments. |
| Experiment Setup | Yes | Our models are initialized with the clustering results obtained from LDA. The methods are compared under codebook sizes ranging from 100 to 1000 in increments of 100. The required input cluster number in KM, NC, and our models is set to the ground-truth number of categories in each dataset. In NC, a Gaussian kernel is used as the similarity measure between images, with the bandwidth parameter set to 1. In JSOM, the topic number is set to 100. In LDA, symmetric Dirichlet priors are used and set to 0.05. In SC-DLGMM, the MRF parameter γ is tuned to produce the best possible clustering performance (see the configuration sketch below). |
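
The Experiment Setup row reads as a compact grid of hyperparameters. As a rough illustration, here is a minimal Python sketch of that grid; the variable names, dictionary layout, and the per-dataset cluster counts (15 categories for 15-Scenes, 101 for Caltech-101) are our assumptions, not code from the paper.

```python
# Hypothetical reconstruction of the experimental grid described in the
# paper's text. All names and the dictionary layout are our own; the
# per-dataset cluster counts assume the standard category counts.
experiment_config = {
    "codebook_sizes": list(range(100, 1001, 100)),   # 100, 200, ..., 1000
    "datasets": {
        "15-Scenes": {"num_clusters": 15},     # ground-truth category count
        "Caltech-101": {"num_clusters": 101},  # assumed ground-truth count
    },
    "normalized_cut": {
        "kernel": "gaussian",   # Gaussian similarity between images
        "bandwidth": 1.0,       # bandwidth parameter set to 1
    },
    "jsom": {"num_topics": 100},
    "lda": {"dirichlet_prior": 0.05},  # symmetric Dirichlet prior
    # gamma on the MRF is tuned for the best clustering performance;
    # the paper gives no search range, so none is encoded here.
    "sc_dlgmm": {"gamma": None},
}
```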
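
For the Caltech-101 subsampling noted under Dataset Splits, the paper only says that half of the images are chosen at random. A minimal sketch of such a split, assuming a list of image paths and a fixed seed (both our own choices; the paper reports neither):

```python
import random

def random_half_split(image_paths, seed=0):
    """Return a random half of image_paths, mirroring the paper's
    'randomly choose half images' subsampling of Caltech-101.
    The function name, seed, and interface are illustrative only."""
    rng = random.Random(seed)      # fixed seed for repeatability
    shuffled = list(image_paths)   # copy so the input is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half]
```

Because no seed or split procedure is reported, exact replication of the sampled subset is not possible; a fixed seed at least makes one's own reruns repeatable.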