Generalized Information-theoretic Multi-view Clustering

Authors: Weitian Huang, Sirui Yang, Hongmin Cai

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on both synthetic and real datasets of diverse types demonstrate that the proposed method exhibits more stable and superior clustering performance than state-of-the-art algorithms.
Researcher Affiliation | Academia | Weitian Huang, School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510006, China
Pseudocode | Yes | Algorithm 1: Optimization Procedure of IMC
Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for its methodology is publicly available.
Open Datasets | Yes | We adopt four real-world datasets, listed as follows. (1) UCI-digits [25] contains 2000 examples of ten numerals from 0 to 9 with five views, extracted respectively by Fourier coefficients, profile correlations, Karhunen-Loève coefficients, Zernike moments, and pixel averages. (2) Notting-Hill [26] is a widely used video face dataset for clustering, which collects 4660 faces across 76 tracks of the 5 main actors from the movie Notting Hill. We use the multi-view version provided in [27], consisting of 550 images with three kinds of features, i.e., LBP, gray pixels, and Gabor features. (3) BDGP [28] contains 2500 images in 5 categories; each sample is described by a 1750-D image vector and a 79-D textual feature vector. (4) Caltech20 is a subset of the object recognition dataset [35], containing 20 classes with six views: Gabor features, wavelet moments, CENTRIST features, histogram of oriented gradients, GIST features, and local binary patterns. (A data-layout sketch follows the table.)
Dataset Splits | No | The paper describes the datasets used but does not specify explicit training, validation, and test splits with percentages, sample counts, or references to predefined splits for reproduction.
Hardware Specification | No | The paper describes the neural network architectures and training parameters, but it does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | The paper mentions the use of the "Adam optimizer [24]" but does not provide version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used in the implementation.
Experiment Setup | Yes | The architectures of p_{φ^(v)}(z^(v) | x^(v)) and q_{θ^(v)}(x^(v) | z) are fully connected networks with d_v-500-500-1024-256 and 256-1024-500-500-d_v neurons, respectively. The Adam optimizer [24] is used to maximize the objective function, with a learning rate of 0.001 decayed by a factor of 0.9 every 10 epochs. The values of β and γ are chosen from {0.01, 0.1, 1, 10, 100}. (A setup sketch follows the table.)
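
As a point of reference for the "Open Datasets" row, here is a minimal sketch of how sample-aligned multi-view data is commonly organized for clustering pipelines. The shapes follow the BDGP description above; the container itself and the random stand-in features are illustrative assumptions, not the authors' loading code:

```python
import numpy as np

# Illustrative multi-view container: one feature matrix per view, with rows
# aligned so that row i describes the same sample in every view. Shapes follow
# the BDGP description above (2500 samples; 1750-D image view, 79-D text view,
# 5 categories); the random data is a stand-in for real features.
rng = np.random.default_rng(0)
views = [
    rng.standard_normal((2500, 1750)),  # view 1: image features
    rng.standard_normal((2500, 79)),    # view 2: textual features
]
labels = rng.integers(0, 5, size=2500)  # ground-truth labels, used only for evaluation

assert all(v.shape[0] == views[0].shape[0] for v in views), "views must be sample-aligned"
```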
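And a minimal PyTorch sketch consistent with the "Experiment Setup" row: the layer widths, optimizer choice, learning rate, decay schedule, and hyperparameter grid are taken from the quoted text, while the activation function and all module and variable names are assumptions:

```python
import itertools

import torch
import torch.nn as nn


def mlp(dims):
    """Fully connected stack; ReLU between hidden layers is an assumption."""
    layers = []
    for i in range(len(dims) - 1):
        layers.append(nn.Linear(dims[i], dims[i + 1]))
        if i < len(dims) - 2:
            layers.append(nn.ReLU())
    return nn.Sequential(*layers)


d_v = 1750  # dimensionality of one view (the BDGP image view, for example)
encoder = mlp([d_v, 500, 500, 1024, 256])  # p_{φ^(v)}(z^(v) | x^(v))
decoder = mlp([256, 1024, 500, 500, d_v])  # q_{θ^(v)}(x^(v) | z)

optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3
)
# Learning rate decays by a factor of 0.9 every 10 epochs, as reported.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.9)

# Candidate (β, γ) pairs, matching the reported search range.
grid = list(itertools.product([0.01, 0.1, 1, 10, 100], repeat=2))
```

One encoder/decoder pair of this shape would be instantiated per view, with d_v set to that view's feature dimensionality.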