Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering

Authors: Bo Yang, Xiao Fu, Nicholas D. Sidiropoulos, Mingyi Hong

ICML 2017

Reproducibility

| Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments using different real datasets are employed to showcase the effectiveness of the proposed approach. Comprehensive Experiments and Validation: We provide a set of synthetic-data experiments and validate the method on different real datasets including various document and image corpora. |
| Researcher Affiliation | Academia | (1) Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455, USA. (2) Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA 50011, USA. |
| Pseudocode | Yes | Algorithm 1: Alternating SGD |
| Open Source Code | Yes | Reproducibility: The code for the experiments is available at https://github.com/boyangumn/DCN. |
| Open Datasets | Yes | RCV1-v2 corpus (Lewis et al., 2004); raw MNIST dataset |
| Dataset Splits | No | The paper mentions using well-known datasets such as RCV1 and MNIST, but does not explicitly provide the training/validation/test splits (e.g., percentages or sample counts) used in its experiments. |
| Hardware Specification | No | The paper states "The GPU used in this work was kindly donated by NVIDIA," but does not provide specific hardware details such as the exact GPU model, CPU, or memory used for the experiments. |
| Software Dependencies | No | The paper states "We implement DCN using the deep learning toolbox Theano (Theano Development Team, 2016)," but gives no version numbers for Theano, Python, or other libraries. |
| Experiment Setup | Yes | To avoid unrealistic tuning, for all the experiments, we use a DCN whose forward network has five hidden layers with 2000, 1000, 1000, 1000, and 50 neurons, respectively. The reconstruction network has a mirrored structure. We set λ = 0.1 to balance the reconstruction error and the clustering regularization. |
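The entries above quote Algorithm 1 (alternating SGD) and the DCN objective: a reconstruction loss plus λ times a k-means penalty on the latent codes. The sketch below is a hedged illustration of that alternating scheme, not the authors' Theano implementation: a toy tied-weight linear encoder stands in for the five-layer network, and all dimensions, data, and the learning rate are made-up stand-ins; only λ = 0.1 comes from the paper.

```python
import numpy as np

# Hedged sketch of DCN-style alternating SGD (Algorithm 1), NOT the
# authors' Theano code: a tied-weight *linear* encoder replaces the
# five-layer network so the update structure stays visible.
# Only lam = 0.1 is taken from the paper; other numbers are toy choices.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))            # toy data: 200 samples, 10 features
d, k, lam, lr = 3, 4, 0.1, 0.01           # latent dim, #clusters, lambda, step size

W = rng.normal(scale=0.1, size=(10, d))   # encoder; decoder is W.T (tied weights)
M = rng.normal(size=(k, d))               # cluster centroids in latent space
counts = np.ones(k)                       # per-cluster counts for the centroid step

for epoch in range(20):
    for i in rng.permutation(len(X)):
        x = X[i]
        z = x @ W                                     # latent code f(x)
        # Step 1: fix the network, assign the sample to its nearest centroid,
        # then move that centroid with a 1/count step size.
        s = int(np.argmin(((z - M) ** 2).sum(axis=1)))
        counts[s] += 1
        M[s] += (z - M[s]) / counts[s]
        # Step 2: fix the assignment, take one SGD step on
        #   ||x - g(f(x))||^2 / 2  +  lam * ||f(x) - m_s||^2 / 2
        recon = z @ W.T                               # tied-weight decoder g
        grad_z = (recon - x) @ W + lam * (z - M[s])   # dL/dz
        grad_W = np.outer(x, grad_z) + np.outer(recon - x, z)
        W -= lr * grad_W

# Final hard assignments from the learned latent space
Z = X @ W
labels = ((Z[:, None, :] - M[None, :, :]) ** 2).sum(axis=-1).argmin(axis=1)
```

The alternation matters because the hard cluster assignment is not differentiable: fixing assignments and centroids turns the clustering penalty into a smooth quadratic that ordinary SGD on the network can handle.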