Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering

Authors: Bo Yang, Xiao Fu, Nicholas D. Sidiropoulos, Mingyi Hong

ICML 2017

Reproducibility

| Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments using different real datasets are employed to showcase the effectiveness of the proposed approach. Comprehensive Experiments and Validation: We provide a set of synthetic-data experiments and validate the method on different real datasets including various document and image corpora. |
| Researcher Affiliation | Academia | (1) Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455, USA. (2) Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA 50011, USA. |
| Pseudocode | Yes | Algorithm 1: Alternating SGD |
| Open Source Code | Yes | Reproducibility: The code for the experiments is available at https://github.com/boyangumn/DCN. |
| Open Datasets | Yes | RCV1-v2 corpus (Lewis et al., 2004); raw MNIST dataset |
| Dataset Splits | No | The paper mentions using well-known datasets such as RCV1 and MNIST, but does not explicitly provide the training/validation/test splits (e.g., percentages or sample counts) used in its experiments. |
| Hardware Specification | No | The paper states "The GPU used in this work was kindly donated by NVIDIA," but does not provide specific hardware details such as the exact GPU model, CPU, or memory used for the experiments. |
| Software Dependencies | No | The paper states "We implement DCN using the deep learning toolbox Theano (Theano Development Team, 2016)," but gives no version numbers for Theano, Python, or other libraries. |
| Experiment Setup | Yes | To avoid unrealistic tuning, for all the experiments, we use a DCN whose forward network has five hidden layers with 2000, 1000, 1000, 1000, and 50 neurons, respectively. The reconstruction network has a mirrored structure. We set λ = 0.1 to balance the reconstruction error and the clustering regularization. |
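The entries above quote Algorithm 1 (alternating SGD) and the DCN objective: a reconstruction loss plus λ times a k-means penalty on the latent codes. The sketch below is a hedged illustration of that alternating scheme, not the authors' Theano implementation: a toy tied-weight linear encoder stands in for the five-layer network, and all dimensions, data, and the learning rate are made-up stand-ins; only λ = 0.1 comes from the paper.

```python
import numpy as np

# Hedged sketch of DCN-style alternating SGD (Algorithm 1), NOT the
# authors' Theano code: a tied-weight *linear* encoder replaces the
# five-layer network so the update structure stays visible.
# Only lam = 0.1 is taken from the paper; other numbers are toy choices.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))            # toy data: 200 samples, 10 features
d, k, lam, lr = 3, 4, 0.1, 0.01           # latent dim, #clusters, lambda, step size

W = rng.normal(scale=0.1, size=(10, d))   # encoder; decoder is W.T (tied weights)
M = rng.normal(size=(k, d))               # cluster centroids in latent space
counts = np.ones(k)                       # per-cluster counts for the centroid step

for epoch in range(20):
    for i in rng.permutation(len(X)):
        x = X[i]
        z = x @ W                                     # latent code f(x)
        # Step 1: fix the network, assign the sample to its nearest centroid,
        # then move that centroid with a 1/count step size.
        s = int(np.argmin(((z - M) ** 2).sum(axis=1)))
        counts[s] += 1
        M[s] += (z - M[s]) / counts[s]
        # Step 2: fix the assignment, take one SGD step on
        #   ||x - g(f(x))||^2 / 2  +  lam * ||f(x) - m_s||^2 / 2
        recon = z @ W.T                               # tied-weight decoder g
        grad_z = (recon - x) @ W + lam * (z - M[s])   # dL/dz
        grad_W = np.outer(x, grad_z) + np.outer(recon - x, z)
        W -= lr * grad_W

# Final hard assignments from the learned latent space
Z = X @ W
labels = ((Z[:, None, :] - M[None, :, :]) ** 2).sum(axis=-1).argmin(axis=1)
```

The alternation matters because the hard cluster assignment is not differentiable: fixing assignments and centroids turns the clustering penalty into a smooth quadratic that ordinary SGD on the network can handle.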