Unsupervised Deep Embedding for Clustering Analysis

Authors: Junyuan Xie, Ross Girshick, Ali Farhadi

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental evaluations on image and text corpora show significant improvement over state-of-the-art methods.
Researcher Affiliation | Collaboration | Junyuan Xie (JXIE@CS.WASHINGTON.EDU), University of Washington; Ross Girshick (RBG@FB.COM), Facebook AI Research (FAIR); Ali Farhadi (ALI@CS.WASHINGTON.EDU), University of Washington
Pseudocode | No | The paper describes the optimization process and its individual steps mathematically and in prose, but does not contain a structured pseudocode or algorithm block.
Open Source Code | Yes | Our Caffe-based (Jia et al., 2014) implementation of DEC is available at https://github.com/piiswrong/dec.
Open Datasets | Yes | Dataset statistics: MNIST (LeCun et al., 1998), 70000 points; STL-10 (Coates et al., 2011), 13000 points; REUTERS-10K, 10000 points; REUTERS (Lewis et al., 2004), 685071 points.
Dataset Splits | No | Determining hyperparameters by cross-validation on a validation set is not an option in unsupervised clustering.
Hardware Specification | No | The paper mentions 'half an hour with GPU acceleration' but does not specify any particular GPU model, CPU, memory, or other hardware details used for the experiments.
Software Dependencies | No | The paper mentions 'Our Caffe-based (Jia et al., 2014) implementation' but does not provide specific version numbers for Caffe or any other software dependencies.
Experiment Setup | Yes | We set network dimensions to d-500-500-2000-10 for all datasets... Each layer is pretrained for 50000 iterations with a dropout rate of 20%. The entire deep autoencoder is further finetuned for 100000 iterations without dropout. For both layer-wise pretraining and end-to-end finetuning of the autoencoder, the minibatch size is set to 256, the starting learning rate is set to 0.1, which is divided by 10 every 20000 iterations, and weight decay is set to 0. In the KL divergence minimization phase, we train with a constant learning rate of 0.01. The convergence threshold is set to tol = 0.1%.
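
The experiment setup row above fully determines the reported training schedule. The following is a minimal Python sketch of those hyperparameters, assuming an input dimension such as d = 784 for MNIST; the function and constant names are illustrative and do not come from the authors' Caffe-based code.

```python
# Minimal sketch of the training schedule quoted in the experiment setup above.
# All numeric values come from that quote; names (encoder_dims, autoencoder_lr,
# etc.) are illustrative, not taken from the authors' implementation.

def encoder_dims(d):
    """Encoder layer sizes d-500-500-2000-10, where d is the input dimension
    (for example, d = 784 for MNIST)."""
    return [d, 500, 500, 2000, 10]

def autoencoder_lr(iteration, base_lr=0.1, step=20000):
    """Learning rate for layer-wise pretraining and end-to-end finetuning:
    start at 0.1 and divide by 10 every 20000 iterations."""
    return base_lr / (10 ** (iteration // step))

PRETRAIN_ITERS = 50000   # per layer, with 20% dropout
FINETUNE_ITERS = 100000  # whole autoencoder, no dropout
BATCH_SIZE = 256
WEIGHT_DECAY = 0.0

# KL-divergence minimization phase: constant learning rate; training stops
# when fewer than tol = 0.1% of points change cluster assignment between
# consecutive iterations.
KL_PHASE_LR = 0.01
TOL = 0.001

if __name__ == "__main__":
    print(encoder_dims(784))      # [784, 500, 500, 2000, 10]
    print(autoencoder_lr(45000))  # 0.001 (after two decays)
```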