Unsupervised Deep Embedding for Clustering Analysis

Authors: Junyuan Xie, Ross Girshick, Ali Farhadi

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental evaluations on image and text corpora show significant improvement over state-of-the-art methods.
Researcher Affiliation | Collaboration | Junyuan Xie (JXIE@CS.WASHINGTON.EDU), University of Washington; Ross Girshick (RBG@FB.COM), Facebook AI Research (FAIR); Ali Farhadi (ALI@CS.WASHINGTON.EDU), University of Washington
Pseudocode | No | The paper describes the optimization process and its individual steps mathematically and in prose, but does not contain a structured pseudocode or algorithm block.
Open Source Code | Yes | Our Caffe-based (Jia et al., 2014) implementation of DEC is available at https://github.com/piiswrong/dec.
Open Datasets | Yes | Dataset statistics: MNIST (LeCun et al., 1998), 70000 points; STL-10 (Coates et al., 2011), 13000 points; REUTERS-10K, 10000 points; REUTERS (Lewis et al., 2004), 685071 points.
Dataset Splits | No | Determining hyperparameters by cross-validation on a validation set is not an option in unsupervised clustering.
Hardware Specification | No | The paper mentions 'half an hour with GPU acceleration' but does not specify any particular GPU model, CPU, memory, or other hardware details used for the experiments.
Software Dependencies | No | The paper mentions 'Our Caffe-based (Jia et al., 2014) implementation' but does not provide specific version numbers for Caffe or any other software dependencies.
Experiment Setup | Yes | We set network dimensions to d-500-500-2000-10 for all datasets... Each layer is pretrained for 50000 iterations with a dropout rate of 20%. The entire deep autoencoder is further finetuned for 100000 iterations without dropout. For both layer-wise pretraining and end-to-end finetuning of the autoencoder, the minibatch size is set to 256, the starting learning rate is set to 0.1, which is divided by 10 every 20000 iterations, and weight decay is set to 0. In the KL divergence minimization phase, we train with a constant learning rate of 0.01. The convergence threshold is set to tol = 0.1%.
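
The experiment setup row above fully determines the reported training schedule. The following is a minimal Python sketch of those hyperparameters, assuming an input dimension such as d = 784 for MNIST; the function and constant names are illustrative and do not come from the authors' Caffe-based code.

```python
# Minimal sketch of the training schedule quoted in the experiment setup above.
# All numeric values come from that quote; names (encoder_dims, autoencoder_lr,
# etc.) are illustrative, not taken from the authors' implementation.

def encoder_dims(d):
    """Encoder layer sizes d-500-500-2000-10, where d is the input dimension
    (for example, d = 784 for MNIST)."""
    return [d, 500, 500, 2000, 10]

def autoencoder_lr(iteration, base_lr=0.1, step=20000):
    """Learning rate for layer-wise pretraining and end-to-end finetuning:
    start at 0.1 and divide by 10 every 20000 iterations."""
    return base_lr / (10 ** (iteration // step))

PRETRAIN_ITERS = 50000   # per layer, with 20% dropout
FINETUNE_ITERS = 100000  # whole autoencoder, no dropout
BATCH_SIZE = 256
WEIGHT_DECAY = 0.0

# KL-divergence minimization phase: constant learning rate; training stops
# when fewer than tol = 0.1% of points change cluster assignment between
# consecutive iterations.
KL_PHASE_LR = 0.01
TOL = 0.001

if __name__ == "__main__":
    print(encoder_dims(784))      # [784, 500, 500, 2000, 10]
    print(autoencoder_lr(45000))  # 0.001 (after two decays)
```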