Stacked Similarity-Aware Autoencoders

Authors: Wenqing Chu, Deng Cai

IJCAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results on several benchmark datasets show the remarkable performance improvement of the proposed algorithm compared with other autoencoder based approaches.
Researcher Affiliation | Academia | State Key Lab of CAD&CG, College of Computer Science, Zhejiang University, China
Pseudocode | Yes | Algorithm 1: "Stacked similarity-aware autoencoders"
Open Source Code | No | The paper does not provide any concrete access information (e.g., a specific repository link, an explicit code release statement, or code in supplementary materials) for the described methodology.
Open Datasets | Yes | We conduct experiments on two widely used datasets to evaluate our proposed SSA-AE for unsupervised feature learning and compare it with several state-of-the-art methods: COIL100, which contains 7,200 color images of 100 objects, and MNIST, a standard digit classification benchmark with a training set of 60,000 labeled images and a test set of 10,000 labeled images.
Dataset Splits | Yes | COIL100 contains 7,200 color images of 100 objects. The images of each object were taken 5 degrees apart as the object rotates on a turntable, so each object has 72 images (Figure 3). The converted grayscale images of size 32×32 are used. We randomly selected 10 images per object to form the training set; the remaining images form the testing set. The nearest neighbor classifier is applied to the learned feature representations. The random split is repeated 10 times, and the average results are reported with standard deviations. For MNIST (60,000 training and 10,000 test images), a randomly chosen subset of N_L = 100, 600, 1000, and 3000 labeled images and N_U = 60K unlabeled images is drawn from the training and validation sets. (A minimal split sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor speeds, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper states, "We use Torch to implement our approach," but it does not specify version numbers for Torch or any other software components.
Experiment Setup | Yes | For each single autoencoder, the encoder stacks a combination of convolutional, batch normalization, ReLU, and pooling layers. For all convolutional layers, the number of channels is 50 and the filter size is 5×5 with stride = 1 and padding = 0; except for the experiments evaluating the number of stacked autoencoders, the filter size is 3×3 with stride = 1 and padding = 1 to preserve the convolutional feature maps. The decoder uses a deconvolutional layer to obtain an upsampled feature map; the deconvolution operation is exactly the reverse of convolution. On top of the hidden codes, an embedding layer of dimension 160 is appended, and stochastic gradient descent (SGD) is adopted for optimization. (An architecture sketch follows the table.)
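
To make the COIL100 evaluation protocol in the Dataset Splits row concrete, here is a minimal split-and-evaluate sketch in Python/NumPy. It is an illustration under stated assumptions, not the authors' code: the function names, the seed handling, and the 1-nearest-neighbor implementation are hypothetical; only the 10-images-per-object split, the 10 repetitions, and the mean/std reporting come from the paper.

    import numpy as np

    def coil100_split(labels, n_train_per_class=10, rng=None):
        # Pick n_train_per_class images per object at random for training;
        # the remaining images go to the test set (COIL100: 100 objects x 72 views).
        if rng is None:
            rng = np.random.default_rng()
        train_idx, test_idx = [], []
        for c in np.unique(labels):
            idx = rng.permutation(np.flatnonzero(labels == c))
            train_idx.extend(idx[:n_train_per_class])
            test_idx.extend(idx[n_train_per_class:])
        return np.array(train_idx), np.array(test_idx)

    def nn_accuracy(train_feats, train_labels, test_feats, test_labels):
        # 1-nearest-neighbor classification on the learned feature representations.
        d = ((test_feats[:, None, :] - train_feats[None, :, :]) ** 2).sum(-1)
        pred = train_labels[d.argmin(axis=1)]
        return (pred == test_labels).mean()

    # The paper repeats the random split 10 times and reports mean and std;
    # a driver would call coil100_split and nn_accuracy once per repetition.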
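
The Experiment Setup row maps directly onto layer definitions. The sketch below shows one encoder/decoder stage with the quoted hyperparameters (50 channels, 5×5 filters, stride = 1, padding = 0, a 160-dimensional embedding, plain SGD). The paper's implementation uses (Lua) Torch; this PyTorch rendering, the max-pooling/upsampling choice, and the placement of the embedding layer are assumptions for illustration.

    import torch
    import torch.nn as nn

    class AEStage(nn.Module):
        # One autoencoder stage: conv -> BN -> ReLU -> pool encoder,
        # mirrored by an upsample + deconvolution (ConvTranspose2d) decoder.
        def __init__(self, in_ch, out_ch=50, k=5, stride=1, pad=0):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=k, stride=stride, padding=pad),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),                    # halves the spatial resolution
            )
            self.decoder = nn.Sequential(
                nn.Upsample(scale_factor=2),        # undo the pooling
                nn.ConvTranspose2d(out_ch, in_ch, kernel_size=k,
                                   stride=stride, padding=pad),
            )

        def forward(self, x):
            h = self.encoder(x)
            return h, self.decoder(h)               # hidden codes and reconstruction

    stage = AEStage(in_ch=1)                        # e.g. 32x32 grayscale COIL100 images
    h, recon = stage(torch.randn(8, 1, 32, 32))     # h: (8, 50, 14, 14); recon: (8, 1, 32, 32)
    embed = nn.Linear(h.flatten(1).shape[1], 160)   # 160-dimensional embedding on the codes
    opt = torch.optim.SGD(list(stage.parameters()) + list(embed.parameters()), lr=0.1)

A per-stage reconstruction loss (e.g., nn.MSELoss()(recon, x)) would then be minimized with the SGD optimizer above; the similarity-aware terms that define the actual SSA-AE objective are omitted from this sketch.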