Learning to Disentangle Factors of Variation with Manifold Interaction

Authors: Scott Reed, Kihyuk Sohn, Yuting Zhang, Honglak Lee

ICML 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluated the performance of our proposed model on several image databases: Flipped MNIST, Toronto Face Database (TFD), and CMU Multi-PIE. In Table 1, the disBM achieves significantly lower error rates than RBMs of each size. We provide a performance comparison to the baseline and other existing models. Table 4 shows a comparison to a standard (second-layer) RBM baseline using the same first-layer features as our disBM on Multi-PIE.
Researcher Affiliation | Academia | Scott Reed (REEDSCOT@UMICH.EDU), Kihyuk Sohn (KIHYUKS@UMICH.EDU), Yuting Zhang (YUTINGZH@UMICH.EDU), Honglak Lee (HONGLAK@UMICH.EDU), Dept. of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement about releasing open-source code or a link to a code repository.
Open Datasets | Yes | Flipped MNIST: for each digit of the MNIST dataset, we randomly flipped all pixels (0s to 1s and vice versa) with 50% probability (sketched in code after this table). Toronto Face Database (TFD) (Susskind et al., 2010). CMU Multi-PIE (Gross et al., 2010).
Dataset Splits | Yes | Flipped MNIST consists of 50,000 training images, 10,000 validation images, and 10,000 test images.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., GPU models, CPU types, memory); it only describes the setup at the model and dataset level.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., language or library versions).
Experiment Setup | Yes | We trained linear SVMs on RBM hidden-unit and disBM appearance-unit activations for classification. We built a 2-layer model whose first layer is a Gaussian RBM with tiled, overlapping receptive fields similar to those used by Ranzato et al. (2011), and whose second layer is our proposed disBM. We used the same model from Section 6.2 for the tasks on Multi-PIE. For pose estimation and emotion recognition, we trained a linear SVM and report percent accuracy. For face verification, we used the cosine similarity as the score for an image pair and report the AUROC. Both numbers are averaged over 5 folds. (A sketch of this evaluation protocol follows the table.)
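For concreteness, a minimal sketch of the Flipped MNIST construction quoted in the Open Datasets row, assuming the digits are loaded as an (N, 784) float array in [0, 1]; the function name and RNG seeding are illustrative, not from the paper.

import numpy as np

def make_flipped_mnist(images, rng=None):
    """Flipped MNIST: invert every pixel of a digit (0s to 1s and vice
    versa) with 50% probability per image. `images` is assumed to be an
    (N, 784) float array in [0, 1]; naming and seeding are hypothetical."""
    if rng is None:
        rng = np.random.default_rng(0)
    images = (images > 0.5).astype(np.float32)   # binarize the pixels
    flip = rng.random(len(images)) < 0.5         # one coin flip per digit
    images[flip] = 1.0 - images[flip]            # invert the whole image
    return images, flip.astype(np.int64)         # flip bit = a known factor of variation

# The splits quoted above: 50,000 train / 10,000 validation / 10,000 test,
# e.g., assuming `x` holds all 70,000 images in one array:
# x_train, x_val, x_test = x[:50_000], x[50_000:60_000], x[60_000:]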
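Likewise, a hedged sketch of the evaluation protocol quoted in the Experiment Setup row: a linear SVM on learned activations reported as percent accuracy, and cosine similarity with AUROC for face verification. scikit-learn is used for brevity; the paper does not state which SVM implementation was used, and all variable names are placeholders.

import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import roc_auc_score

# Pose/emotion recognition: linear SVM on unit activations
# (e.g., disBM appearance units). h_* are features, y_* are labels.
def percent_accuracy(h_train, y_train, h_test, y_test, C=1.0):
    svm = LinearSVC(C=C).fit(h_train, y_train)
    return 100.0 * svm.score(h_test, y_test)

# Face verification: cosine similarity between the two images' features
# is the pair score; AUROC is computed over all pairs.
def verification_auroc(feats_a, feats_b, same_identity):
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    return roc_auc_score(same_identity, np.sum(a * b, axis=1))

Per the quoted setup, both metrics would be averaged over 5 folds, e.g., by calling these functions once per fold and taking the mean.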