Learning Latent Space Models with Angular Constraints

Authors: Pengtao Xie, Yuntian Deng, Yi Zhou, Abhimanu Kumar, Yaoliang Yu, James Zou, Eric P. Xing

ICML 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 5 (Experiments); Table 1: Classification accuracy (%) on three datasets; Table 2: Phone error rate (%) on the TIMIT test set; Table 3: Classification error (%) on CIFAR-10 test set; Table 4: Accuracy (%) on the two QA datasets.
Researcher Affiliation | Collaboration | (1) Machine Learning Department, Carnegie Mellon University; (2) Petuum Inc.; (3) School of Engineering and Applied Sciences, Harvard University; (4) College of Engineering and Computer Science, Syracuse University; (5) Groupon Inc.; (6) School of Computer Science, University of Waterloo; (7) Department of Biomedical Data Science, Stanford University.
Pseudocode | No | The paper describes the algorithmic steps in text (e.g., alternating updates that solve for W and then for v_1^(r), v_2^(r)), but it does not contain a structured pseudocode or algorithm block labeled as such.
Open Source Code | No | The paper does not state that source code for the described methodology is available, nor does it link to a repository.
Open Datasets | Yes | Scenes-15 (Lazebnik et al., 2006), Caltech256 (Griffin et al., 2007), and UIUC-Sport (Li & Fei-Fei, 2007). The TIMIT dataset... (https://catalog.ldc.upenn.edu/LDC93S1). The CIFAR-10 dataset... (https://www.cs.toronto.edu/~kriz/cifar.html). CNN and Daily Mail (Hermann et al., 2015).
Dataset Splits | Yes | We use 5-fold cross validation to tune τ in {0.3, 0.4, ..., 1} and the number of basis vectors in {50, 100, 200, ..., 500}. We used 5000 training images as the validation set to tune hyperparameters. (A hedged sketch of this tuning grid appears after this table.)
Hardware Specification | Yes | Table 5 shows the total runtime of FNNs on TIMIT and CNNs on CIFAR-10 with a single GTX TITAN X GPU, and the runtime of LSTM networks on the CNN dataset with 2 TITAN X GPUs.
Software Dependencies | No | The paper mentions using the 'Kaldi (Povey et al., 2011) toolkit' and 'Ada Delta (Zeiler, 2012)' but does not provide specific version numbers for these or any other key software dependencies.
Experiment Setup | Yes | The number of hidden layers is 4. Each layer has 1024 hidden units. Stochastic gradient descent (SGD) is used to train the network. The learning rate is set to 0.008. ... depth is set to 28 and the width is set to 10. SGD is used for training, with epoch number 200, initial learning rate 0.1, minibatch size 128, Nesterov momentum 0.9, dropout probability 0.3 and weight decay 0.0005. The learning rate is dropped by 0.2 at 60, 120 and 160 epochs. ... the size of hidden state is set to 100. Optimization is based on Ada Delta (Zeiler, 2012), where the minibatch size and initial learning rate are set to 48 and 0.5. The model is trained for 8 epochs. Dropout (Srivastava et al., 2014) with probability 0.2 is applied. (These settings are collected into hedged configuration sketches below.)
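
The Dataset Splits row reports a 5-fold cross-validation search over τ and the number of basis vectors. Below is a minimal sketch of such a tuning loop; only the stated grid endpoints and the 5-fold protocol come from the paper, while the `train_and_score` helper and the spacing of the basis-vector grid beyond 200 are assumptions.

```python
# Hypothetical sketch of the 5-fold cross-validation grid described in the paper.
# Only the tau range and basis-vector endpoints come from the source; the step
# beyond 200 and the train_and_score() helper are placeholders, not released code.
import numpy as np
from itertools import product
from sklearn.model_selection import KFold

taus = np.arange(0.3, 1.0 + 1e-9, 0.1)            # {0.3, 0.4, ..., 1}
num_basis_vectors = [50, 100, 200, 300, 400, 500]  # {50, 100, 200, ..., 500} (assumed step of 100 after 200)

def train_and_score(train_idx, val_idx, tau, m):
    """Placeholder: train the angularly constrained latent space model with the
    given tau and number of basis vectors m, and return validation accuracy."""
    raise NotImplementedError

def tune(X, y, n_splits=5, seed=0):
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    best_params, best_score = None, -np.inf
    for tau, m in product(taus, num_basis_vectors):
        scores = [train_and_score(tr, va, tau, m) for tr, va in kf.split(X)]
        if np.mean(scores) > best_score:
            best_params, best_score = (tau, m), np.mean(scores)
    return best_params, best_score
```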
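
The Experiment Setup row quotes the optimizer and architecture hyperparameters; the sketch below merely collects them as plain configuration dictionaries. The grouping into TIMIT/FNN, CIFAR-10/wide-ResNet, and QA/LSTM configs is inferred from the Hardware Specification row, and the key names are illustrative rather than taken from any released code.

```python
# Hyperparameter values as reported in the Experiment Setup row; the grouping by
# experiment and the dictionary keys are assumptions, not from released code.
timit_fnn = {
    "num_hidden_layers": 4,
    "hidden_units_per_layer": 1024,
    "optimizer": "sgd",
    "learning_rate": 0.008,
}

cifar10_wide_resnet = {
    "depth": 28,
    "width": 10,
    "optimizer": "sgd",
    "epochs": 200,
    "initial_learning_rate": 0.1,
    "minibatch_size": 128,
    "nesterov_momentum": 0.9,
    "dropout_probability": 0.3,
    "weight_decay": 0.0005,
    "lr_drop_epochs": [60, 120, 160],  # learning rate dropped by 0.2 at these epochs
    "lr_drop_factor": 0.2,
}

qa_lstm = {
    "hidden_state_size": 100,
    "optimizer": "adadelta",
    "minibatch_size": 48,
    "initial_learning_rate": 0.5,
    "epochs": 8,
    "dropout_probability": 0.2,
}
```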