Landmarking Manifolds with Gaussian Processes

Authors: Dawen Liang, John Paisley

ICML 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our manifold landmarking algorithm on images, text and music data. We show the test accuracy as a function of the number of landmarks for our manifold landmarking algorithm (ML), random selection (Rand), and active learning (Act5K) in Figure 5.
Researcher Affiliation | Academia | Dawen Liang (DLIANG@EE.COLUMBIA.EDU) and John Paisley (JPAISLEY@COLUMBIA.EDU), Department of Electrical Engineering, Columbia University, New York, NY, USA
Pseudocode | Yes | Algorithm 1: Manifold landmarking with GPs
Open Source Code | No | The paper does not provide any explicit statement or link regarding the release of open-source code for the described methodology.
Open Datasets | Yes | We perform experiments on the Million Song Dataset (Bertin-Mahieux et al., 2011), which contains the audio features and metadata (user tagging information from Last.fm) for one million songs. We consider the handwritten digit classification problem on the MNIST dataset (LeCun et al., 1998), and use a low-dimensional representation from different landmark approaches to evaluate their performance. In Figure 2 we show the first eight landmarks from the Yale faces database (footnote: http://www.cad.zju.edu.cn/home/dengcai/Data/FaceData.html).
Dataset Splits | Yes | We use 50,000 images for training to learn both the landmarks and to train the classifier. We use 10,000 images as a validation set to select the regularization parameter among λ = {0.001, 0.01, ..., 1000}, and another 10,000 images for classification testing. For each logistic regression model, we use 5-fold cross-validation to search for the best regularization parameter among λ = {0.001, 0.01, ..., 1000}.
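To make the split-and-search protocol concrete, here is a minimal sketch of the regularization search, assuming a scikit-learn-style logistic regression (the paper does not name its implementation); the data, sizes, and variable names below are illustrative placeholders, not the authors' code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
# Synthetic stand-ins for the landmark-based features and digit labels;
# sizes are shrunk from the paper's 50k/10k/10k MNIST protocol for a quick demo.
X = rng.normal(size=(700, 20))
y = rng.integers(0, 10, size=700)
X_train, y_train = X[:500], y[:500]
X_test, y_test = X[500:], y[500:]

# Regularization grid lambda in {0.001, 0.01, ..., 1000};
# scikit-learn parameterizes logistic regression by C = 1/lambda.
lambdas = 10.0 ** np.arange(-3, 4)
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": (1.0 / lambdas).tolist()},
    cv=5,  # the 5-fold cross-validation described above
)
search.fit(X_train, y_train)
print("best lambda:", 1.0 / search.best_params_["C"])
print("test accuracy:", search.score(X_test, y_test))
```

In the paper's MNIST protocol the held-out 10,000-image validation set plays the role of this grid search; the 5-fold variant is the one used for the other logistic regression models.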
Hardware Specification | No | The paper mentions running experiments 'on a laptop computer' in Figure 4(b), but it does not provide specific hardware details like CPU/GPU models, memory, or other specifications.
Software Dependencies | No | The paper mentions using algorithms and models like t-SNE, k-means++, and logistic regression but does not provide specific version numbers for any software dependencies or libraries used for implementation.
Experiment Setup | Yes | For all problems we use a step size of ρ_s = (s0 + s)^(−τ) with s0 = 10 and τ = 0.51. The algorithm was robust to changes in these values. We take 1000 steps for each landmark and use batch size |B_s| = 1000 unless noted otherwise. We set the kernel width η = Σ_i σ̂_i², where σ̂_i² is an empirical approximation of the variance of the ith dimension of the data. To initialize each landmark, we draw from a Gaussian with the empirical mean and diagonal covariance of the data.
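The setup above fixes the step-size schedule, kernel width, batching, and landmark initialization, but not the gradient of the GP landmarking objective, which this summary does not reproduce. The sketch below wires the stated pieces together around a placeholder gradient (a kernel-weighted mean shift, chosen purely for illustration); it is an assumption-laden reconstruction, not the paper's Algorithm 1.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 50))  # placeholder data matrix

# Kernel width eta = sum_i sigma_hat_i^2: the summed empirical
# per-dimension variances of the data, as set in the paper.
eta = X.var(axis=0).sum()

# Step size rho_s = (s0 + s)^(-tau) with s0 = 10 and tau = 0.51.
s0, tau = 10.0, 0.51

def rho(s):
    return (s0 + s) ** (-tau)

# Initialize the landmark from a Gaussian with the empirical mean and
# diagonal covariance of the data, as described above.
mean, std = X.mean(axis=0), X.std(axis=0)
landmark = mean + std * rng.normal(size=X.shape[1])

def grad_placeholder(landmark, batch, eta):
    """Stand-in for the gradient of the paper's GP landmarking objective
    (not reproduced in this summary); a kernel-weighted mean shift."""
    diff = batch - landmark
    w = np.exp(-np.sum(diff ** 2, axis=1) / eta)  # RBF-style weights with width eta
    return (w[:, None] * diff).mean(axis=0)

# 1000 stochastic steps per landmark on minibatches of size |B_s| = 1000.
for s in range(1000):
    batch = X[rng.choice(len(X), size=1000, replace=False)]
    landmark += rho(s) * grad_placeholder(landmark, batch, eta)
```

With τ = 0.51 the schedule satisfies the usual Robbins-Monro conditions, which is consistent with the paper's remark that the algorithm was robust to changes in s0 and τ.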