Landmarking Manifolds with Gaussian Processes
Authors: Dawen Liang, John Paisley
ICML 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our manifold landmarking algorithm on images, text and music data. We show the test accuracy as a function of the number of landmarks for our manifold landmarking algorithm (ML), random selection (Rand), and active learning (Act5K) in Figure 5. |
| Researcher Affiliation | Academia | Dawen Liang (dliang@ee.columbia.edu), John Paisley (jpaisley@columbia.edu), Department of Electrical Engineering, Columbia University, New York, NY, USA |
| Pseudocode | Yes | Algorithm 1 Manifold landmarking with GPs |
| Open Source Code | No | The paper does not provide any explicit statement or link regarding the release of open-source code for the described methodology. |
| Open Datasets | Yes | We perform experiments on the Million Song Dataset (Bertin-Mahieux et al., 2011) which contains the audio features and metadata (user tagging information from Last.fm) for one million songs. We consider the handwritten digit classification problem on the MNIST dataset (LeCun et al., 1998), and use a low-dimensional representation from different landmark approaches to evaluate their performance. In Figure 2 we show the first eight landmarks from the Yale faces database. (footnote: http://www.cad.zju.edu.cn/home/dengcai/Data/FaceData.html) |
| Dataset Splits | Yes | We use 50,000 images for training to learn both the landmarks and to train the classifier. We use 10,000 images as a validation set to select the regularization parameter among λ = {0.001, 0.01, ..., 1000}, and another 10,000 images for classification testing. For each logistic regression model, we use 5-fold cross-validation to search for the best regularization parameter among λ = {0.001, 0.01, ..., 1000}. (A cross-validation sketch follows the table.) |
| Hardware Specification | No | The paper mentions running experiments 'on a laptop computer' in Figure 4(b), but it does not provide specific hardware details like CPU/GPU models, memory, or other specifications. |
| Software Dependencies | No | The paper mentions using algorithms and models like t-SNE, k-means++, and logistic regression but does not provide specific version numbers for any software dependencies or libraries used for implementation. |
| Experiment Setup | Yes | For all problems we use a step size of ρ_s = (s₀ + s)^(−τ) with s₀ = 10 and τ = 0.51. The algorithm was robust to changes in these values. We take 1000 steps for each landmark and use batch size \|B_s\| = 1000 unless noted otherwise. We set the kernel width η = Σᵢ σ̂ᵢ², where σ̂ᵢ² is an empirical approximation of the variance of the ith dimension of the data. To initialize each landmark, we draw from a Gaussian with the empirical mean and diagonal covariance of the data. (A setup sketch follows the table.) |
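
A minimal sketch of the experiment-setup hyperparameters quoted above, assuming the data is a NumPy matrix `X` of shape `(n_samples, n_dims)`. The helper names `step_size` and `setup` are illustrative, not from the paper; only the constants (s₀ = 10, τ = 0.51), the kernel-width formula, and the Gaussian initialization are taken from the quoted text.

```python
import numpy as np

s0, tau = 10.0, 0.51  # step-size constants reported in the paper


def step_size(s):
    """Robbins-Monro schedule rho_s = (s0 + s)^(-tau)."""
    return (s0 + s) ** (-tau)


def setup(X):
    # Kernel width eta = sum of per-dimension empirical variances.
    eta = X.var(axis=0).sum()
    # Each landmark is initialized from a Gaussian with the empirical
    # mean and diagonal covariance of the data.
    mu = X.mean(axis=0)
    std = X.std(axis=0)
    init_landmark = mu + std * np.random.randn(X.shape[1])
    return eta, init_landmark
```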
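
For the dataset-splits row, a hedged sketch of the paper's regularization search: 5-fold cross-validation over λ ∈ {0.001, 0.01, ..., 1000} for logistic regression. The use of scikit-learn is an assumption (the paper names no library), and `X_train`/`y_train` are placeholder names; note that scikit-learn parameterizes logistic regression by C = 1/λ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

lambdas = 10.0 ** np.arange(-3, 4)  # 0.001, 0.01, ..., 1000
param_grid = {"C": 1.0 / lambdas}   # sklearn uses the inverse convention C = 1/lambda

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    cv=5,  # 5-fold cross-validation as stated in the paper
)
# X_train: landmark-based features; y_train: class labels (assumed names)
# search.fit(X_train, y_train)
# best_lambda = 1.0 / search.best_params_["C"]
```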