Linear-Time Learning on Distributions with Approximate Kernel Embeddings
Authors: Danica Sutherland, Junier Oliva, Barnabás Póczos, Jeff Schneider
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide an analysis of the approximation error in using our proposed random features, and show empirically the quality of our approximation both in estimating a Gram matrix and in solving learning tasks in real-world and synthetic data. |
| Researcher Affiliation | Academia | Dougal J. Sutherland and Junier B. Oliva and Barnabás Póczos and Jeff Schneider, Carnegie Mellon University, {dsutherl,joliva,bapoczos,schneide}@cs.cmu.edu |
| Pseudocode | Yes | The algorithm for computing features {z(Â(p̂_i))}_{i=1}^N for a set of distributions {p_i}_{i=1}^N, given sample sets {χ_i}_{i=1}^N where χ_i = {X_j^(i) ∈ [0, 1]^ℓ}_{j=1}^{n_i} iid ∼ p_i, is thus: 1. Draw M scalars λ_j iid ∼ μ_Z and D/2 vectors ω_r iid ∼ N(0, σ^{-2} I_{2M|V|}), in O(M|V|D) time. 2. For each of the N input distributions i: (a) Compute a kernel density estimate from χ_i, p̂_i(u_j) for each u_j in (10), in O(n_i n_e) time. (b) Compute Â(p̂_i) using a numerical integration estimate as in (10), in O(M|V| n_e) time. (c) Get the RKS features, z(Â(p̂_i)), in O(M|V|D) time. |
| Open Source Code | No | The paper mentions a GitHub link in footnote 3: 'github.com/dougalsutherland/skl-groups/', but explicitly states it's for the KL kernel ('as did the KL kernel3'), not the authors' main contribution (HDD embeddings). The paper also states: 'while the HDD embeddings used a simple Matlab implementation.', indicating their own code is not openly provided. |
| Open Datasets | Yes | We took the cat and dog classes from the CIFAR-10 dataset (Krizhevsky and Hinton 2009). We consider the Scene-15 dataset (Lazebnik, Schmid, and Ponce 2006). |
| Dataset Splits | Yes | Throughout these experiments we use M = 5, |V | = 10ℓ (selected as rules of thumb; larger values did not improve performance), and use a validation set (10% of the training set) to choose bandwidths for KDE and the RBF kernel as well as model regularization parameters. |
| Hardware Specification | No | No specific hardware details (like GPU or CPU models, or memory specifications) were provided for the experiments. |
| Software Dependencies | No | The paper mentions 'a simple Matlab implementation' for HDD embeddings and that SVM classifiers were used 'from LIBLINEAR (Fan et al. 2008, for the embeddings) or LIBSVM (Chang and Lin 2011, for the KL kernel)', but no specific version numbers for Matlab, LIBLINEAR, or LIBSVM are provided. |
| Experiment Setup | Yes | Throughout these experiments we use M = 5, |V| = 10ℓ (selected as rules of thumb; larger values did not improve performance), and use a validation set (10% of the training set) to choose bandwidths for KDE and the RBF kernel as well as model regularization parameters. Except in the scene classification experiments, the histogram methods used 10 bins per dimension; performance with other values was not better. The KL estimator used the fourth nearest neighbor. ...we use D = 5000. ...with D = 7000. |
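The final step of the quoted pseudocode, z(Â(p̂_i)), is the standard random kitchen sinks (random Fourier features) construction of Rahimi and Rechtdraw: D/2 Gaussian frequency vectors, then paired cosine/sine projections whose inner products approximate an RBF kernel. The following is a minimal numpy sketch of that generic step only, not the authors' Matlab implementation; the function name `rks_features` and the toy inputs are illustrative assumptions.

```python
import numpy as np

def rks_features(X, D, sigma, rng):
    """Random Fourier (RKS) features for the RBF kernel
    k(x, y) = exp(-||x - y||^2 / (2 sigma^2)), so z(x).z(y) ~= k(x, y)."""
    d = X.shape[1]
    # Draw D/2 frequency vectors omega_r ~ N(0, sigma^{-2} I_d).
    omega = rng.normal(scale=1.0 / sigma, size=(d, D // 2))
    proj = X @ omega  # shape (n, D/2): projections omega_r . x
    # Paired cos/sin features, scaled so the dot product is an unbiased
    # estimate of the RBF kernel.
    return np.sqrt(2.0 / D) * np.hstack([np.cos(proj), np.sin(proj)])

# Toy check: with a large D, the feature inner products should be close
# to the exact RBF Gram matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))  # stand-in for the embeddings A-hat(p-hat_i)
Z = rks_features(X, D=20000, sigma=1.0, rng=rng)
approx_gram = Z @ Z.T
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
exact_gram = np.exp(-sq_dists / 2.0)
max_err = np.abs(approx_gram - exact_gram).max()
```

In the paper's pipeline the rows of `X` would be the 2M|V|-dimensional estimated embeddings Â(p̂_i) from step 2(b), and the resulting `Z` feeds a linear model (e.g. LIBLINEAR), which is what makes the overall method linear-time in the number of distributions.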