Large-Margin Convex Polytope Machine
Authors: Alex Kantchelian, Michael C Tschantz, Ling Huang, Peter L Bartlett, Anthony D Joseph, J. D. Tygar
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental evaluations of the CPM on large-scale datasets from distinct domains (MNIST handwritten digit recognition, text topic, and web security) demonstrate that the CPM trains models faster, sometimes several orders of magnitude, than state-of-the-art similar approaches and kernel-SVM methods while achieving comparable or better classification performance. |
| Researcher Affiliation | Collaboration | UC Berkeley {akant|mct|bartlett|adj|tygar}@cs.berkeley.edu Datavisor ling.huang@datavisor.com |
| Pseudocode | Yes | Algorithm 1: Stochastic gradient descent algorithm for solving problem (6). Algorithm 2: Heuristic maximum assignment algorithm; its input is the current weight matrix W, a positive instance x, and the desired assignment entropy h₀. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | We use four data sets to evaluate the CPM: (1) the MNIST dataset of labeled handwritten digits encoded as 28×28 grayscale pictures [11, 12] (60,000 training and 10,000 testing instances); (2) the MNIST8m dataset of 8,100,000 pictures obtained by applying various random deformations to the MNIST training instances [13]; (3) a URL dataset [12] used for malicious URL detection [14] (1.1 million training and 1.1 million testing instances in a very high-dimensional space of more than 2.3 million features); and (4) the RCV1-bin dataset [12] corresponding to a binary classification task (separating the corporate and economics categories from the government and markets categories [15]) defined over the RCV1 dataset of news articles (20,242 training and 677,399 testing instances). |
| Dataset Splits | Yes | All four datasets have well-defined training and testing subsets. To tune each algorithm's meta-parameters (λ and h for the CPM, C and γ for RBF-SVM, and λ for AMM), we randomly select a fixed validation subset from the training set (10,000 instances for MNIST-2/MNIST8m-2; 1,000 instances for RCV1-bin/URL). |
| Hardware Specification | Yes | Unless stated otherwise, we used one core of an Intel Xeon E5 (3.2 GHz, 64 GB RAM) for experiments. |
| Software Dependencies | No | The paper mentions the "LibSVM [16] implementation" but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For the CPM, we use a double-sided CPM as described in Section 3.1, where both CPMs share the same meta-parameters. We start by fixing a number of iterations T and a number of hyperplanes K that result in a reasonable execution time, effectively treating these parameters as a computational budget, and we experimentally demonstrate that increasing either K or T always decreases the testing error. Once these are selected, we let h = 0 and select the best λ in {T⁻¹, 10·T⁻¹, …, 10⁴·T⁻¹}. We then choose h from {0, (log K)/10, 2(log K)/10, …, 9(log K)/10}, effectively performing a one-round coordinate descent on (λ, h). |
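The tuning procedure quoted above (fix h = 0, sweep λ, then fix the best λ and sweep h over fractions of the maximum entropy log K) can be sketched as a one-round coordinate descent. This is a minimal illustration, not the authors' code: `val_error` is a hypothetical callback that would train a double-sided CPM with the given (λ, h) and return its validation error.

```python
import math

def one_round_coordinate_descent(val_error, K, T):
    """One-round coordinate descent over (lambda, h), as described in the paper.

    val_error(lam, h) -> float is a hypothetical callback (an assumption here)
    that trains the model with meta-parameters (lam, h) and returns the
    error on the held-out validation subset.
    """
    # Stage 1: fix h = 0 and sweep lambda over {T^-1, 10*T^-1, ..., 10^4*T^-1}.
    lambdas = [(10 ** i) / T for i in range(5)]
    best_lam = min(lambdas, key=lambda lam: val_error(lam, 0.0))

    # Stage 2: fix the chosen lambda and sweep the entropy target h over
    # {0, (log K)/10, 2(log K)/10, ..., 9(log K)/10}; log K is the maximum
    # achievable assignment entropy with K hyperplanes.
    hs = [i * math.log(K) / 10 for i in range(10)]
    best_h = min(hs, key=lambda h: val_error(best_lam, h))

    return best_lam, best_h
```

With K hyperplanes the assignment entropy lies in [0, log K], so the h grid covers that range in tenths; the λ grid spans five orders of magnitude scaled by 1/T.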