Large-Margin Convex Polytope Machine

Authors: Alex Kantchelian, Michael C. Tschantz, Ling Huang, Peter L. Bartlett, Anthony D. Joseph, J. D. Tygar

NeurIPS 2014

Reproducibility variables, assessed results, and supporting LLM responses:
Research Type: Experimental. Our experimental evaluations of the CPM on large-scale datasets from distinct domains (MNIST handwritten digit recognition, text topic classification, and web security) demonstrate that the CPM trains models faster, sometimes by several orders of magnitude, than state-of-the-art similar approaches and kernel-SVM methods while achieving comparable or better classification performance.
Researcher Affiliation: Collaboration. UC Berkeley ({akant|mct|bartlett|adj|tygar}@cs.berkeley.edu) and DataVisor (ling.huang@datavisor.com).
Pseudocode: Yes. Algorithm 1: stochastic gradient descent algorithm for solving problem (6). Algorithm 2: heuristic maximum assignment algorithm; its inputs are the current weight matrix W, a positive instance x, and the desired assignment entropy h₀.
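The algorithms themselves are not reproduced in this summary. As a rough illustration of the kind of update Algorithm 1 performs, here is a minimal single-sided sketch in Python. It is not the authors' code: the 1/(λt) step size is a common SGD schedule assumed here, and the plain argmax assignment of positives stands in for the entropy-controlled heuristic of Algorithm 2.

```python
import numpy as np

def cpm_sgd(X, y, K=5, T=100_000, lam=1e-5, seed=0):
    """Sketch of SGD training for a single-sided convex polytope machine.

    K hyperplanes; each positive instance is assigned to its current
    argmax hyperplane (a stand-in for the paper's entropy-controlled
    Algorithm 2), and negatives are pushed below the margin on every
    violating hyperplane. X is assumed dense, y in {-1, +1}.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = np.zeros((K, d))
    for t in range(1, T + 1):
        eta = 1.0 / (lam * t)            # Pegasos-style schedule (assumption)
        i = rng.integers(n)
        x, label = X[i], y[i]
        scores = W @ x
        W *= 1.0 - eta * lam             # gradient of the L2 regularizer
        if label > 0:
            k = int(np.argmax(scores))   # heuristic assignment to best hyperplane
            if scores[k] < 1.0:          # hinge margin violated
                W[k] += eta * x
        else:
            for k in np.flatnonzero(scores > -1.0):
                W[k] -= eta * x          # push every violating hyperplane down
    return W

def cpm_predict(W, X):
    # Positive iff the max over the K hyperplanes is above zero.
    return (X @ W.T).max(axis=1) > 0
```

The paper's actual evaluation uses a double-sided CPM (one polytope per class); this sketch shows only one side.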
Open Source Code: No. The paper does not provide concrete access to source code for the methodology described.
Open Datasets: Yes. We use four datasets to evaluate the CPM: (1) the MNIST dataset of labeled handwritten digits encoded as 28×28 grayscale pictures [11, 12] (60,000 training and 10,000 testing instances); (2) the MNIST8m dataset of 8,100,000 pictures obtained by applying various random deformations to the MNIST training instances [13]; (3) a URL dataset [12] used for malicious URL detection [14] (1.1 million training and 1.1 million testing instances in a very high-dimensional space of more than 2.3 million features); and (4) the RCV1-bin dataset [12], a binary classification task (separating the corporate and economics categories from the government and markets categories [15]) defined over the RCV1 dataset of news articles (20,242 training and 677,399 testing instances).
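These datasets are distributed in LIBSVM sparse format (reference [12] is the LIBSVM dataset collection), so a minimal loading sketch looks like the following; the local file names are hypothetical and assume the RCV1-bin files have already been downloaded.

```python
from sklearn.datasets import load_svmlight_file

# Hypothetical local file names; the RCV1-bin files are available from
# the LIBSVM dataset collection cited as [12].
X_train, y_train = load_svmlight_file("rcv1_train.binary")  # 20,242 instances
X_test, y_test = load_svmlight_file(
    "rcv1_test.binary", n_features=X_train.shape[1]         # 677,399 instances
)
print(X_train.shape, X_test.shape)
```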
Dataset Splits: Yes. All four datasets have well-defined training and testing subsets. To tune each algorithm's meta-parameters (λ and h for the CPM, C and γ for RBF-SVM, and λ for AMM), we randomly select a fixed validation subset from the training set (10,000 instances for MNIST-2/MNIST8m-2; 1,000 instances for RCV1-bin/URL).
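A minimal sketch of carving out such a fixed validation subset (the helper name and seed are illustrative; defaults match the 1,000-instance case for RCV1-bin/URL):

```python
import numpy as np

def hold_out_validation(X, y, n_val=1000, seed=0):
    """Randomly hold out a fixed validation subset from the training set
    (1,000 instances for RCV1-bin/URL, 10,000 for MNIST-2/MNIST8m-2)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[0])
    return (X[idx[n_val:]], y[idx[n_val:]],   # remaining training instances
            X[idx[:n_val]], y[idx[:n_val]])   # fixed validation subset
```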
Hardware Specification: Yes. Unless stated otherwise, we used one core of an Intel Xeon E5 (3.2 GHz, 64 GB RAM) for experiments.
Software Dependencies: No. The paper mentions the LibSVM [16] implementation but does not provide specific version numbers for any software dependencies.
Experiment Setup: Yes. For the CPM, we use a double-sided CPM as described in Section 3.1, where both CPMs share the same meta-parameters. We start by fixing a number of iterations T and a number of hyperplanes K that will result in a reasonable execution time, effectively treating these parameters as a computational budget, and we experimentally demonstrate that increasing either K or T always decreases the testing error. Once these are selected, we let h = 0 and select the best λ in {T⁻¹, 10 T⁻¹, …, 10⁴ T⁻¹}. We then choose h from {0, (log K)/10, 2(log K)/10, …, 9(log K)/10}, effectively performing one round of coordinate descent on (λ, h).
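Concretely, the tuning procedure is one round of coordinate descent over (λ, h). A minimal sketch, where `validation_error` is a hypothetical stand-in for training a double-sided CPM with the given meta-parameters and scoring it on the held-out validation subset:

```python
import numpy as np

T, K = 100_000, 10  # iteration and hyperplane budget, fixed up front (illustrative values)

# Meta-parameter grids from the quoted setup.
lam_grid = [10**j / T for j in range(5)]          # {T^-1, 10 T^-1, ..., 10^4 T^-1}
h_grid = [j * np.log(K) / 10 for j in range(10)]  # {0, (log K)/10, ..., 9 (log K)/10}

def validation_error(lam, h):
    """Hypothetical stand-in: train a double-sided CPM with these
    meta-parameters for T iterations and return its validation error."""
    raise NotImplementedError  # replace with actual training + scoring

# One round of coordinate descent on (lambda, h): tune lambda at h = 0,
# then tune h at the best lambda found.
best_lam = min(lam_grid, key=lambda lam: validation_error(lam, h=0.0))
best_h = min(h_grid, key=lambda h: validation_error(best_lam, h))
```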