Inductive Two-Layer Modeling with Parametric Bregman Transfer

Authors: Vignesh Ganapathiraman, Zhan Shi, Xinhua Zhang, Yaoliang Yu

ICML 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We evaluated the proposed inductive training of the convexified two-layer model (CVX-IN) by comparing generalization accuracy with four other baselines: FFNN, a two-layer feedforward neural network; Ker-CVX, the kernel-based convex model proposed by Aslan et al. (2014); LOCAL, a model obtained by alternating minimization of the two-layer objective (3); and CVX-TR, our model learned transductively (see below). SVM was not included since it was already shown inferior to Ker-CVX by Aslan et al. (2014). [...] All methods were applied to two different sizes of training and test data (Xtrain and Xtest), 100/100 and 200/200, and the resulting test error, averaged over 10 trials, is presented in Tables 1 and 2 respectively." |
| Researcher Affiliation | Academia | ¹Department of Computer Science, University of Illinois at Chicago, USA; ²School of Computer Science, University of Waterloo, Canada. |
| Pseudocode | Yes | Algorithm 1: General GCG algorithm [...] Algorithm 2: Solve (6) for T by the GCG algorithm [...] Algorithm 3: Local optimization used by GCG |
| Open Source Code | No | No statement regarding open-source code availability, and no link to a code repository, was found. |
| Open Datasets | Yes | "We first used smaller datasets, including a synthetic XOR dataset and three real-world datasets for binary classification: Letter (Lichman, 2013); CIFAR-SM, a binary classification dataset from Aslan et al. (2013) based on CIFAR-100 (Krizhevsky & Hinton, 2009); and G241N (Chapelle). [...] Letter, XOR, and CIFAR-10 (Krizhevsky & Hinton, 2009)" |
| Dataset Splits | No | The paper applies all methods to 'training and test data (Xtrain and Xtest): 100/100 and 200/200' and mentions a 'model selection method, e.g. cross validation' in general terms, but it does not give the size or proportion of any dedicated validation split for the reported experiments. |
| Hardware Specification | No | The paper reports training times and resource usage (e.g., 'it took 2.5 hours on CIFAR-10 with 2000 examples and 256 features') but does not specify hardware details such as CPU model, GPU model, or memory. |
| Software Dependencies | No | The paper does not list any software dependencies with version numbers; it refers to 'LBFGS' as a solver but gives no version. |
| Experiment Setup | No | The paper states that 'the weight of both regularization terms can be tuned by any model selection method, e.g. cross validation,' that 'we finely tuned all parameters by backpropagation,' and that the optimization uses a 'warm start,' but it does not provide specific hyperparameter values such as learning rates, batch sizes, or optimizer choices. |
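The pseudocode row above refers to GCG (generalized conditional gradient). The paper's Algorithms 1–3 are not reproduced here; the following is only a minimal sketch of the generic conditional-gradient (Frank–Wolfe) template on a toy problem — least squares over an ℓ1 ball — and the function name `gcg_l1`, the problem choice, and the step-size rule are illustrative assumptions, not the paper's method:

```python
import numpy as np

def gcg_l1(A, b, tau, iters=300):
    """Conditional-gradient (Frank-Wolfe) loop for
    min_x 0.5 * ||A x - b||^2  subject to  ||x||_1 <= tau.
    Toy sketch only; not the paper's Algorithm 2."""
    n = A.shape[1]
    x = np.zeros(n)
    for k in range(iters):
        grad = A.T @ (A @ x - b)
        # Linear minimization oracle over the l1 ball:
        # the best atom is a signed vertex of the ball.
        i = int(np.argmax(np.abs(grad)))
        s = np.zeros(n)
        s[i] = -tau * np.sign(grad[i])
        # Standard open-loop step size 2 / (k + 2)
        gamma = 2.0 / (k + 2.0)
        x = (1.0 - gamma) * x + gamma * s
    return x
```

Each iteration calls a cheap linear oracle instead of a projection, which is the structural reason GCG-style methods scale to atomic-norm-regularized problems like the paper's.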
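The Experiment Setup row notes that the regularization weights 'can be tuned by any model selection method, e.g. cross validation' without reporting the chosen values. As a hedged illustration of what such tuning looks like — on plain ridge regression rather than the paper's two-layer objective, with the helper name `kfold_cv_lambda` being hypothetical — one could write:

```python
import numpy as np

def kfold_cv_lambda(X, y, lambdas, k=5):
    """Pick the ridge penalty with the lowest mean k-fold validation MSE.
    Generic sketch of cross-validated model selection; not the paper's code."""
    n = len(y)
    folds = np.array_split(np.arange(n), k)
    best_lam, best_err = None, np.inf
    for lam in lambdas:
        errs = []
        for val_idx in folds:
            mask = np.ones(n, dtype=bool)
            mask[val_idx] = False           # hold out one fold for validation
            Xtr, ytr = X[mask], y[mask]
            # Closed-form ridge solution on the remaining folds
            w = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(X.shape[1]),
                                Xtr.T @ ytr)
            errs.append(np.mean((X[val_idx] @ w - y[val_idx]) ** 2))
        mean_err = float(np.mean(errs))
        if mean_err < best_err:
            best_lam, best_err = lam, mean_err
    return best_lam
```

A reproducible report would state the candidate grid, the number of folds, and the selected value — exactly the details this row flags as missing.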