Trusting SVM for Piecewise Linear CNNs
Authors: Leonard Berrada, Andrew Zisserman, M. Pawan Kumar
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using the MNIST, CIFAR and ImageNet data sets, we show that our approach always improves over the state of the art variants of backpropagation and scales to large data and large network settings. |
| Researcher Affiliation | Academia | Leonard Berrada¹, Andrew Zisserman¹ and M. Pawan Kumar¹,² (¹Department of Engineering Science, University of Oxford; ²Alan Turing Institute). {lberrada,az,pawan}@robots.ox.ac.uk |
| Pseudocode | Yes | Algorithm 1 describes the main steps of CCCP, the Concave-Convex Procedure. (A generic sketch follows the table.) |
| Open Source Code | Yes | our implementation is available at http://github.com/oval-group/pl-cnn |
| Open Datasets | Yes | Using standard network architectures and publicly available data sets, we show that our algorithm provides a boost over the state of the art variants of backpropagation for learning PL-CNNs and we demonstrate scalability of the method. |
| Dataset Splits | Yes | The training data set consists of 60,000 grayscale images of size 28 × 28 with 10 classes, which we split into 50,000 samples for training and 10,000 for validation. (A split sketch follows the table.) |
| Hardware Specification | Yes | All experiments are conducted on a GPU (Nvidia Titan X) |
| Software Dependencies | No | The paper mentions using 'Theano (Bergstra et al., 2010; Bastien et al., 2012)' but does not specify a version number for Theano or any other software dependencies. |
| Experiment Setup | Yes | The number of epochs is set to 200, 100 and 100 for Adagrad, Adadelta and Adam... The regularization hyperparameter λ and the initial learning rate are chosen by cross-validation. λ is set to 0.001 for all solvers, and the initial learning rates can be found in Appendix C. For LW-SVM, λ is set to the same value as the baseline, and the proximal term µ to µ = 10λ = 0.01. (These settings are collected in a sketch after the table.) |
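The Pseudocode row refers to the paper's Algorithm 1, the Concave-Convex Procedure (CCCP). For orientation, here is a minimal generic CCCP loop on a toy difference-of-convex objective. The toy objective and its closed-form convex subproblem are illustrative assumptions; the paper's actual layer-wise SVM subproblem is different.

```python
import numpy as np

def cccp(solve_convex, grad_v, x0, max_iter=50, tol=1e-8):
    """Generic CCCP sketch: minimize f(x) = u(x) + v(x), with u convex
    and v concave, by linearizing v at the current iterate and solving
    the convex surrogate argmin_x u(x) + grad_v(x_t) @ x at each step."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_v(x)              # gradient of the concave part at x_t
        x_next = solve_convex(g)   # argmin_x u(x) + g @ x
        if np.linalg.norm(x_next - x) < tol:
            break
        x = x_next
    return x

# Toy objective (an assumption, not the paper's): f(x) = x^4 - x^2,
# with u(x) = x^4 (convex) and v(x) = -x^2 (concave). The surrogate
# argmin_x x^4 + g*x has the closed form x = cbrt(-g/4).
grad_v = lambda x: -2.0 * x
solve_convex = lambda g: np.cbrt(-g / 4.0)

x_star = cccp(solve_convex, grad_v, x0=[1.0])
print(x_star)  # ~[0.7071], the minimizer of x^4 - x^2 for x > 0
```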
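The Dataset Splits row reports a 50,000/10,000 train/validation split of MNIST's 60,000 training images. A minimal sketch of reproducing such a split; the random shuffle, the seed, and the `load_mnist_train` loader are assumptions, since the paper does not state how the split was drawn.

```python
import numpy as np

def split_train_val(images, labels, n_train=50_000, seed=0):
    """Split the 60,000 MNIST training images into 50,000 for training
    and 10,000 for validation. The shuffle and seed are assumptions;
    the paper only reports the split sizes."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    train, val = order[:n_train], order[n_train:]
    return (images[train], labels[train]), (images[val], labels[val])

# Usage with a hypothetical loader for the standard 60,000-image set:
# x, y = load_mnist_train()                  # x: (60000, 28, 28), y: (60000,)
# (x_tr, y_tr), (x_val, y_val) = split_train_val(x, y)
```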
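The Experiment Setup row pins down most of the headline settings. Collected here as a plain configuration dict for quick reference; the key names are our own and the initial learning rates are deferred to Appendix C of the paper, so they are deliberately left unspecified.

```python
# Hyperparameters as reported in the paper; key names are ours.
config = {
    "epochs": {"adagrad": 200, "adadelta": 100, "adam": 100},
    "lambda_reg": 1e-3,    # regularization λ, shared by all solvers
    "mu_proximal": 1e-2,   # LW-SVM proximal term, µ = 10λ = 0.01
    "initial_lr": None,    # per-solver values: see Appendix C of the paper
}
```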