Convexified Convolutional Neural Networks
Authors: Yuchen Zhang, Percy Liang, Martin J. Wainwright
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we find that CCNNs achieve competitive or better performance than CNNs trained by backpropagation, SVMs, fully-connected neural networks, stacked denoising auto-encoders, and other baseline methods. |
| Researcher Affiliation | Academia | Yuchen Zhang (Stanford University, CA, USA), Percy Liang (Stanford University, CA, USA), Martin J. Wainwright (University of California, Berkeley, CA, USA). |
| Pseudocode | Yes | Algorithm 1: Learning two-layer CCNNs. ... Algorithm 2: Learning multi-layer CCNNs. (A compact sketch of the two-layer procedure appears after this table.) |
| Open Source Code | Yes | Code and reproducible experiments are available on the CodaLab platform: http://worksheets.codalab.org/worksheets/0x1468d91a878044fba86a5446f52aacde/ |
| Open Datasets | Yes | The experiments use the publicly available MNIST variations benchmark (Variations MNIST), for which a standard 10k/2k/50k train/validation/test partitioning exists. |
| Dataset Splits | Yes | For all datasets, we use 10,000 images for training, 2,000 images for validation, and 50,000 images for testing. This 10k/2k/50k partitioning is standard for the MNIST variations benchmark. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running experiments. |
| Software Dependencies | No | The paper names methods such as the Gaussian kernel, projected SGD, and random feature approximation, but does not specify the software packages or version numbers used to implement them. |
| Experiment Setup | Yes | The loss function is chosen as the 10-class logistic loss. We use the Gaussian kernel for the CCNN. The feature matrix Z(x) is constructed via random feature approximation (Rahimi & Recht, 2007) with dimension m = 500 for the first convolutional layer and m = 1000 for the second. ... The convex optimization problem is solved by projected SGD with mini-batches of size 50. ... Each convolutional layer is constructed on 5x5 patches with unit stride, followed by 2x2 average pooling. The first and second convolutional layers contain 16 and 32 filters, respectively. (Sketches of the random-feature map and the projection step appear below.) |
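
The Z(x) construction cited in the experiment setup follows the random feature approximation of Rahimi & Recht (2007). Below is a minimal numpy sketch of that feature map for the Gaussian kernel; the bandwidth `gamma` and the helper name are illustrative, not taken from the paper.

```python
import numpy as np

def random_fourier_features(patches, m, gamma, rng):
    """Approximate Gaussian-kernel features (Rahimi & Recht, 2007).

    patches: (N, d) array of vectorized image patches.
    Returns z such that z(x) @ z(y) ~= exp(-gamma * ||x - y||^2).
    """
    d = patches.shape[1]
    # Sample frequencies from the kernel's Fourier transform: N(0, 2*gamma*I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, m))
    b = rng.uniform(0.0, 2.0 * np.pi, size=m)
    return np.sqrt(2.0 / m) * np.cos(patches @ W + b)
```

With m = 500 as in the paper's first layer, each 5x5 patch (d = 25 for a grayscale input) maps to a 500-dimensional feature vector.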
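
Projected SGD on the CCNN objective requires a Euclidean projection onto a nuclear-norm ball. The paper does not spell this step out; the standard construction is an SVD followed by projection of the singular values onto an l1-ball (Duchi et al., 2008), sketched here under that assumption.

```python
import numpy as np

def project_nuclear_ball(A, R):
    """Project A onto {M : ||M||_* <= R} (sum of singular values at most R)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    if s.sum() <= R:
        return A
    # Project the singular values onto the l1-ball of radius R; numpy already
    # returns them sorted in descending order (Duchi et al., 2008).
    cssv = np.cumsum(s) - R
    rho = np.nonzero(s * np.arange(1, len(s) + 1) > cssv)[0][-1]
    theta = cssv[rho] / (rho + 1.0)
    return (U * np.maximum(s - theta, 0.0)) @ Vt
```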
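
Putting the pieces together, here is a compact sketch in the spirit of Algorithm 1 for the two-layer case, assuming the 10-class logistic (softmax) loss named in the setup and omitting pooling and the multi-layer recursion of Algorithm 2. It reuses the two helpers above; `X_patches`, `R`, `gamma`, and the learning-rate choice are illustrative, not the paper's exact values.

```python
import numpy as np

def train_two_layer_ccnn(X_patches, y, m, R, gamma,
                         epochs=10, lr=0.1, batch=50, seed=0):
    """X_patches: (n, P, d) patches per image; y: (n,) integer class labels."""
    rng = np.random.default_rng(seed)
    n, P, d = X_patches.shape
    C = int(y.max()) + 1
    # Step 1: kernelize all patches with the random feature map sketched above.
    Z = random_fourier_features(X_patches.reshape(-1, d), m, gamma, rng)
    Z = Z.reshape(n, P, m)
    # Step 2: solve the nuclear-norm-constrained convex program by projected SGD.
    A = np.zeros((C * P, m))  # one weight vector per (class, patch position)
    for _ in range(epochs):
        for idx in np.array_split(rng.permutation(n), max(1, n // batch)):
            Zb = Z[idx]                                           # (B, P, m)
            scores = np.einsum('bpm,cpm->bc', Zb, A.reshape(C, P, m))
            p = np.exp(scores - scores.max(axis=1, keepdims=True))
            p /= p.sum(axis=1, keepdims=True)
            p[np.arange(len(idx)), y[idx]] -= 1.0                 # softmax-loss gradient
            G = np.einsum('bc,bpm->cpm', p, Zb) / len(idx)
            A = project_nuclear_ball(A - lr * G.reshape(C * P, m), R)
    # Step 3: an SVD of A recovers the convolutional filters (right singular
    # vectors, in random-feature coordinates) and the output weights.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return A, U, s, Vt
```

The nuclear-norm ball is the convex relaxation of the rank constraint that ties the per-position, per-class weight vectors to a small set of shared filters; that relaxation is what makes the problem convex while retaining the parameter-sharing structure of a CNN.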