Deep Learning as a Mixed Convex-Combinatorial Optimization Problem
Authors: Abram L. Friesen, Pedro Domingos
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we show that our algorithm improves classification accuracy in a number of settings, including for AlexNet and ResNet-18 on ImageNet, when compared to the straight-through estimator. |
| Researcher Affiliation | Academia | Abram L. Friesen and Pedro Domingos Paul G. Allen School of Computer Science and Engineering University of Washington Seattle, WA 98195, USA {afriesen,pedrod}@cs.washington.edu |
| Pseudocode | Yes | Algorithm 1: Train an ℓ-layer hard-threshold network Y = f(X; W) on dataset D = (X, T_ℓ) with feasible target propagation (FTPROP) using loss functions L = {L_d}_{d=1}^ℓ. |
| Open Source Code | Yes | Code for the experiments is available at https://github.com/afriesen/ftprop. |
| Open Datasets | Yes | We tested these training methods on the CIFAR-10 (Krizhevsky, 2009) and ImageNet (ILSVRC 2012) (Russakovsky et al., 2015) datasets. |
| Dataset Splits | Yes | On CIFAR-10, which has 50K training images and 10K test images divided into 10 classes... On ImageNet, a much more challenging dataset with roughly 1.2M training images and 50K validation images divided into 1000 classes. |
| Hardware Specification | Yes | All experiments were performed using PyTorch (http://pytorch.org/). CIFAR-10 experiments with the 4-layer convolutional network were performed on an NVIDIA Titan X. All other experiments were performed on NVIDIA Tesla P100 devices in a DGX-1. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number for it or any other key software components. |
| Experiment Setup | Yes | Adam (Kingma & Ba, 2015) with learning rate 2.5e-4 and weight decay 5e-4 was used to minimize the cross-entropy loss for 300 epochs. The learning rate was decayed by a factor of 0.1 after 200 and 250 epochs. |
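
The paper trains networks of hard-threshold units and, per the Research Type row above, compares FTPROP against the straight-through estimator. For context, here is a minimal PyTorch sketch of a hard-threshold (sign) activation with a saturating straight-through backward pass, the baseline estimator the paper compares against; the class name `SignSTE` and the |x| ≤ 1 saturation window are illustrative assumptions, and FTPROP's per-layer target propagation itself is not reproduced here.

```python
import torch

class SignSTE(torch.autograd.Function):
    """Hard-threshold (sign) activation with a saturating
    straight-through estimator in the backward pass (sketch)."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        # Hard threshold: output is +1 for x >= 0, -1 otherwise.
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Pass the gradient straight through where |x| <= 1, zero elsewhere
        # (a common saturating variant of the straight-through estimator).
        return grad_output * (x.abs() <= 1).float()


# Toy usage: gradients flow only through inputs inside the saturation window.
x = torch.randn(4, 8, requires_grad=True)
y = SignSTE.apply(x)
y.sum().backward()
print(x.grad)
```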
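
The Experiment Setup and Hardware Specification rows pin down the optimizer, loss, schedule, and framework. The following PyTorch sketch assembles that configuration; the tiny placeholder model and random data are assumptions for illustration (the paper uses a 4-layer convnet, AlexNet, and ResNet-18 on CIFAR-10/ImageNet), while the Adam hyperparameters, cross-entropy loss, 300 epochs, and the 0.1 learning-rate decay after epochs 200 and 250 come directly from the quoted setup.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and data (assumptions); stand-ins for the paper's
# networks and the CIFAR-10 / ImageNet loaders.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
train_loader = DataLoader(
    TensorDataset(torch.randn(256, 3, 32, 32), torch.randint(0, 10, (256,))),
    batch_size=64, shuffle=True,
)

# Quoted settings: Adam, lr 2.5e-4, weight decay 5e-4, cross-entropy loss,
# 300 epochs, learning rate decayed by 0.1 after epochs 200 and 250.
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=2.5e-4, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[200, 250], gamma=0.1)

for epoch in range(300):
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()  # advance the learning-rate schedule once per epoch
```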