JFB: Jacobian-Free Backpropagation for Implicit Networks
Authors: Samy Wu Fung, Howard Heaton, Qiuwei Li, Daniel McKenzie, Stanley Osher, Wotao Yin
AAAI 2022, pp. 6648-6656 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show implicit networks trained with JFB are competitive with feedforward networks and prior implicit networks given the same number of parameters. This section shows the effectiveness of JFB using PyTorch (Paszke et al. 2017). All networks are ResNet-based such that Assumption 0.3 holds. Table 1: Test accuracy of JFB-trained Implicit ResNet compared to Neural ODEs, Augmented NODEs, and MONs |
| Researcher Affiliation | Collaboration | Samy Wu Fung,*1 Howard Heaton,*2 Qiuwei Li,3 Daniel McKenzie,4 Stanley Osher,4 Wotao Yin3 1 Department of Applied Mathematics and Statistics, Colorado School of Mines 2 Typal Research, Typal LLC 3 Alibaba Group (US), Damo Academy 4 Department of Mathematics, University of California, Los Angeles |
| Pseudocode | Yes | Algorithm 1: Implicit Network with Fixed Point Iteration (see the sketch after this table) |
| Open Source Code | Yes | All codes can be found on Github: github.com/howardheaton/jacobian_free_backprop |
| Open Datasets | Yes | We train implicit networks on three benchmark image classification datasets licensed under CC-BY-SA: SVHN (Netzer et al. 2011), MNIST (LeCun, Cortes, and Burges 2010), and CIFAR-10 (Krizhevsky and Hinton 2009). |
| Dataset Splits | No | The paper uses benchmark datasets (SVHN, MNIST, and CIFAR-10) that have well-defined standard splits, but it does not explicitly state the training, validation, and test split percentages or sample counts in the main text; it defers to the appendix for further details. |
| Hardware Specification | Yes | All experiments are run on a single NVIDIA TITAN X GPU with 12GB RAM. |
| Software Dependencies | No | This section shows the effectiveness of JFB using PyTorch (Paszke et al. 2017). The paper mentions PyTorch but does not specify its version number or any other software dependencies with versions. |
| Experiment Setup | No | The paper mentions using ResNet-based networks and details the use of the conjugate gradient (CG) method with a maximum number of iterations set to the maximum depth of forward propagation. However, it does not explicitly provide specific hyperparameters such as learning rate, batch size, or optimizer settings in the main text, deferring further details to the appendix. |
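For context on the Pseudocode row above, the sketch below illustrates the scheme named in Algorithm 1 in PyTorch, the paper's framework: the forward pass iterates a layer to an approximate fixed point, and the JFB backward pass differentiates only a single final application of that layer, avoiding the Jacobian-based linear solve used by standard implicit differentiation. This is a minimal sketch, not the authors' code; the class and block names (`ImplicitLayer`, `ToyBlock`) and the iteration/tolerance defaults are illustrative assumptions, and the actual implementation lives in the linked repository.

```python
import torch
import torch.nn as nn


class ImplicitLayer(nn.Module):
    """Fixed-point layer u* = T(u*, x) trained with Jacobian-free
    backpropagation (JFB): backprop sees only one application of T."""

    def __init__(self, T, max_iter=50, tol=1e-4):
        super().__init__()
        self.T = T          # contractive block (the paper uses ResNet-based blocks)
        self.max_iter = max_iter
        self.tol = tol

    def forward(self, x):
        # Forward pass: iterate u <- T(u, x) to an approximate fixed
        # point with autograd disabled, so no graph is stored.
        u = torch.zeros_like(x)
        with torch.no_grad():
            for _ in range(self.max_iter):
                u_next = self.T(u, x)
                if (u_next - u).norm() <= self.tol * (u.norm() + 1e-8):
                    u = u_next
                    break
                u = u_next
        # JFB backward pass: detach the fixed point and apply T once more
        # with gradients enabled, so backprop differentiates only this
        # final application (no Jacobian-based linear solve).
        return self.T(u.detach(), x)


# Hypothetical toy block for illustration only; contractivity here relies
# on the 0.5 * tanh scaling rather than the paper's normalized ResNet blocks.
class ToyBlock(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.lin = nn.Linear(d, d)

    def forward(self, u, x):
        return 0.5 * torch.tanh(self.lin(u)) + x


layer = ImplicitLayer(ToyBlock(10))
x = torch.randn(4, 10)
loss = layer(x).pow(2).sum()
loss.backward()  # Jacobian-free gradients w.r.t. the parameters of T
```

The design point this sketch captures is why the Hardware Specification row above is plausible: because the fixed-point iteration runs under `no_grad` and only one application of `T` is differentiated, memory cost is independent of the number of forward iterations, which is what lets the experiments fit on a single 12GB GPU.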