Fixed-Weight Difference Target Propagation
Authors: Tatsukichi Shibuya, Nakamasa Inoue, Rei Kawakami, Ikuro Sato
AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | FW-DTP consistently achieves higher test performance than a baseline, the Difference Target Propagation (DTP), on four classification datasets. We also present a novel propagation architecture that explains the exact form of the feedback function of DTP to analyze FW-DTP. Our code is available at https://github.com/TatsukichiShibuya/Fixed-Weight-Difference-Target-Propagation. |
| Researcher Affiliation | Collaboration | Tatsukichi Shibuya¹, Nakamasa Inoue¹, Rei Kawakami¹, Ikuro Sato¹,² (¹Tokyo Institute of Technology, ²Denso IT Laboratory). Contact: shibuya.t.ad@m.titech.ac.jp, inoue@c.titech.ac.jp, reikawa@sc.e.titech.ac.jp, isato@c.titech.ac.jp |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/TatsukichiShibuya/Fixed-Weight-Difference-Target-Propagation. |
| Open Datasets | Yes | We compared image classification performance of TP (Bengio 2014), DTP (Lee et al. 2015), DRL (Meulemans et al. 2020), L-DRL (Ernoult et al. 2022), and FW-DTP on four datasets: MNIST (Lecun et al. 1998), Fashion-MNIST (F-MNIST) (Xiao, Rasul, and Vollgraf 2017), CIFAR-10 and CIFAR-100 (Krizhevsky and Hinton 2009). |
| Dataset Splits | Yes | For the hyperparameter search, 5,000 samples from the training set are used as the validation set. |
| Hardware Specification | Yes | 4 GPUs (Tesla P100-SXM2-16GB) with 56 CPU cores are used to measure computational time. |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies or libraries used for the experiments. |
| Experiment Setup | Yes | Following previous studies (Bartunov et al. 2018; Meulemans et al. 2020), a fully connected network consisting of 6 layers, each with 256 units, was used for MNIST and F-MNIST. Another fully connected network consisting of 4 layers, each with 1,024 units, was used for CIFAR-10/100. The activation function and the optimizer were the same as those used in the Jacobian experiment. For the hyperparameter search, 5,000 samples from the training set are used as the validation set. For DTP, DRL, and L-DRL, the feedback weights are updated five times in each iteration. Configuration sketches based on this description follow the table. |
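
The network and data-split description in the Experiment Setup row can be made concrete with a short sketch. The following Python/PyTorch snippet is a minimal reading of that description, not the authors' released code: the framework, the `tanh` activation, the Adam optimizer, and the learning rate are placeholders (the paper only says the activation and optimizer match its Jacobian experiment), and "6 layers" / "4 layers" are read here as hidden-layer counts.

```python
# Minimal sketch of the reported setup; NOT the authors' released code.
# Assumptions (not stated in the report): PyTorch, tanh activations, Adam,
# and the learning rate are placeholders. "6 layers" / "4 layers" are read
# here as hidden-layer counts.
import torch
import torch.nn as nn
from torch.utils.data import random_split
from torchvision import datasets, transforms


def make_mlp(in_dim, hidden_dim, n_hidden, n_classes):
    """Fully connected network: n_hidden hidden layers of hidden_dim units each."""
    layers, d = [nn.Flatten()], in_dim
    for _ in range(n_hidden):
        layers += [nn.Linear(d, hidden_dim), nn.Tanh()]  # activation is an assumption
        d = hidden_dim
    layers.append(nn.Linear(d, n_classes))
    return nn.Sequential(*layers)


# MNIST / F-MNIST: 6 layers x 256 units. CIFAR: 4 layers x 1,024 units
# (CIFAR-100 shown; CIFAR-10 would use n_classes=10).
mnist_net = make_mlp(in_dim=28 * 28, hidden_dim=256, n_hidden=6, n_classes=10)
cifar_net = make_mlp(in_dim=3 * 32 * 32, hidden_dim=1024, n_hidden=4, n_classes=100)

# Hyperparameter search: 5,000 samples held out from the training set as validation.
train_full = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())
train_set, val_set = random_split(train_full, [len(train_full) - 5000, 5000])

optimizer = torch.optim.Adam(mnist_net.parameters(), lr=1e-3)  # optimizer/lr are assumptions
```

For the exact configuration (activation, optimizer, learning rates, and feedback-training settings), the released repository linked above is the authoritative reference.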
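
The last sentence of that row, five feedback-weight updates per iteration for DTP, DRL, and L-DRL versus fixed feedback weights in FW-DTP, can likewise be sketched schematically. The losses and optimizers below are placeholders (DTP-style methods train feedback weights with a layer-wise reconstruction objective whose exact form is defined in the paper), so this snippet only illustrates the update schedule, not the algorithms themselves.

```python
def train_step(batch, forward_loss, forward_opt,
               feedback_loss=None, feedback_opt=None, k_feedback=5):
    """One training iteration; losses and optimizers are caller-supplied placeholders.

    DTP / DRL / L-DRL: pass feedback_loss/feedback_opt (five feedback updates per iteration).
    FW-DTP:            leave them as None (feedback weights stay fixed).
    """
    if feedback_opt is not None:
        for _ in range(k_feedback):          # "updated five times in each iteration"
            feedback_opt.zero_grad()
            feedback_loss(batch).backward()
            feedback_opt.step()

    forward_opt.zero_grad()                  # forward weights are updated once per iteration
    forward_loss(batch).backward()
    forward_opt.step()


# Example schedule (all objects are placeholders):
#   DTP-style:  train_step(batch, fwd_loss, fwd_opt, fb_loss, fb_opt, k_feedback=5)
#   FW-DTP:     train_step(batch, fwd_loss, fwd_opt)
```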