Polarity Is All You Need to Learn and Transfer Faster
Authors: Qingyang Wang, Michael Alan Powell, Eric W Bridgeford, Ali Geisa, Joshua T Vogelstein
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper demonstrates experimentally that if weight polarities are adequately set a priori, networks learn with less time and data, using a simulated XOR-5D task (Section 2) and two image classification tasks (Section 3). |
| Researcher Affiliation | Academia | 1Department of Neuroscience, Johns Hopkins University, Baltimore MD US 2Center for Imaging Science, Johns Hopkins University, Baltimore MD US 3Department of Biomedical Engineering, Johns Hopkins University, Baltimore MD US 4Current location: United States Military Academy, Department of Mathematical Sciences, West Point NY US 5Department of Biostatistics, Johns Hopkins University, Baltimore MD US 6Institute for Computational Medicine, Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore MD US. |
| Pseudocode | Yes | Algorithm 1 Freeze-SGD |
| Open Source Code | Yes | Code for running all experiments and analysis may be found on GitHub. |
| Open Datasets | Yes | We trained and tested networks on the Fashion-MNIST (grayscale) and CIFAR-10 (RGB-color) datasets, using the AlexNet network architecture (Krizhevsky et al., 2017). For XOR-5D: "XOR-5D Data were prepared by sampling from the XOR-5D distribution as described in the main text". Additionally, "All data presented in the paper may be found here." |
| Dataset Splits | Yes | XOR-5D data were prepared by sampling from the XOR-5D distribution as described in the main text; the training set size varied, while the validation set was fixed at 1,000 samples across all scenarios. Computational efficiency was quantified by plotting the number of epochs needed to reach a given level of validation accuracy. |
| Hardware Specification | Yes | All experiments were run on 2 RTX-8000 GPUs. |
| Software Dependencies | No | The paper mentions using the Adam optimizer, Glorot Normal/Uniform initialization, and the AlexNet architecture, but does not provide version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | For both datasets, we trained for 100 epochs, with lr=0.001 (best out of [0.1,0.03,0.01,0.001]). Specifically, we randomly initialized the networks following conventional procedures: we used Glorot Normal for conv layers and Glorot Uniform for fc layers. |
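The initialization scheme reported in the Experiment Setup row (Glorot Normal for conv layers, Glorot Uniform for fc layers) can be sketched with the standard Glorot formulas; this is a minimal NumPy illustration, not the authors' code, and the helper names, shapes, and seed are assumptions:

```python
import numpy as np

def glorot_normal(fan_in, fan_out, rng):
    """Glorot/Xavier Normal: zero-mean Gaussian with std = sqrt(2 / (fan_in + fan_out)).
    Reported in the paper as the initializer for conv layers."""
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_out, fan_in))

def glorot_uniform(fan_in, fan_out, rng):
    """Glorot/Xavier Uniform: samples in [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)).
    Reported in the paper as the initializer for fc layers."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

# Illustrative shapes only (not from the paper).
rng = np.random.default_rng(0)
W_conv = glorot_normal(fan_in=256, fan_out=256, rng=rng)
W_fc = glorot_uniform(fan_in=256, fan_out=10, rng=rng)
```

Both initializers scale with the layer's fan-in and fan-out so that activation variance stays roughly constant across layers, which is why they are the conventional choice the paper refers to.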