Understanding weight-magnitude hyperparameters in training binary networks
Authors: Joris Quist, Yunqiang Li, Jan van Gemert
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 EXPERIMENTS We empirically validate our analysis on CIFAR-10, using the Bi Real Net-20 architecture (Liu et al., 2018). |
| Researcher Affiliation | Collaboration | Joris Quist¹, Yunqiang Li¹,², Jan van Gemert¹ — 1. Computer Vision Lab, Delft University of Technology; 2. Axelera AI |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/jorisquist/Understanding-WM-HP-in-BNNs |
| Open Datasets | Yes | "We empirically validate our analysis on CIFAR-10, using the Bi-Real Net-20 architecture (Liu et al., 2018)." For ImageNet: "We follow Liu et al. (2021a): We train for 600K iterations with a batch size of 510." |
| Dataset Splits | No | The paper uses "Validation Accuracy (%)" in its figures but does not explicitly state the training, validation, or test dataset splits (e.g., percentages or sample counts) used for reproduction. It implies standard splits for CIFAR-10 and ImageNet but does not specify them. |
| Hardware Specification | Yes | We trained on 3 NVIDIA A40 GPUs (each with 48 GB of GPU memory), with a batch size of 170 per GPU, for as much as ten days. |
| Software Dependencies | No | The paper mentions the "NVIDIA DALI dataloader" and "PyTorch dataloader" but does not specify version numbers for these software components. |
| Experiment Setup | Yes | Unless mentioned otherwise the networks were optimized using SGD for both the real-valued and binary parameters with as hyperparameters: learning rate=0.1, momentum with γ = (1 − 0.9), weight decay=10⁻⁴, batch size=256, cosine learning rate decay and cosine alpha decay. ... For our filtering-based optimizer we used an alpha of 10⁻³ with cosine decay and a gamma of 10⁻¹. (A hedged configuration sketch follows the table.) |
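
The experiment-setup row translates directly into a standard PyTorch training configuration. The sketch below is not the authors' code: the function name `build_baseline_optimizer`, the `epochs` argument, and the use of `CosineAnnealingLR` are assumptions layered on top of the quoted hyperparameters (SGD, learning rate 0.1, momentum 0.9, weight decay 10⁻⁴, cosine learning-rate decay). The paper's own filtering-based optimizer (alpha of 10⁻³ with cosine decay, gamma of 10⁻¹) is not reproduced here; it is implemented in the linked repository.

```python
# Minimal sketch (assumption, not the authors' code) of the quoted baseline setup:
# a single SGD optimizer over both real-valued and binary parameters, with
# cosine learning-rate decay. The Bi-Real Net-20 model itself is not defined here.
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import CosineAnnealingLR


def build_baseline_optimizer(model: nn.Module, epochs: int):
    """Return the SGD optimizer and cosine LR schedule described in the paper."""
    optimizer = optim.SGD(
        model.parameters(),
        lr=0.1,              # quoted learning rate
        momentum=0.9,        # quoted momentum (γ = (1 − 0.9) in the paper's notation)
        weight_decay=1e-4,   # quoted weight decay 10⁻⁴
    )
    # Cosine decay of the learning rate over the full training run.
    scheduler = CosineAnnealingLR(optimizer, T_max=epochs)
    return optimizer, scheduler
```

With a batch size of 256 this matches the quoted CIFAR-10 baseline; the ImageNet runs instead train for 600K iterations at a total batch size of 510 (170 per GPU across 3 A40s), per the dataset and hardware rows above.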