Traversing Between Modes in Function Space for Fast Ensembling
Authors: Eunggu Yun, Hyungi Lee, Giung Nam, Juho Lee
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate that we can indeed train such bridge networks and significantly reduce inference costs with the help of bridge networks. |
| Researcher Affiliation | Collaboration | 1Kim Jaechul Graduate School of AI, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea; 2Saige Research, Seoul, Korea; 3AITRICS, Seoul, Korea. |
| Pseudocode | Yes | Algorithm 1 Training type I bridge networks; Algorithm 2 Training type II bridge networks |
| Open Source Code | Yes | We release the code used in the experiments on GitHub: https://github.com/yuneg11/Bridge-Network |
| Open Datasets | Yes | We evaluate the proposed bridge networks on various image classification benchmarks, including CIFAR-10, CIFAR-100, TinyImageNet, and ImageNet datasets. |
| Dataset Splits | Yes | Specifically, (1) we first find the optimal temperature which minimizes the NLL over the validation examples, and (2) compute uncertainty metrics including NLL, BS, and ECE using temperature scaled predicted probabilities under the optimal temperature. |
| Hardware Specification | Yes | We conduct TinyImageNet experiments on TPU v2-8 and TPU v3-8 supported by TPU Research Cloud and the others on NVIDIA GeForce RTX 3090. |
| Software Dependencies | Yes | We implemented the experimental codes using PyTorch (Paszke et al., 2019). |
| Experiment Setup | Yes | For detailed training settings, including bridge network architectures or hyperparameter settings, please refer to Appendix A.4. ... We train base ResNet networks for 200 epochs with learning rate 0.1. We use the SGD optimizer with momentum 0.9 and adjust the learning rate with a simple cosine scheduler. We use weight decay 0.001 for the CIFAR-10 dataset, 0.0005 for the CIFAR-100 and TinyImageNet datasets, and 0.0001 for the ImageNet dataset. |
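The temperature-scaling step quoted in the Dataset Splits row (find the temperature minimizing NLL on validation examples, then compute calibrated metrics) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the grid-search strategy, and the toy logits are assumptions for demonstration.

```python
import math

def nll(logits, labels, temperature):
    """Average negative log-likelihood of labels under temperature-scaled softmax."""
    total = 0.0
    for z, y in zip(logits, labels):
        scaled = [v / temperature for v in z]
        m = max(scaled)  # stabilize the log-sum-exp
        log_norm = m + math.log(sum(math.exp(v - m) for v in scaled))
        total += log_norm - scaled[y]
    return total / len(logits)

def fit_temperature(logits, labels, grid=None):
    """Grid-search the temperature that minimizes validation NLL."""
    if grid is None:
        grid = [0.1 * i for i in range(1, 51)]  # T in (0, 5]
    return min(grid, key=lambda t: nll(logits, labels, t))

# Toy validation set: overconfident logits with one misclassified example,
# so the fitted temperature should exceed 1 (softening the predictions).
val_logits = [[5.0, 0.0], [5.0, 0.0], [0.0, 5.0], [5.0, 0.0]]
val_labels = [0, 0, 1, 1]
T = fit_temperature(val_logits, val_labels)
```

Metrics such as NLL, BS, and ECE would then be computed from probabilities scaled by the fitted `T`.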
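The "simple cosine scheduler" in the Experiment Setup row (base learning rate 0.1, 200 epochs) can be written in closed form. This sketch assumes a plain cosine decay to zero with no warmup or restarts, which the quoted text does not specify.

```python
import math

def cosine_lr(epoch, total_epochs=200, base_lr=0.1):
    """Cosine-annealed learning rate: starts at base_lr, decays smoothly to 0."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))

# Schedule endpoints and midpoint for the quoted settings.
lr_start = cosine_lr(0)    # base learning rate
lr_mid = cosine_lr(100)    # halfway: half the base rate
lr_end = cosine_lr(200)    # decayed to (numerically) zero
```

In PyTorch this corresponds to `torch.optim.lr_scheduler.CosineAnnealingLR` with `T_max=200` and `eta_min=0`, paired with `torch.optim.SGD(momentum=0.9)` and the per-dataset weight decay values quoted above.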