Fast Ensembling with Diffusion Schrödinger Bridge
Authors: Hyunsu Kim, Jongmin Yoon, Juho Lee
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach using three widely adopted image classification benchmark datasets: CIFAR-10, CIFAR-100, and Tiny ImageNet (Li et al., 2017). In our experiment, the ensemble models that we construct to serve as bridges adopt the configuration outlined in the Bridge Network (Yun et al., 2023) and are trained based on the ResNet architecture. |
| Researcher Affiliation | Academia | Hyunsu Kim, Jongmin Yoon, Juho Lee; Kim Jaechul Graduate School of AI, KAIST, Daejeon, South Korea; {kim.hyunsu,jm.yoon,juholee}@kaist.ac.kr |
| Pseudocode | Yes | Algorithm 1 Training DBNs. Require: an (empirical) data distribution p_data, a temperature distribution p_temp, ensemble parameters {θ_i}_{i=1}^M, and the score network ε_ϕ. Fix a source model f_{θ_1}. While not converged: sample x ∼ p_data; for i = 1 to M, compute the logits z_i = f_{θ_i}(x); get a target ensemble logit Z_0 = EnsLogit({z_i}_{i=1}^M); draw a temperature T ∼ p_temp and compute the annealed source logit Z_1 = z_1/T; compute the loss according to Eq. (10) and update ϕ ← ϕ − η∇_ϕ L(ϕ). Return ϕ. (A hedged code sketch of this loop appears after the table.) |
| Open Source Code | Yes | Our implementation is available at https://github.com/kim-hyunsu/dbn. |
| Open Datasets | Yes | We employ CIFAR-10/100 (Krizhevsky et al., 2009) and Tiny ImageNet (Li et al., 2017) datasets for our study. |
| Dataset Splits | Yes | We employ CIFAR-10/100 (Krizhevsky et al., 2009) and Tiny ImageNet (Li et al., 2017) datasets for our study. Our data augmentation strategy involves randomly cropping images to 32×32 pixels with an additional 4-pixel padding, as well as applying random horizontal flipping. Furthermore, we normalize input images by subtracting per-channel means and dividing by per-channel standard deviations (see the transform sketch after the table). ... Table 3: The hyperparameter settings used to learn the DBN network. ... Table 4: The hyperparameter settings used to learn the baseline ensemble models. |
| Hardware Specification | Yes | Our research is supported with Cloud TPUs from Google's TPU Research Cloud (TRC). |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) were explicitly stated in the paper. |
| Experiment Setup | Yes | Table 3: The hyperparameter settings used to learn the DBN network. ... Table 4: The hyperparameter settings used to learn the baseline ensemble models. |
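
To make the quoted Algorithm 1 concrete, the following is a minimal PyTorch sketch of the DBN training loop. It is not the authors' implementation (their code is linked above); the helpers `ens_logit`, `dbn_loss` (standing in for the Eq. (10) objective, which is not reproduced in this report), and `temp_dist` are assumptions for illustration.

```python
# Hedged sketch of Algorithm 1 (Training DBNs); not the authors' code.
# Assumptions: `ensemble` is a list of frozen classifiers f_{theta_i},
# `score_net` is the score network eps_phi, and `dbn_loss` is a stand-in
# for the paper's Eq. (10) loss.
import torch

def ens_logit(logits):
    # Target ensemble logit Z_0; averaging member logits is one common
    # choice (an assumption -- the report does not define EnsLogit).
    return torch.stack(logits).mean(dim=0)

def train_dbn(loader, ensemble, score_net, dbn_loss, temp_dist, epochs=1, lr=1e-4):
    """One run of Algorithm 1: fit the score network eps_phi."""
    opt = torch.optim.Adam(score_net.parameters(), lr=lr)
    for _ in range(epochs):                        # "while not converged"
        for x, _ in loader:                        # sample x ~ p_data
            with torch.no_grad():
                logits = [f(x) for f in ensemble]  # z_i = f_{theta_i}(x)
            z0 = ens_logit(logits)                 # target ensemble logit Z_0
            T = temp_dist.sample()                 # temperature T ~ p_temp
            z1 = logits[0] / T                     # annealed source logit Z_1 = z_1 / T
            loss = dbn_loss(score_net, z0, z1)     # Eq. (10) objective (stand-in)
            opt.zero_grad()
            loss.backward()                        # phi <- phi - eta * grad_phi L(phi)
            opt.step()
    return score_net
```

Here `temp_dist` could be, for example, a `torch.distributions.Uniform` over a chosen temperature range; the actual choice of p_temp is a hyperparameter of the paper, not specified in this report.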
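
The augmentation and normalization quoted under Dataset Splits correspond to a standard CIFAR pipeline; below is a hedged torchvision sketch. The per-channel statistics are the commonly used CIFAR-10 values, an assumption, since the report excerpt does not list them.

```python
from torchvision import transforms

# Commonly used CIFAR-10 per-channel statistics (an assumption; the
# report excerpt does not state the exact values used in the paper).
CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)
CIFAR10_STD = (0.2470, 0.2435, 0.2616)

train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),  # random 32x32 crop, 4-pixel padding
    transforms.RandomHorizontalFlip(),     # random horizontal flip
    transforms.ToTensor(),
    transforms.Normalize(CIFAR10_MEAN, CIFAR10_STD),  # per-channel normalization
])
```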