Functional Ensemble Distillation
Authors: Coby Penso, Idan Achituve, Ethan Fetaya
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated our method on several tasks and showed that it achieves superior results in both accuracy and uncertainty estimation compared to current approaches. |
| Researcher Affiliation | Academia | Coby Penso, Bar-Ilan University, Israel, coby.penso24@gmail.com; Idan Achituve, Bar-Ilan University, Israel, idan.achituve@biu.ac.il; Ethan Fetaya, Bar-Ilan University, Israel, ethan.fetaya@biu.ac.il |
| Pseudocode | Yes | Algorithm 1 Generator training |
| Open Source Code | Yes | We will also provide our code for reproducibility https://github.com/cobypenso/functional_ensemble_distillation. |
| Open Datasets | Yes | We evaluated all methods on CIFAR-10, CIFAR-100 [22], and STL-10 [7] datasets. |
| Dataset Splits | Yes | For all datasets we use a train/val/test split. The train/val split uses a ratio of 80%:20%. |
| Hardware Specification | No | The main paper states that hardware specifications are 'Provided in the supplementary' but does not include them in the main text. |
| Software Dependencies | No | The paper mentions 'Adam optimizer' but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For our method, we performed a standard training procedure with Adam optimizer [19], a learning rate scheduler with fixed milestones at epochs {35, 45, 55, 70, 80}, and a hyperparameter search done over a held-out validation set. ... For CIFAR-10, a mixture of RBF kernels with {2, 10, 20, 50} length scales had the best results. For CIFAR-100, the length scales are {10, 15, 20, 50}, and for STL-10 a length scale of 50 works best. ... Specifically, for the concatenation part, 3 channels of noise and 3 channels of the input are stacked together. For the intermediate noise, the Gaussian noise was added to the features, instead of concatenation, in 5 different places, one after the first convolution layer and the other four after each Block in the ResNet-18 architecture (Figure 2). |
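
The experiment-setup excerpt above describes a concrete training schedule: Adam with a learning-rate scheduler that decays at fixed milestone epochs {35, 45, 55, 70, 80}. The sketch below shows how such a schedule could be wired up in PyTorch; the model, base learning rate, decay factor (gamma), and total number of epochs are assumptions, not values reported in the excerpt.

```python
import torch
from torch import nn, optim

# Stand-in model; the paper's generator is ResNet-18-based (see the last sketch).
model = nn.Linear(10, 10)

# Adam optimizer; the base learning rate is an assumption.
optimizer = optim.Adam(model.parameters(), lr=1e-3)

# Fixed milestones at epochs {35, 45, 55, 70, 80}, as quoted above.
# The decay factor gamma is an assumption.
scheduler = optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[35, 45, 55, 70, 80], gamma=0.1
)

for epoch in range(90):  # total epoch count is an assumption
    # ... one training epoch over the distillation data would go here ...
    scheduler.step()
```

The excerpt also reports that a mixture of RBF kernels with several length scales (e.g. {2, 10, 20, 50} on CIFAR-10) worked best. A minimal sketch of such a kernel mixture is given below; the exact parameterisation of the length scale (here exp(-||x - y||² / (2 s²))) and the objects the kernel is applied to are assumptions.

```python
import torch

def rbf_mixture_kernel(x, y, length_scales=(2.0, 10.0, 20.0, 50.0)):
    """Gram matrix of a sum of RBF kernels, one per length scale."""
    sq_dist = torch.cdist(x, y) ** 2  # pairwise squared Euclidean distances, shape (n, m)
    gram = torch.zeros_like(sq_dist)
    for s in length_scales:
        gram = gram + torch.exp(-sq_dist / (2.0 * s ** 2))
    return gram

# Example usage on two batches of 10-dimensional prediction vectors.
gram = rbf_mixture_kernel(torch.rand(8, 10), torch.rand(8, 10))
```

Finally, the noise-injection scheme is described as concatenating 3 channels of Gaussian noise with the 3 input channels and adding Gaussian noise to intermediate features in 5 places (after the first convolution and after each of the four ResNet-18 blocks). The following is a rough sketch of that wiring on top of a torchvision ResNet-18; the noise standard deviation, output dimensionality, and the use of torchvision's implementation are assumptions.

```python
import torch
from torch import nn
from torchvision.models import resnet18

class NoisyResNet18(nn.Module):
    """ResNet-18 with input-noise concatenation and intermediate additive noise."""

    def __init__(self, num_classes=10, noise_std=1.0):
        super().__init__()
        self.noise_std = noise_std
        backbone = resnet18(num_classes=num_classes)
        # First conv takes 6 channels: 3 image channels + 3 noise channels.
        backbone.conv1 = nn.Conv2d(6, 64, kernel_size=7, stride=2, padding=3, bias=False)
        self.backbone = backbone

    def _add_noise(self, h):
        return h + self.noise_std * torch.randn_like(h)

    def forward(self, x):
        b = self.backbone
        # Stack 3 channels of Gaussian noise with the 3 input channels.
        z = torch.randn(x.size(0), 3, x.size(2), x.size(3), device=x.device)
        h = torch.cat([x, z], dim=1)

        h = b.relu(b.bn1(b.conv1(h)))
        h = self._add_noise(h)             # noise after the first convolution
        h = b.maxpool(h)
        for layer in (b.layer1, b.layer2, b.layer3, b.layer4):
            h = self._add_noise(layer(h))  # noise after each ResNet-18 block
        h = torch.flatten(b.avgpool(h), 1)
        return b.fc(h)

# Each forward pass draws fresh noise, so repeated calls give different predictions.
logits = NoisyResNet18()(torch.randn(4, 3, 32, 32))
```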