Being Bayesian about Categorical Probability
Authors: Taejong Joo, Uijung Chung, Min-Gwan Seo
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments show effectiveness of being Bayesian about the categorical probability in improving generalization performances, uncertainty estimation, and calibration property. In this section, we show versatility of BM through extensive empirical evaluations. We first verify its improvement of the generalization error in image classification tasks (section 5.1). |
| Researcher Affiliation | Industry | Taejong Joo, Uijung Chung, Min-Gwan Seo (ESTsoft, Republic of Korea). |
| Pseudocode | No | The paper describes its methodology through mathematical equations and textual descriptions but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | To support reproducibility, we release our code at: https://github.com/tjoo512/belief-matching-framework. |
| Open Datasets | Yes | We evaluate the generalization performance of BM on CIFAR (Krizhevsky, 2009) with the pre-activation ResNet (He et al., 2016b). We next perform a large-scale experiment using ResNeXt-50 32x4d and ResNeXt-101 32x8d (Xie et al., 2017) on ImageNet (Russakovsky et al., 2015). |
| Dataset Splits | Yes | Table 1. Test classification error rates on CIFAR. Here, we split a train set of 50K examples into a train set of 40K examples and a validation set of 10K examples. ImageNet contains approximately 1.3M training samples and 50K validation samples. (A sketch of this split appears after the table.) |
| Hardware Specification | Yes | We performed all experiments on a single workstation with 8 GPUs (NVIDIA GeForce RTX 2080 Ti). |
| Software Dependencies | No | The paper mentions using specific models (e.g., ResNet) and optimizers (e.g., Adam) but does not provide version numbers for any software libraries or dependencies used in the experiments (e.g., PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | However, we additionally use an initial learning rate warm-up and gradient clipping, which are extremely helpful for stable training of BM. Specifically, we use learning rates of [0.1ϵ, 0.2ϵ, 0.4ϵ, 0.6ϵ, 0.8ϵ] for the first five epochs when the reference learning rate is ϵ, and clip gradients when their norm exceeds 1.0. (See the warm-up sketch after the table.) |
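
To make the dataset split concrete, the following is a minimal sketch of carving the 50K CIFAR training examples into the 40K/10K train/validation split described above, assuming torchvision; the fixed seed and the use of `random_split` are illustrative assumptions, not details taken from the paper or its repository.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Load the full 50K-example CIFAR-10 training set.
train_full = datasets.CIFAR10(root="./data", train=True, download=True,
                              transform=transforms.ToTensor())

# Split into 40K train / 10K validation examples; the seed is an
# illustrative assumption (the paper does not specify one).
generator = torch.Generator().manual_seed(0)
train_set, val_set = random_split(train_full, [40_000, 10_000],
                                  generator=generator)
```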
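
The warm-up and clipping recipe maps onto a standard training loop: scale the reference learning rate for the first five epochs, and rescale gradients after `backward()` whenever their norm exceeds 1.0. Below is a minimal sketch assuming PyTorch; the model, data, and variable names are illustrative stand-ins, not the authors' released code.

```python
import torch

base_lr = 0.1                          # reference learning rate (epsilon)
warmup = [0.1, 0.2, 0.4, 0.6, 0.8]     # LR multipliers for the first 5 epochs

model = torch.nn.Linear(32, 10)        # stand-in for the actual network
optimizer = torch.optim.SGD(model.parameters(), lr=base_lr)
loss_fn = torch.nn.CrossEntropyLoss()

for epoch in range(10):
    # Learning-rate warm-up: scaled rate for the first five epochs,
    # then the full reference rate (any decay would apply on top).
    factor = warmup[epoch] if epoch < len(warmup) else 1.0
    for group in optimizer.param_groups:
        group["lr"] = base_lr * factor

    x = torch.randn(64, 32)            # dummy batch
    y = torch.randint(0, 10, (64,))
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    # Gradient clipping: rescale gradients whose global norm exceeds 1.0.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
```

Clipping sits between `backward()` and `step()` because `clip_grad_norm_` operates in place on the accumulated `.grad` tensors.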