Information-theoretic Generalization Analysis for Expected Calibration Error
Authors: Futoshi Futami, Masahiro Fujisawa
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments using deep learning models show that our bounds are nonvacuous thanks to this information-theoretic generalization analysis approach. |
| Researcher Affiliation | Collaboration | Futoshi Futami Osaka University / RIKEN AIP futami.futoshi.es@osaka-u.ac.jp Masahiro Fujisawa RIKEN AIP masahiro.fujisawa@riken.jp |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. It provides theoretical analyses and mathematical derivations. |
| Open Source Code | Yes | We submitted our source codes through Open Review. |
| Open Datasets | Yes | We further conducted two binary classification tasks on MNIST [25] using a convolutional neural network (CNN) and on CIFAR-10 [21] using ResNet. |
| Dataset Splits | No | The paper describes the use of training and test datasets but does not explicitly specify a validation split (as a percentage or a sample count) for the experiments. |
| Hardware Specification | Yes | We used NVIDIA GPUs with 32GB memory (NVIDIA DGX-1 with Tesla V100 and DGX-2) for MNIST (SGLD) and CIFAR-10 experiments. We also used CPU (Apple M1) with 16GB memory for the other experiments. |
| Software Dependencies | No | The paper mentions using the 'sklearn.feature_selection.mutual_info_classif' function but does not provide version numbers for it or for any other software library or dependency. It states that code was adapted from a previous work but does not list that work's dependencies or their versions. |
| Experiment Setup | Yes | Optimizer: Adam (learning rate 0.001, β1 = 0.9) or SGLD (learning rate 0.004, decayed by a factor of 0.9 after every 100 iterations); batch size: 128 (Adam) or 100 (SGLD); number of training samples: [75, 250, 1000, 4000]; number of epochs: 200 |
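
The Experiment Setup row can be translated into a short training-configuration sketch. The PyTorch fragment below is a hedged illustration of the reported hyperparameters, not the authors' released code: the β2 value (left at the PyTorch default) and the SGLD noise scale follow common conventions rather than details stated in the report, and the function names are illustrative.

```python
import math
import torch

def make_adam_optimizer(model):
    # Adam with learning rate 0.001 and beta1 = 0.9, matching the setup
    # reported above. beta2 is left at the PyTorch default (0.999), which
    # the report does not specify.
    return torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))

def sgld_lr(iteration, base_lr=0.004, gamma=0.9, step=100):
    # Reported SGLD schedule: learning rate 0.004, decayed by a factor
    # of 0.9 after every 100 iterations.
    return base_lr * gamma ** (iteration // step)

def sgld_step(model, loss, lr):
    # One SGLD update: a gradient step plus Gaussian noise with variance
    # 2 * lr per coordinate (the textbook SGLD rule; the paper's exact
    # variant is not reproduced here).
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                noise = torch.randn_like(p) * math.sqrt(2.0 * lr)
                p.add_(-lr * p.grad + noise)
```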
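
For context, the quantity whose generalization the paper bounds is the expected calibration error (ECE). Below is a minimal sketch of the standard binned ECE estimator as it is commonly computed in practice; the function name, the number of bins (15), and the equal-width binning scheme are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def binned_ece(confidences, correct, n_bins=15):
    """Standard equal-width binned ECE estimator: the weighted average
    over bins of |bin accuracy - bin mean confidence|."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    n = len(confidences)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Half-open bins (lo, hi]; the first bin also takes confidence 0.
        mask = (confidences > lo) & (confidences <= hi)
        if lo == 0.0:
            mask |= confidences == 0.0
        if mask.any():
            acc = correct[mask].mean()
            conf = confidences[mask].mean()
            ece += (mask.sum() / n) * abs(acc - conf)
    return ece

# Example: five predictions with confidences and 0/1 correctness flags.
conf = np.array([0.9, 0.8, 0.7, 0.95, 0.6])
hit = np.array([1, 1, 0, 1, 1], dtype=float)
print(binned_ece(conf, hit))
```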