Measuring Stochastic Data Complexity with Boltzmann Influence Functions
Authors: Nathan Hoyen Ng, Roger Baker Grosse, Marzyeh Ghassemi
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally validate IF-COMP on uncertainty calibration, mislabel detection, and OOD detection tasks, where it consistently matches or beats strong baseline methods. |
| Researcher Affiliation | Academia | 1 Massachusetts Institute of Technology, 2 University of Toronto, 3 Vector Institute. |
| Pseudocode | No | The paper describes the methodology using prose and mathematical equations, but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or a direct link to the open-source code for the methodology described. |
| Open Datasets | Yes | To verify that IF-COMP can accurately approximate the ground truth pNML parametric complexity on both in-distribution (ID) and out-of-distribution (OOD) samples, we fine-tune a CIFAR-10 (Krizhevsky, 2009) pre-trained ResNet-18 (He et al., 2016) model with the BPBO (12) on 20 random test images each from CIFAR-10, CIFAR-100, and MNIST (Deng, 2012). |
| Dataset Splits | Yes | For both CIFAR-10 and CIFAR-100 datasets we use a ResNet-18 model trained with the standard training procedure detailed in the section above, with early stopping calculated on a clean validation set. |
| Hardware Specification | Yes | All experiments were implemented in PyTorch and were run on single RTX6000 or A40 GPUs. |
| Software Dependencies | No | The paper states that experiments were 'implemented in PyTorch' but does not specify version numbers for PyTorch or any other software libraries used, which is necessary for a reproducible description of software dependencies. |
| Experiment Setup | Yes | CIFAR-10 ensemble models were trained following standard training procedures, using SGD with momentum of 0.9, weight decay of 0.0005, and a learning rate of 0.1 that decays by a factor of 5 at epochs 60, 120, and 160. |
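
The "Open Datasets" row above quotes the paper's pNML verification setup: 20 random test images each from CIFAR-10, CIFAR-100, and MNIST, evaluated with a CIFAR-10 pre-trained ResNet-18. Below is a minimal PyTorch sketch of that data setup only; the transforms and the random seed are illustrative assumptions, and the BPBO fine-tuning objective itself (Eq. 12 in the paper) is not reproduced here.

```python
# Sketch of the quoted data setup: sample 20 random test images from each of
# CIFAR-10, CIFAR-100, and MNIST. Transforms and seed are assumptions, not
# taken from the paper; the BPBO fine-tuning step is omitted.
import torch
import torchvision
from torchvision import transforms

cifar_tf = transforms.ToTensor()
mnist_tf = transforms.Compose([
    transforms.Resize(32),                          # match CIFAR input size
    transforms.Grayscale(num_output_channels=3),    # 3 channels for the CIFAR-10 model
    transforms.ToTensor(),
])

test_sets = {
    "cifar10": torchvision.datasets.CIFAR10("./data", train=False, download=True, transform=cifar_tf),
    "cifar100": torchvision.datasets.CIFAR100("./data", train=False, download=True, transform=cifar_tf),
    "mnist": torchvision.datasets.MNIST("./data", train=False, download=True, transform=mnist_tf),
}

generator = torch.Generator().manual_seed(0)  # seed is an assumption
samples = {
    name: [ds[int(i)] for i in torch.randperm(len(ds), generator=generator)[:20]]
    for name, ds in test_sets.items()
}
```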
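
The "Experiment Setup" and "Dataset Splits" rows describe the standard CIFAR training procedure: SGD with momentum 0.9, weight decay 0.0005, and an initial learning rate of 0.1 decayed by a factor of 5 at epochs 60, 120, and 160, with early stopping on a clean validation set. A minimal PyTorch sketch of that schedule follows, assuming the stock torchvision ResNet-18 (not a CIFAR-specific variant), standard augmentations, and a total epoch budget that the quoted text does not specify.

```python
# Sketch of the quoted training schedule. Hyperparameters (momentum, weight
# decay, learning rate, milestones) follow the table row; the model variant,
# augmentations, batch size, and epoch count are assumptions.
import torch
import torch.nn as nn
import torchvision
from torchvision import transforms

train_tf = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
train_set = torchvision.datasets.CIFAR10("./data", train=True, download=True, transform=train_tf)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

model = torchvision.models.resnet18(num_classes=10)  # stock torchvision ResNet-18 (assumption)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
# "decays by a factor of 5" -> gamma = 1/5 at the quoted milestone epochs
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60, 120, 160], gamma=0.2)
criterion = nn.CrossEntropyLoss()

for epoch in range(200):  # total epoch count is an assumption
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

In practice the early stopping mentioned in the "Dataset Splits" row would wrap this loop with a validation pass on a held-out clean split; that bookkeeping is omitted to keep the sketch short.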