The Entropy Enigma: Success and Failure of Entropy Minimization
Authors: Ori Press, Ravid Shwartz-Ziv, Yann LeCun, Matthias Bethge
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on 23 challenging datasets show that our method sets the SoTA with a mean absolute error of 5.75%, an improvement of 29.62% over the previous SoTA on this task. |
| Researcher Affiliation | Collaboration | 1University of Tübingen, Tübingen AI Center, Germany 2New York University 3Meta AI, FAIR. |
| Pseudocode | No | No structured pseudocode or algorithm blocks are present in the paper. |
| Open Source Code | Yes | Our code is available at: https://github.com/oripress/EntropyEnigma |
| Open Datasets | Yes | Experiments on 23 challenging datasets show that our method sets the SoTA with a mean absolute error of 5.75%, an improvement of 29.62% over the previous SoTA on this task. Our chosen datasets encompass a wide spectrum, from various types of noise (IN-C, IN-C̄, IN-3DCC, CCC) and domain shifts (IN-R, IN-V2, IN-D), to adversarial noises (Patch-IN, BG Challenge, IN-Obfuscations), and even images featuring classes not present in ImageNet (NINCO). |
| Dataset Splits | Yes | The training set was replicated seven times, systematically omitting images for which the ground truth label lay somewhere in the pre-trained model’s top-k predictions... The model’s accuracy was then evaluated on the holdout set, with evaluations every ten iterations, spanning a total of 1,000 iterations. (A hedged sketch of this top-k filtering step follows the table.) |
| Hardware Specification | No | No specific details about the hardware (e.g., GPU/CPU models, memory, or cloud instances) used for running experiments are provided in the paper. |
| Software Dependencies | No | The paper does not provide specific software dependencies (e.g., library or solver names with version numbers like Python 3.8, PyTorch 1.9) needed to replicate the experiment. |
| Experiment Setup | Yes | "We used a ResNet-50 (He et al., 2016)." and "RDumb uses SGD with a learning rate of 2.5 × 10⁻⁴ and a batch size of 64, and is reset to its pre-trained state every 1,000 iterations." and "E0 = 0.4 · ln(10³)." and "α = 0.9." (A hedged sketch of this setup also follows the table.) |
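
To make the "Dataset Splits" procedure concrete, here is a minimal sketch of the top-k filtering step it describes, assuming a PyTorch classifier and a non-shuffled `DataLoader`; the function name, the `Subset` wiring, and all variable names are illustrative assumptions, not the authors' code.

```python
import torch
from torch.utils.data import DataLoader, Subset

@torch.no_grad()
def indices_not_in_topk(model, loader, k):
    """Return dataset indices whose ground-truth label is NOT in the model's top-k predictions.

    Assumes `loader` iterates the dataset in order (shuffle=False) so that
    positional indices line up with the underlying dataset.
    """
    model.eval()
    kept, offset = [], 0
    for images, labels in loader:
        topk = model(images).topk(k, dim=1).indices        # (batch, k) predicted class ids
        hit = (topk == labels.unsqueeze(1)).any(dim=1)     # True where the label is among the top-k
        kept += [offset + i for i, h in enumerate(hit.tolist()) if not h]
        offset += labels.size(0)
    return kept

# Hypothetical usage: build one of the filtered replicas of the training set.
# filtered_set = Subset(train_set, indices_not_in_topk(model, DataLoader(train_set, batch_size=64), k=5))
```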
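
Similarly, the "Experiment Setup" row can be read as an RDumb-style entropy-minimization loop. The sketch below is a minimal, hedged reading of those hyperparameters (SGD, learning rate 2.5 × 10⁻⁴, batch size 64, reset to the pre-trained state every 1,000 iterations, entropy threshold E0 = 0.4 · ln(10³)); updating all trainable parameters (rather than only normalization-layer affine terms) and omitting the α = 0.9 term are simplifications, and none of the names come from the authors' repository.

```python
import copy
import math
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

def per_sample_entropy(logits):
    # H(p) = -sum_c p_c * log p_c of each sample's softmax prediction.
    log_p = F.log_softmax(logits, dim=1)
    return -(log_p.exp() * log_p).sum(dim=1)

def adapt(model, loader, lr=2.5e-4, reset_every=1000, e0=0.4 * math.log(1000)):
    pristine = copy.deepcopy(model.state_dict())            # snapshot for the periodic reset
    optimizer = torch.optim.SGD(
        [p for p in model.parameters() if p.requires_grad], lr=lr
    )
    for step, (images, _) in enumerate(loader):              # labels are never used at test time
        if step > 0 and step % reset_every == 0:             # RDumb: return to the pre-trained state
            model.load_state_dict(copy.deepcopy(pristine))
        logits = model(images)
        ent = per_sample_entropy(logits)
        reliable = ent < e0                                   # only low-entropy samples contribute
        if reliable.any():
            loss = ent[reliable].mean()
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model

# Hypothetical usage with the quoted batch size:
# adapt(model, DataLoader(shifted_test_set, batch_size=64, shuffle=False))
```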