Initial Guessing Bias: How Untrained Networks Favor Some Classes
Authors: Emanuele Francazi, Aurélien Lucchi, Marco Baity-Jesi
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide empirical evidence of the emergence of IGB in a broader range of practical scenarios, including real data, and a wide spectrum of architectures (e.g., CNNs, ResNets, Vision Transformers), demonstrating the prevalence of IGB. |
| Researcher Affiliation | Academia | 1Physics Department, EPFL, Switzerland 2SIAM Department, Eawag, Switzerland 3Department of Mathematics and Computer Science, University of Basel, Switzerland. Correspondence to: Emanuele Francazi <emanuele.francazi@epfl.ch>. |
| Pseudocode | No | No pseudocode or algorithm blocks are explicitly presented in the paper. The methodology is described through mathematical equations and prose. |
| Open Source Code | Yes | The code used for the experiments presented in this work is available at https://github.com/EmanueleFrancazi/IGB-Algorithms. |
| Open Datasets | Yes | CIFAR10 (C10): We use CIFAR10 (https://www.cs.toronto.edu/~kriz/cifar.html) (Krizhevsky et al., 2009) as an example of a real multi-class dataset. CIFAR100 (C100): We use CIFAR100 (https://www.cs.toronto.edu/~kriz/cifar.html) (Krizhevsky et al., 2009) as an example of a high-cardinality dataset, i.e., a dataset with a large number of classes. MNIST (E&O): We use MNIST (http://yann.lecun.com/exdb/mnist/) (Deng, 2012) to reproduce binary experiments on real data. |
| Dataset Splits | No | The paper uses standard datasets like CIFAR10 and MNIST but does not explicitly provide the specific percentages or sample counts for training, validation, and test splits within its main text. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments. |
| Software Dependencies | No | The paper does not list specific version numbers for software dependencies such as Python, PyTorch, or CUDA, which would be necessary for full reproducibility. |
| Experiment Setup | No | The paper refers to 'settings proposed in their respective repositories' for the dynamics simulations, but it does not explicitly provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations within the main text. |
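The IGB phenomenon assessed above can be illustrated without any of the paper's code: a single randomly initialized ReLU network typically assigns a disproportionate fraction of random inputs to one class at initialization. The sketch below is not the authors' implementation; it is a minimal stdlib-only illustration assuming a one-hidden-layer MLP with Gaussian weights (variance 1/fan_in) and standard-normal inputs, with all sizes chosen arbitrarily for speed.

```python
import math
import random

def untrained_class_fractions(n_inputs=300, dim=50, hidden=100, n_classes=2, seed=0):
    """Fraction of random inputs that one fixed, untrained ReLU MLP
    assigns to each class at initialization (illustrates IGB)."""
    rng = random.Random(seed)
    # One fixed random network: weights ~ N(0, 1/fan_in), no biases.
    W1 = [[rng.gauss(0, 1 / math.sqrt(dim)) for _ in range(dim)]
          for _ in range(hidden)]
    W2 = [[rng.gauss(0, 1 / math.sqrt(hidden)) for _ in range(hidden)]
          for _ in range(n_classes)]
    counts = [0] * n_classes
    for _ in range(n_inputs):
        x = [rng.gauss(0, 1) for _ in range(dim)]
        # ReLU hidden layer, then linear readout; argmax picks the class.
        h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]
        logits = [sum(w * hi for w, hi in zip(row, h)) for row in W2]
        counts[logits.index(max(logits))] += 1
    return [c / n_inputs for c in counts]

if __name__ == "__main__":
    # ReLU makes the hidden activations have positive mean, so the
    # fixed readout weights induce a systematic preference for one class.
    print(untrained_class_fractions())
```

Because the ReLU activations have positive mean, the logit gap inherits a nonzero mean set by the fixed readout weights, so the predicted-class fractions for a single initialization tend to deviate markedly from the balanced 1/n_classes value; averaging over many seeds would restore symmetry.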