Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On the Importance of Gaussianizing Representations

Authors: Daniel Eftekhari, Vardan Papyan

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments comprehensively demonstrate the effectiveness of normality normalization, in regards to its generalization performance on an array of widely used model and dataset combinations, its strong performance across various common factors of variation such as model width, depth, and training minibatch size, its suitability for usage wherever existing normalization layers are conventionally used, and as a means to improving model robustness to random perturbations.
Researcher Affiliation | Academia | 1Department of Computer Science, University of Toronto, Toronto, Canada 2Vector Institute, Toronto, Canada 3Department of Mathematics, University of Toronto, Toronto, Canada. Correspondence to: Daniel Eftekhari <EMAIL>.
Pseudocode | Yes | Algorithm 1 provides a summary of normality normalization.
Open Source Code | Yes | Code is made available at https://github.com/DanielEftekhari/normality-normalization.
Open Datasets | Yes | The datasets we used were CIFAR10, CIFAR100 (Krizhevsky, 2009), STL10 (Coates et al., 2011), SVHN (Netzer et al., 2011), Caltech101 (Li et al., 2022), Tiny ImageNet (Le & Yang, 2015), Food101 (Bossard et al., 2014), and ImageNet (Deng et al., 2009).
Dataset Splits | Yes | For the Caltech101 dataset, each run used a random 90%/10% allocation to obtain the training and validation splits respectively. ... For the experiments involving the SVHN dataset, models were trained from random initialization for 200 epochs, with a factor of 10 reduction in learning rate at each 60-epoch interval, and a minibatch size of 32.
Hardware Specification | Yes | Values are obtained using an NVIDIA V100 GPU.
Software Dependencies | No | The paper mentions PyTorch and the AdamW optimizer but does not provide specific version numbers for these software components. For example, it states: "We trained our models using the PyTorch (Paszke et al., 2019) machine learning framework."
Experiment Setup | Yes | In all of our experiments involving the ResNet18, ResNet34, and Wide ResNet architectures, stochastic gradient descent (SGD) with learning rate 0.1, weight decay 5 × 10⁻⁴, momentum 0.9, and minibatch size 128 was used. ... The AdamW optimizer (Kingma & Ba, 2015; Loshchilov & Hutter, 2019) with learning rate 1 × 10⁻³, weight decay 5 × 10⁻², (β₁, β₂) = (0.9, 0.999), ϵ = 1 × 10⁻⁸ was used. A noise factor of ξ = 1.0 was used, as preliminary experiments demonstrated increases typically resulted in training instability.
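For quick reference, the optimizer settings and the SVHN learning-rate schedule quoted in the Experiment Setup and Dataset Splits rows can be sketched in Python. The dict layout and the helper name `svhn_lr` are illustrative conveniences, not from the paper; in PyTorch these values would typically be passed to `torch.optim.SGD` and `torch.optim.AdamW`.

```python
# Hyperparameters as quoted in the table above; the dict layout and the
# helper name `svhn_lr` are illustrative, not from the paper.

# SGD settings reported for the ResNet18/ResNet34/Wide ResNet experiments.
SGD_CONFIG = {
    "lr": 0.1,
    "weight_decay": 5e-4,
    "momentum": 0.9,
    "minibatch_size": 128,
}

# AdamW settings quoted in the Experiment Setup row.
ADAMW_CONFIG = {
    "lr": 1e-3,
    "weight_decay": 5e-2,
    "betas": (0.9, 0.999),
    "eps": 1e-8,
}

def svhn_lr(epoch: int, base_lr: float = 0.1) -> float:
    """Learning rate under the SVHN schedule quoted above: a factor-of-10
    reduction at each 60-epoch interval over 200 epochs of training."""
    return base_lr * 0.1 ** (epoch // 60)
```

In PyTorch itself, the same step schedule would usually be expressed with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=60, gamma=0.1)`.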