Balanced Data, Imbalanced Spectra: Unveiling Class Disparities with Spectral Imbalance

Authors: Chiraag Kaushik, Ran Liu, Chi-Heng Lin, Amrit Khera, Matthew Y Jin, Wenrui Ma, Vidya Muthukumar, Eva L Dyer

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We then study this phenomenon in 11 different state-of-the-art pretrained encoders, and show how our proposed framework can be used to compare the quality of encoders, as well as evaluate and combine data augmentation strategies to mitigate the issue. (The per-class spectrum sketch after the table illustrates this framework.)
Researcher Affiliation | Collaboration | 1 Georgia Institute of Technology, Georgia, USA. 2 Samsung Research. 3 Stanford University, California, USA.
Pseudocode | No | The paper describes an ensembling method with numbered steps, but these are presented in paragraph form rather than as a formal pseudocode or algorithm block.
Open Source Code | Yes | Code can be found at https://github.com/nerdslab/SpectraImbalance.
Open Datasets | Yes | For all experiments, we use the standard ImageNet ILSVRC 2012 dataset (Deng et al., 2009), which contains C = 1000 object classes with an average of 1281/50 training/validation images per class. (A loading sketch follows the table.)
Dataset Splits | Yes | For all experiments, we use the standard ImageNet ILSVRC 2012 dataset (Deng et al., 2009), which contains C = 1000 object classes with an average of 1281/50 training/validation images per class.
Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory) used to run the experiments are mentioned in the paper.
Software Dependencies | No | The paper mentions software such as Torchvision and timm, but does not provide specific version numbers for these or other dependencies.
Experiment Setup | Yes | In each of the three spectral imbalance settings, we apply Theorem 1 with π_y = 0.5, overparameterization ratio δ = 2, regularization parameter r = 0.5, and the squared hinge loss L(t) = max(0, 1 − t)^2. The scalar min-max problem is solved using gradient descent/ascent with learning rate 0.01. (See the solver sketch after the table.)
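
The framework quoted in the Research Type row compares the eigenspectra of class-conditional feature covariance matrices produced by a pretrained encoder. The following is a minimal, self-contained sketch of that idea, not the authors' pipeline from the repository above: the choice of resnet18 and the random stand-in batches are assumptions made so the snippet runs without a dataset.

```python
# Sketch of the spectral-imbalance idea: compare eigenspectra of
# class-conditional feature covariances from a pretrained encoder.
# Illustrative only; encoder choice and stand-in data are assumptions.
import numpy as np
import timm
import torch

def class_spectrum(features: np.ndarray) -> np.ndarray:
    """Eigenvalues (descending) of one class's feature covariance matrix."""
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / max(len(features) - 1, 1)
    return np.linalg.eigvalsh(cov)[::-1]  # eigvalsh: symmetric PSD input

# num_classes=0 makes timm return pooled features instead of logits.
encoder = timm.create_model("resnet18", pretrained=True, num_classes=0)
encoder.eval()

# Stand-in for real per-class data: two equally sized random batches.
# With a real dataset, group encoder outputs by label instead.
with torch.no_grad():
    feats = {c: encoder(torch.randn(64, 3, 224, 224)).numpy() for c in (0, 1)}

spectra = {c: class_spectrum(f) for c, f in feats.items()}
# Even with balanced sample counts, the spectra can differ by class:
for c, s in spectra.items():
    print(f"class {c}: top-5 eigenvalues {s[:5]}")
```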
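
For the Open Datasets and Dataset Splits rows, the standard split can be loaded with Torchvision's built-in dataset class. A minimal sketch, assuming the ILSVRC 2012 archives have already been downloaded to a local path (the `root` value here is a placeholder):

```python
# Sketch: loading the ImageNet ILSVRC 2012 train/val splits with Torchvision.
# Assumes the official archives are already present under `root`.
from torchvision import datasets, transforms

tf = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
root = "/data/imagenet"  # placeholder location of the ILSVRC 2012 archives
train = datasets.ImageNet(root, split="train", transform=tf)  # ~1281 images/class
val = datasets.ImageNet(root, split="val", transform=tf)      # 50 images/class
print(len(train.classes))  # C = 1000 object classes
```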
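
The Experiment Setup row solves a scalar min-max problem from Theorem 1 by gradient descent/ascent with learning rate 0.01. The sketch below shows that solver pattern together with the squared hinge loss; the objective `f` is a hypothetical convex-concave stand-in, since the paper's actual Theorem 1 expression is not reproduced here.

```python
# Sketch of a scalar gradient descent/ascent loop (lr = 0.01), as described
# in the Experiment Setup row. The objective `f` is a placeholder that is
# convex in `a` and concave in `b`; it is NOT the paper's Theorem 1 objective.
import torch

def sq_hinge(t: torch.Tensor) -> torch.Tensor:
    """Squared hinge loss L(t) = max(0, 1 - t)^2."""
    return torch.clamp(1.0 - t, min=0.0) ** 2

def f(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in objective built from the squared hinge loss.
    return sq_hinge(a) + a * b - 0.5 * b ** 2

a = torch.tensor(0.0, requires_grad=True)  # minimization variable
b = torch.tensor(0.0, requires_grad=True)  # maximization variable
lr = 0.01

for _ in range(5000):
    val = f(a, b)
    ga, gb = torch.autograd.grad(val, (a, b))
    with torch.no_grad():
        a -= lr * ga  # gradient descent on the min variable
        b += lr * gb  # gradient ascent on the max variable

print(f"approximate saddle point: a={a.item():.4f}, b={b.item():.4f}")
```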