Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Simplicity Bias in 1-Hidden Layer Neural Networks

Authors: Depen Morwani, Jatin Batra, Prateek Jain, Praneeth Netrapalli

NeurIPS 2023 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Empirically, we show that models trained on real datasets such as Imagenet and Waterbirds-Landbirds indeed depend on a low dimensional projection of the inputs, thereby demonstrating SB on these datasets." |
| Researcher Affiliation | Collaboration | Depen Morwani (Department of Computer Science, Harvard University); Jatin Batra (School of Technology and Computer Science, Tata Institute of Fundamental Research, TIFR); Prateek Jain (Google Research); Praneeth Netrapalli (Google Research) |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the authors' source code is open-source or publicly available. |
| Open Datasets | Yes | "Empirically, we demonstrate LD-SB on three real world datasets: binary and multiclass version of Imagenette (Fast AI, 2021), waterbirds-landbirds (Sagawa et al., 2020a) as well as the ImageNet (Deng et al., 2009) dataset." |
| Dataset Splits | Yes | "For each of the runs, we tune the batch size, learning rate and weight decay using validation accuracy." |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for the experiments. It mentions using ImageNet-pretrained ResNet-50 models, but not the hardware used for training or inference. |
| Software Dependencies | No | The paper mentions general components such as SGD and a ResNet-50 model, but does not specify any programming languages, libraries, or frameworks with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | "Every model is trained for 20000 (100000 for Imagenet) steps with a warmup and cosine decay learning rate scheduler. For each of the runs, we tune the batch size, learning rate and weight decay using validation accuracy." Hyperparameter grid: batch size {128, 256}; learning rate, rich regime {0.5, 1.0} ({5.0, 10.0} for ImageNet, since the rich-regime learning rate scales with the hidden dimension); learning rate, lazy regime {0.01, 0.05}; weight decay {0, 1e-4}. |
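The training schedule quoted above (linear warmup followed by cosine decay over a fixed step budget) can be sketched as below. This is a minimal illustration, not the authors' code: the warmup length (`warmup_steps`) is an assumption, since the paper's quoted setup does not specify it, and `peak_lr` stands in for one of the tuned learning rates.

```python
import math

def lr_schedule(step, total_steps=20000, warmup_steps=1000, peak_lr=0.5):
    """Linear warmup then cosine decay, reaching ~0 at total_steps.

    warmup_steps is a hypothetical value; the paper does not state it.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr over the warmup phase.
        return peak_lr * step / warmup_steps
    # Fraction of the post-warmup budget consumed, in [0, 1].
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    # Cosine decay from peak_lr down to 0.
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For ImageNet runs, the source's setup would correspond to `total_steps=100000` and a larger `peak_lr` (5.0 or 10.0), with the rest of the grid searched over validation accuracy.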