Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Simplicity Bias in 1-Hidden Layer Neural Networks
Authors: Depen Morwani, Jatin Batra, Prateek Jain, Praneeth Netrapalli
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we show that models trained on real datasets such as ImageNet and Waterbirds-Landbirds indeed depend on a low dimensional projection of the inputs, thereby demonstrating SB on these datasets |
| Researcher Affiliation | Collaboration | Depen Morwani, Department of Computer Science, Harvard University; Jatin Batra, School of Technology and Computer Science, Tata Institute of Fundamental Research (TIFR); Prateek Jain, Google Research; Praneeth Netrapalli, Google Research |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the authors' source code is open-source or publicly available. |
| Open Datasets | Yes | Empirically, we demonstrate LD-SB on three real world datasets: binary and multiclass versions of Imagenette (Fast AI, 2021), waterbirds-landbirds (Sagawa et al., 2020a) as well as the ImageNet (Deng et al., 2009) dataset. |
| Dataset Splits | Yes | For each of the runs, we tune the batch size, learning rate and weight decay using validation accuracy. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. It mentions using "Imagenet pretrained Resnet-50 models" but not the hardware for training or inference. |
| Software Dependencies | No | The paper mentions general software components like "SGD" and using a "Resnet-50" model, but does not specify any programming languages, libraries, or frameworks with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | Every model is trained for 20000 (100000 for ImageNet) steps with a warmup and cosine decay learning rate scheduler. For each of the runs, we tune the batch size, learning rate and weight decay using validation accuracy. Below are the hyperparameter tuning details: Batch size: {128, 256}; Learning rate (rich regime): {0.5, 1.0} (for ImageNet, {5.0, 10.0}, as the learning rate in the rich regime needs to scale up with the hidden dimension); Learning rate (lazy regime): {0.01, 0.05}; Weight decay: {0, 1e-4} |
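The schedule and tuning grid reported in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code: the linear-warmup shape, the `warmup_steps` value, and a decay target of zero are all assumptions, since the paper only states that a warmup + cosine decay scheduler is used.

```python
import math
from itertools import product

def lr_at_step(step, total_steps, base_lr, warmup_steps=1000):
    """Warmup followed by cosine decay (warmup length and linear shape
    are assumptions; the source only names the scheduler type)."""
    if step < warmup_steps:
        # linear warmup from 0 up to base_lr
        return base_lr * step / warmup_steps
    # cosine decay from base_lr down to 0 over the remaining steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

# Tuning grid from the reported details (rich regime, non-ImageNet runs):
# batch size x learning rate x weight decay = 2 x 2 x 2 = 8 configurations.
grid = list(product([128, 256],    # batch size
                    [0.5, 1.0],    # learning rate (rich regime)
                    [0.0, 1e-4]))  # weight decay
```

Each of the 8 grid configurations would be trained for the stated 20000 steps (100000 for ImageNet) and selected by validation accuracy.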