On the generalization of learning algorithms that do not converge
Authors: Nisha Chandramoorthy, Andreas Loukas, Khashayar Gatmiry, Stefanie Jegelka
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We numerically validate the main ideas of Sections 3 and 4 on VGG16 and ResNet18 models trained on the CIFAR10 dataset (see Appendix D for further numerical results and [Chandramoorthy and Loukas, 2023] for the code). For all our experiments, ϕ_S is an SGD update with momentum 0.9, a fixed learning rate of 0.01, and a batch size of 128. In all figures, time indicates the number of epochs. We generate different versions of the training set S_p by corrupting CIFAR10's labels with probability p, with S_0 being the original CIFAR10 dataset. Figures 2 and 3 show results for p = 0, 0.1, 0.17, 0.25, and 0.5. Each line in Figure 3 is a sample mean over 10 random initializations. (See the label-corruption sketch below the table.) |
| Researcher Affiliation | Collaboration | Nisha Chandramoorthy, Institute for Data, Systems and Society, Massachusetts Institute of Technology (nishac@mit.edu); Andreas Loukas, Prescient Design, Genentech, Roche (andreas.loukas@roche.com); Khashayar Gatmiry, Electrical Engineering and Computer Science, Massachusetts Institute of Technology (gatmiry@mit.edu); Stefanie Jegelka, Electrical Engineering and Computer Science, Massachusetts Institute of Technology (stefje@mit.edu) |
| Pseudocode | No | The paper describes algorithms and mathematical formulations but does not include any explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | Yes | The paper points to [Chandramoorthy and Loukas, 2023] for the code. |
| Open Datasets | Yes | We numerically validate the main ideas of Sections 3 and 4 on VGG16 and ResNet18 models trained on the CIFAR10 dataset. |
| Dataset Splits | No | The paper mentions training and testing on the CIFAR10 dataset but does not specify details about validation splits, percentages, or methodology for creating training/validation/test sets. |
| Hardware Specification | No | The paper does not specify the exact hardware used for the experiments, such as specific GPU models, CPU types, or memory configurations. |
| Software Dependencies | No | The paper does not provide specific software dependency details with version numbers (e.g., Python, PyTorch, TensorFlow versions, or other library versions). |
| Experiment Setup | Yes | For all our experiments, ϕ_S is an SGD update with momentum 0.9, a fixed learning rate of 0.01, and a batch size of 128. (See the training-loop sketch below the table.) |
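
The label-corruption setup quoted in the Research Type row is straightforward to reproduce. Below is a minimal sketch, not the authors' released code, assuming PyTorch/torchvision and assuming that "corrupting with probability p" means independently resampling each label uniformly over the 10 classes; the helper `corrupt_labels` and the seeding scheme are hypothetical.

```python
# Minimal sketch of the corrupted training sets S_p (not the authors' code).
# Assumption: "corrupting CIFAR10's labels with probability p" means each
# label is independently resampled uniformly over the 10 classes with
# probability p; S_0 is the unmodified dataset.
import numpy as np
import torchvision

def corrupt_labels(labels, p, num_classes=10, seed=0):
    """Return a copy of `labels` where each entry is resampled with probability p."""
    rng = np.random.default_rng(seed)
    labels = np.array(labels)
    flip = rng.random(len(labels)) < p            # which labels to corrupt
    labels[flip] = rng.integers(0, num_classes, size=int(flip.sum()))
    return labels.tolist()

base = torchvision.datasets.CIFAR10(root="./data", train=True, download=True)
corrupted = {}
for p in (0.0, 0.1, 0.17, 0.25, 0.5):             # values used in Figures 2 and 3
    S_p = torchvision.datasets.CIFAR10(root="./data", train=True)
    S_p.targets = corrupt_labels(base.targets, p, seed=int(p * 100))
    corrupted[p] = S_p
```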
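The reported optimizer settings translate directly into a training loop. The sketch below is an assumed reconstruction, not the paper's implementation: the choice of torchvision's `vgg16` with `num_classes=10` stands in for the paper's unspecified VGG16 variant, and the transform and shuffling choices are assumptions.

```python
# Minimal sketch of the reported setup (not the authors' code): SGD with
# momentum 0.9, fixed learning rate 0.01, batch size 128. torchvision's
# vgg16 with num_classes=10 is an assumed stand-in for the paper's VGG16.
import torch
import torchvision
import torchvision.transforms as T

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=T.ToTensor()
)
loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

model = torchvision.models.vgg16(num_classes=10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = torch.nn.CrossEntropyLoss()

# One pass over the loader is one epoch; the paper's figures report time in
# epochs. Each optimizer.step() is one application of the SGD update ϕ_S.
model.train()
for x, y in loader:
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```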