How many classifiers do we need?
Authors: Hyunsuk Kim, Liam Hodgkinson, Ryan Theisen, Michael W. Mahoney
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our theoretical findings are supported by empirical results on several image classification tasks with various types of neural networks. |
| Researcher Affiliation | Collaboration | Hyunsuk Kim, Department of Statistics, University of California, Berkeley (hyskim7@berkeley.edu); Liam Hodgkinson, School of Mathematics and Statistics, University of Melbourne, Australia (lhodgkinson@unimelb.edu.au); Ryan Theisen, Harmonic Discovery (ryan@harmonicdiscovery.com); Michael W. Mahoney, ICSI, LBNL, and Dept. of Statistics, University of California, Berkeley (mmahoney@stat.berkeley.edu) |
| Pseudocode | No | The paper contains theoretical proofs and mathematical formulations but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not explicitly state that source code for the methodology is released, nor does it provide a link to a code repository. |
| Open Datasets | Yes | ResNet18 trained on CIFAR-10 [Kri09] with various sets of hyper-parameters... MobileNet [How17] is trained and tested on the MNIST [Den12] dataset. |
| Dataset Splits | No | The paper mentions training on CIFAR-10 with a train set size of 50,000 and testing on out-of-sample CIFAR-10 and CIFAR-10.1, but does not explicitly provide details about a validation dataset split. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers, such as specific programming language versions or library versions (e.g., PyTorch version). |
| Experiment Setup | Yes | On the CIFAR-10 [Kri09] train set (size 50,000), the following models were trained for 100 epochs with a learning rate starting at 0.05. For models trained with learning rate decay, the learning rate was set to 0.005 after epoch 50 and to 0.0005 after epoch 75. For each hyperparameter combination, 5 classifiers were trained; the five classifiers differ in weight initialization and in the randomized batches used during training. ResNet18: every (width, batch size) combination with width in 4, 8, 16, 32, 64, 128 and batch size in 16, 128, 256, 1024, with learning rate decay; additional batch sizes of 64, 180, 364 without learning rate decay (a hedged training sketch follows below). |
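
To make the Experiment Setup row concrete, below is a minimal sketch of the described training recipe. It is not the authors' code (none is released); it assumes PyTorch with torchvision's CIFAR-10 loader, SGD, and a hypothetical width-configurable builder `make_resnet18(width, num_classes)`. Only the epoch count, the 0.05 → 0.005 → 0.0005 decay schedule, and the hyperparameter grid follow the paper's description.

```python
import torch
import torchvision
import torchvision.transforms as T

def lr_at_epoch(epoch, decay=True):
    """Stepwise schedule from the setup description: start at 0.05,
    drop to 0.005 after epoch 50 and to 0.0005 after epoch 75 when decay is on."""
    if not decay or epoch < 50:
        return 0.05
    if epoch < 75:
        return 0.005
    return 0.0005

def train_one(model, batch_size, decay=True, epochs=100, device="cuda"):
    # CIFAR-10 train split of size 50,000, randomized batches per epoch.
    train_set = torchvision.datasets.CIFAR10(
        root="./data", train=True, download=True, transform=T.ToTensor())
    loader = torch.utils.data.DataLoader(train_set, batch_size=batch_size, shuffle=True)
    model = model.to(device)
    opt = torch.optim.SGD(model.parameters(), lr=0.05)
    loss_fn = torch.nn.CrossEntropyLoss()
    for epoch in range(epochs):
        for g in opt.param_groups:  # apply the stepwise learning-rate schedule
            g["lr"] = lr_at_epoch(epoch, decay)
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

# Hyperparameter grid from the table: 5 classifiers per (width, batch size) pair,
# differing in weight initialization and batch order via the seed.
widths = [4, 8, 16, 32, 64, 128]
batch_sizes = [16, 128, 256, 1024]
for w in widths:
    for bs in batch_sizes:
        for seed in range(5):
            torch.manual_seed(seed)
            model = make_resnet18(width=w, num_classes=10)  # hypothetical builder
            train_one(model, bs, decay=True)
```

The schedule is applied by mutating the optimizer's `param_groups` at the start of each epoch, which reproduces the stepwise decay described in the setup; the no-decay runs with batch sizes 64, 180, 364 would simply call `train_one(model, bs, decay=False)`.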