Building a stable classifier with the inflated argmax
Authors: Jake Soloff, Rina Barber, Rebecca Willett
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments. In this section, we evaluate our proposed pipeline, combining subbagging with the inflated argmax, with deep learning models and on a common benchmark data set. Data and models. We use Fashion-MNIST [XRV17], which consists of n = 60,000 training pairs (Xi, Yi), N = 10,000 test pairs (Xj, Yj), and L = 10 classes. For each data point (X, Y), X is a 28×28 grayscale image that pictures a clothing item, and Y ∈ [L] indicates the type of item, e.g., a dress, a coat, etc. The base model we use is a variant of LeNet-5, implemented in PyTorch [PGML19] tutorials as GarmentClassifier(). The base algorithm A trains this classifier using 5 epochs of stochastic gradient descent. Methods and evaluation. We compare four methods: |
| Researcher Affiliation | Academia | Jake A. Soloff, Department of Statistics, University of Chicago, Chicago, IL 60637, soloff@uchicago.edu; Rina Foygel Barber, Department of Statistics, University of Chicago, Chicago, IL 60637, rina@uchicago.edu; Rebecca Willett, Departments of Statistics and Computer Science, NSF-Simons National Institute for Theory and Mathematics in Biology, University of Chicago, Chicago, IL 60637, willett@uchicago.edu |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found. The paper describes its methods verbally and mathematically. |
| Open Source Code | Yes | Code to fully reproduce the experiment is available at https://github.com/jake-soloff/stable-argmax-experiments. ... We attach our code in our submission to Open Review, and we will deanonymize the link to the Github repository after the review process. |
| Open Datasets | Yes | We use Fashion-MNIST [XRV17], which consists of n = 60,000 training pairs (Xi, Yi), N = 10,000 test pairs (Xj, Yj), and L = 10 classes. |
| Dataset Splits | No | The paper mentions 60,000 training pairs and 10,000 test pairs, but no explicit validation set split or methodology for it. |
| Hardware Specification | No | Training all of the models for this experiment took a total of four hours on 10 CPUs running in parallel on a single computing cluster. The statement mentions "10 CPUs" and "single computing cluster" but lacks specific models (e.g., Intel Xeon, specific series), memory, or clock speed to be considered a detailed specification. |
| Software Dependencies | No | The base model we use is a variant of LeNet-5, implemented in PyTorch [PGML19] tutorials as GarmentClassifier(). PyTorch is mentioned, but no specific version number is given. |
| Experiment Setup | Yes | The base algorithm A trains this classifier using 5 epochs of stochastic gradient descent. ... The ε-inflated argmax of the base learning algorithm A with tolerance ε = 0.05. ... The argmax of the subbagged algorithm Ãm, with B = 1,000 bags of size m = n/2 ... and tolerance ε = 0.05. |
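The reported setup (subbagging with B = 1,000 bags of size m = n/2, combined with an ε-inflated argmax at ε = 0.05) can be sketched as below. This is a hedged illustration, not the authors' code: `inflated_argmax` here uses a simplified "within ε of the top score" rule rather than the paper's exact geometric inflation of the argmax regions, and `fit_predict` is a stand-in for training the LeNet-5 variant on a bag.

```python
import numpy as np

def inflated_argmax(scores, eps=0.05):
    """Set-valued prediction: every label whose score is within eps of the max.

    NOTE: simplified surrogate for the paper's epsilon-inflated argmax;
    it illustrates the set-valued output, not the exact operator.
    """
    scores = np.asarray(scores, dtype=float)
    return set(np.flatnonzero(scores >= scores.max() - eps))

def subbagged_scores(fit_predict, X_train, y_train, x_test,
                     n_labels, B=1000, m=None, seed=0):
    """Average a base learner's class scores over B bags of size m (default n/2).

    fit_predict(X_bag, y_bag, x_test) -> score vector of length n_labels
    (hypothetical interface standing in for the trained classifier).
    """
    rng = np.random.default_rng(seed)
    n = len(X_train)
    m = m if m is not None else n // 2
    avg = np.zeros(n_labels)
    for _ in range(B):
        # subbagging: sample m of n training points WITHOUT replacement
        idx = rng.choice(n, size=m, replace=False)
        avg += fit_predict(X_train[idx], y_train[idx], x_test)
    return avg / B
```

A prediction set for a test point is then `inflated_argmax(subbagged_scores(...), eps=0.05)`; when two classes score nearly identically, both are returned, which is what makes the pipeline stable to perturbations of the training set.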