Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Minimizing the Maximal Loss: How and Why
Authors: Shai Shalev-Shwartz, Yonatan Wexler
ICML 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we demonstrate several merits of our approach on the well-studied problem of face detection. ... We performed two experiments to highlight different properties of our algorithm. The first experiment shows that FOL is much faster than SGD, and this is reflected in both train and test errors. The second experiment compares FOL to the application of AdaBoost on top of the same base learner. |
| Researcher Affiliation | Collaboration | Shai Shalev-Shwartz EMAIL School of Computer Science and Engineering, The Hebrew University, Jerusalem, Israel. Yonatan Wexler EMAIL OrCam |
| Pseudocode | Yes | A pseudo-code of the resulting algorithm is given in Section 2.3. ... Focused Online Learning (FOL) ... Tree.initialize(m) ... Tree.sample(1/2) ... Tree.update(i_t, exp(η ℓ(w_t, x_{i_t}, y_{i_t}))) |
| Open Source Code | No | The paper does not provide any explicit statements about making the source code available, nor does it include links to a code repository. |
| Open Datasets | No | To create a dataset we downloaded 30k photos from Google Images that are tagged with "face". ... This yielded 28k positive examples and 246k negative examples. This set was then randomly mixed and split 9:1 for train and test sets. The paper describes the creation of a custom dataset but does not provide access information (link, DOI, or citation to a public dataset). |
| Dataset Splits | No | The paper mentions a 9:1 train/test split, but it does not specify a separate validation set or its corresponding split percentage or count. |
| Hardware Specification | No | The paper mentions training a convolutional neural network using SGD but does not specify any hardware details such as GPU models, CPU types, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper mentions using 'SGD algorithm with Nesterov’s momentum' and 'logistic loss' but does not provide specific software names with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | The parameters we used are a batch size of 64, an ℓ2 regularization of 0.0005, a momentum parameter of 0.9, and a learning rate of η_t = 0.01(1 + 0.0001t)^(-0.75). We used the logistic loss as a surrogate loss for the classification error. |
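The quoted FOL pseudocode maintains a per-example weight, samples hard examples in proportion to exp(η·loss) via a tree supporting O(log m) sampling and updates, and mixes in uniform sampling (the `Tree.sample(1/2)` call). A minimal Python sketch of that scheme follows; the class and function names, the SGD-style inner step, and the `grad`/`loss` callbacks are illustrative assumptions, not the authors' implementation:

```python
import math
import random

class SampleTree:
    """Segment tree over m leaves: O(log m) proportional sampling
    and multiplicative weight updates (sketch)."""
    def __init__(self, m):
        self.m = m
        self.w = [1.0] * (2 * m)  # leaves live at indices m..2m-1
        for i in range(m - 1, 0, -1):
            self.w[i] = self.w[2 * i] + self.w[2 * i + 1]

    def sample(self, eps):
        # With probability eps sample uniformly; otherwise sample
        # leaf i with probability w_i / sum(w) by descending the tree.
        if random.random() < eps:
            return random.randrange(self.m)
        i = 1
        while i < self.m:
            left = self.w[2 * i]
            i = 2 * i if random.random() < left / self.w[i] else 2 * i + 1
        return i - self.m

    def update(self, i, factor):
        # Multiply leaf i's weight and repair sums up to the root.
        i += self.m
        self.w[i] *= factor
        i //= 2
        while i >= 1:
            self.w[i] = self.w[2 * i] + self.w[2 * i + 1]
            i //= 2

def fol(examples, grad, loss, w0, eta=0.1, lr=0.01, T=1000):
    """Sketch of Focused Online Learning: focus updates on examples
    with large loss (hypothetical gradient step; names illustrative)."""
    tree = SampleTree(len(examples))        # Tree.initialize(m)
    w = list(w0)
    for _ in range(T):
        i = tree.sample(0.5)                # Tree.sample(1/2)
        x, y = examples[i]
        w = [wj - lr * gj for wj, gj in zip(w, grad(w, x, y))]
        tree.update(i, math.exp(eta * loss(w, x, y)))  # Tree.update(...)
    return w
```

Because `exp(η·ℓ) ≥ 1` for non-negative losses, leaf weights never vanish, so the descent in `sample` never divides by zero; the uniform-mixing probability of 1/2 matches the quoted `Tree.sample(1/2)`.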
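The setup row pins down the SGD hyperparameters (batch size 64, ℓ2 = 0.0005, momentum 0.9, polynomially decaying learning rate). A hedged sketch of that schedule and one Nesterov-momentum step follows; the exponent is read as −0.75 (the standard decaying form), and the exact update ordering is a common convention, not confirmed by the paper:

```python
def lr_schedule(t, lr0=0.01, b=1e-4, p=0.75):
    # eta_t = 0.01 * (1 + 0.0001 t)^(-0.75): polynomial decay,
    # starting at lr0 and shrinking as t grows.
    return lr0 * (1.0 + b * t) ** -p

def nesterov_step(w, v, grad_fn, t, mu=0.9, l2=0.0005):
    """One Nesterov-momentum step with L2 regularization (sketch):
    evaluate the gradient at the look-ahead point w + mu*v."""
    lr = lr_schedule(t)
    lookahead = [wj + mu * vj for wj, vj in zip(w, v)]
    g = [gj + l2 * wj for gj, wj in zip(grad_fn(lookahead), lookahead)]
    v = [mu * vj - lr * gj for vj, gj in zip(v, g)]
    w = [wj + vj for wj, vj in zip(w, v)]
    return w, v
```

For example, on the quadratic loss 0.5·w² (gradient `lambda w: list(w)`), a single step from w = [1.0] moves the weight toward zero.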