LegendreTron: Uprising Proper Multiclass Loss Learning
Authors: Kevin H. Lam, Christian Walder, Spiridon Penev, Richard Nock
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Tested on a benchmark of domains with up to 1,000 classes, our experimental results show that our method consistently outperforms the natural multiclass baseline under a t-test at 99% significance on all datasets with greater than 10 classes. (A sketch of such a significance test appears after this table.) |
| Researcher Affiliation | Collaboration | ¹School of Mathematics & Statistics, UNSW Sydney, Australia; ²Google Research; ³ANU College of Engineering, Computing and Cybernetics, The Australian National University, Australia; ⁴UNSW Data Science Hub (uDASH), UNSW Sydney, Australia. |
| Pseudocode | Yes | Algorithm 1 describes LEGENDRETRON in detail. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of the source code for the methodology described, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | For the three MNIST-like datasets (LeCun et al., 2010; Xiao et al., 2017; Clanuwat et al., 2018)... We also compared LEGENDRETRON against multinomial logistic regression on 15 datasets that are publicly available from the LIBSVM library (Chang & Lin, 2011), the UCI machine learning repository (Asuncion & Newman, 2007; Dua & Graff, 2017), and the Statlog project (King et al., 1995). |
| Dataset Splits | No | The paper states: "each run randomly splits the dataset into 80% training and 20% testing sets." While it specifies training and testing splits, it does not mention a validation split or provide details for one. (A sketch of this split appears after this table.) |
| Hardware Specification | Yes | Average GPU run times on a P100 for the MNIST experiments in Table 2 were 2.32 and 2.12 hours for VGGTRON and VGG, respectively. |
| Software Dependencies | No | The paper states: "All experiments were performed using PyTorch (Paszke et al., 2019)". While PyTorch is mentioned and cited, a specific version number for PyTorch itself, or for other key software dependencies like Python or CUDA, is not provided. |
| Experiment Setup | Yes | For our experiments, we set softmax+ as the squashing function u for both LEGENDRETRON and multinomial logistic regression. We defer the full experimental details to Appendix I. Appendix I.1 includes a "Network Architecture and Optimisation Details" table specifying the learning rate (α), weight decay (λ), multiplicative rate of decay (γ), epochs, and batch size for each dataset. (A sketch of a softmax+-style squashing function appears after this table.) |
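The significance claim quoted in the Research Type row rests on a t-test comparing LEGENDRETRON against the multinomial logistic regression baseline across repeated runs. The paper does not publish its test script, so the following is a minimal sketch of how such a comparison could be run; the accuracy arrays, the paired-test choice, and the 0.01 threshold for 99% significance are illustrative assumptions.

```python
# Illustrative sketch only: the paper reports a t-test at 99% significance
# but does not publish the test code. Accuracy values below are placeholders.
import numpy as np
from scipy import stats

# Hypothetical per-run test accuracies over repeated 80/20 splits.
legendretron_acc = np.array([0.912, 0.908, 0.915, 0.910, 0.913])
logistic_acc     = np.array([0.897, 0.894, 0.901, 0.896, 0.899])

# Paired t-test (assumption: both methods are evaluated on the same splits).
t_stat, p_value = stats.ttest_rel(legendretron_acc, logistic_acc)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

# Significant at the 99% level if p < 0.01.
print("significant at 99%:", p_value < 0.01)
```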
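The Dataset Splits row quotes an 80% training / 20% testing random split per run, with no validation set. A minimal sketch of that protocol, assuming scikit-learn and a per-run seed (the seeding scheme is an assumption; the paper does not specify one):

```python
# Minimal sketch of the quoted 80%/20% random split; the per-run seed
# handling is an assumption, as the paper does not specify it.
from sklearn.model_selection import train_test_split

def split_run(X, y, run_id):
    # A fresh random split for each run; no validation set is carved out,
    # matching the quoted protocol.
    return train_test_split(X, y, test_size=0.2, random_state=run_id)

# Example usage with arrays X, y:
# X_train, X_test, y_train, y_test = split_run(X, y, run_id=0)
```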
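The Experiment Setup row names softmax+ as the squashing function u, i.e. a map from unconstrained predictions into the relative interior of the probability simplex. The paper defers its exact definition to the appendices, so the zero-padding construction below is an assumption shown purely to illustrate the idea in PyTorch, the framework the paper reports using:

```python
# Hedged sketch of a softmax+-style squashing function. The exact
# definition used in the paper is in its appendices; appending a fixed
# zero logit is an assumption, shown only to illustrate mapping
# R^(c-1) into the relative interior of the c-class simplex.
import torch

def softmax_plus(v: torch.Tensor) -> torch.Tensor:
    """Map a batch of (c-1)-dim predictions to strictly positive
    c-dim probability vectors by appending a fixed zero logit."""
    zeros = torch.zeros(v.shape[0], 1, dtype=v.dtype, device=v.device)
    return torch.softmax(torch.cat([v, zeros], dim=1), dim=1)

# Example: a batch of 4 predictions for a 10-class problem.
p = softmax_plus(torch.randn(4, 9))
assert torch.allclose(p.sum(dim=1), torch.ones(4))  # rows sum to one
```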