Long-Tailed Learning Requires Feature Learning
Authors: Thomas Laurent, James von Brecht, Xavier Bresson
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 6, we investigate empirically a few questions that we couldn't resolve analytically. In particular, our error bounds are restricted to the case in which a nearest neighbor classification rule is applied on top of the features; we provide empirical evidence in this last section that replacing the nearest neighbor classifier by a linear classifier leads to very minimal improvement. (A sketch of this nearest-neighbor versus linear-classifier comparison appears below the table.) |
| Researcher Affiliation | Academia | Loyola Marymount University (tlaurent@lmu.edu); National University of Singapore (xaviercs@nus.edu.sg) |
| Pseudocode | No | The paper describes the neural network architecture textually and provides a diagram (Figure 2), but it does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Codes are available at https://github.com/xbresson/Long_Tailed_Learning_Requires_Feature_Learning. |
| Open Datasets | No | The paper uses a custom-designed data model (described in Section 2) to generate synthetic data for its experiments. It does not use or provide access information for a pre-existing public dataset. |
| Dataset Splits | No | The paper specifies the generation of "A training set containing R · n_spl sentences" and "A test set containing 10,000 unfamiliar sentences" but does not mention a separate validation set or provide details about how the data is split for validation. |
| Hardware Specification | No | The paper states "Constructing each of these Gram matrices takes a few days on CPU" but does not specify any particular CPU model, GPU models, memory, or other detailed hardware specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using "SVC function of Scikit-learn Pedregosa et al. (2011), which itself relies on the LIBSVM library Chang & Lin (2011)". While it names the software, it does not provide specific version numbers for Scikit-learn or LIBSVM that were used in their experiments. |
| Experiment Setup | Yes | MLP 1: d_in = 150, d_hidden = 500, d_out = 10; MLP 2: d_in = 90, d_hidden = 2000, d_out = 1000... The learning rate is set to 0.01 (constant learning rate), and the batch size to 100... We chose C = 1... the parameter γ involved in the definition of the kernel was set to γ = 0.25 when n ∈ {1, 2} and to γ = 0.1 when n ∈ {3, 4, 5}. (Minimal sketches of the MLP and SVM configurations appear below the table.) |
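
The MLP dimensions, constant learning rate, and batch size quoted in the Experiment Setup row translate directly into a few lines of PyTorch. This is a minimal sketch, not the authors' released code: the ReLU activation and the plain SGD optimizer are assumptions, since the quoted text does not name them.

```python
import torch
import torch.nn as nn

def make_mlp(d_in, d_hidden, d_out):
    # Single-hidden-layer MLP; the ReLU activation is an assumption.
    return nn.Sequential(
        nn.Linear(d_in, d_hidden),
        nn.ReLU(),
        nn.Linear(d_hidden, d_out),
    )

mlp1 = make_mlp(d_in=150, d_hidden=500, d_out=10)    # MLP 1 from the table
mlp2 = make_mlp(d_in=90, d_hidden=2000, d_out=1000)  # MLP 2 from the table

# Constant learning rate 0.01 and batch size 100, as quoted; plain SGD is
# an assumption, since the quoted text does not name the optimizer.
optimizer = torch.optim.SGD(
    list(mlp1.parameters()) + list(mlp2.parameters()), lr=0.01
)
batch_size = 100
```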
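
The Software Dependencies and Experiment Setup rows likewise pin down the SVM configuration: Scikit-learn's SVC (backed by LIBSVM) with C = 1 and the quoted γ schedule. The sketch below uses a precomputed RBF-style Gram matrix, which is an assumption motivated by the Hardware Specification row's mention of Gram matrices; the data is a random placeholder.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

def fit_svc(X_train, y_train, n):
    # gamma = 0.25 for n in {1, 2} and gamma = 0.1 for n in {3, 4, 5},
    # matching the quoted settings; the RBF form of the kernel is an assumption.
    gamma = 0.25 if n in (1, 2) else 0.1
    gram = rbf_kernel(X_train, X_train, gamma=gamma)  # precomputed Gram matrix
    clf = SVC(C=1.0, kernel="precomputed").fit(gram, y_train)
    return clf, gamma

def predict_svc(clf, gamma, X_test, X_train):
    # Test-time Gram matrix between test and training points.
    return clf.predict(rbf_kernel(X_test, X_train, gamma=gamma))

# Placeholder data standing in for the paper's synthetic sentences.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 90)), rng.integers(0, 2, size=200)
clf, gamma = fit_svc(X_train, y_train, n=1)
```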
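
Finally, the comparison noted in the Research Type row, a nearest-neighbor rule versus a linear classifier on top of the same features, can be sketched as follows. The feature arrays here are random placeholders; the paper's actual features come from its trained network on the synthetic data model, and LogisticRegression as the linear classifier is an assumption.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

# Placeholder features; the paper's features come from its trained network.
rng = np.random.default_rng(0)
train_feats = rng.normal(size=(1000, 10))
train_labels = rng.integers(0, 10, size=1000)
test_feats = rng.normal(size=(200, 10))
test_labels = rng.integers(0, 10, size=200)

# Nearest neighbor classification rule (k = 1), as analyzed in the paper.
nn_clf = KNeighborsClassifier(n_neighbors=1).fit(train_feats, train_labels)

# Linear classifier on the same features, for comparison.
lin_clf = LogisticRegression(max_iter=1000).fit(train_feats, train_labels)

print("1-NN accuracy:  ", nn_clf.score(test_feats, test_labels))
print("linear accuracy:", lin_clf.score(test_feats, test_labels))
```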