Disentangling Sampling and Labeling Bias for Learning in Large-output Spaces
Authors: Ankit Singh Rawat, Aditya K Menon, Wittawat Jitkrittum, Sadeep Jayasumana, Felix Yu, Sashank Reddi, Sanjiv Kumar
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We empirically verify our findings on long-tail classification and retrieval benchmarks." and, from Section 5 (Experiments): "We now present experiments on benchmarks for both long-tail learning and retrieval, illustrating our main finding: existing negative sampling schemes, such as within-batch sampling with constant weighting, implicitly trade off performance on dominant versus rare labels." (A sketch of within-batch sampling appears after the table.) |
| Researcher Affiliation | Industry | "Google Research, New York, USA." |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link to the source code for the described methodology. |
| Open Datasets | Yes | "We present results on long-tailed (LT) versions of the CIFAR-100 and ImageNet datasets." and "In particular, we experiment with AMAZONCAT-13K and WIKILSHTC-325K datasets from the extreme classification literature (Agrawal et al., 2013; Bengio et al., 2019), where due to a large number of labels it is common to employ negative sampling. In addition, we also explored a small-scale dataset DELICIOUS from the repository to make our conclusions more general." |
| Dataset Splits | No | The paper mentions training on datasets and evaluating on a test set, but does not explicitly provide the training/validation/test split percentages or counts for all datasets in the main text. It states 'We report the test set balanced error,' which indicates a test set, but a clear validation split is not specified. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions training models (e.g., ResNet) but does not provide specific version numbers for software dependencies or libraries used. |
| Experiment Setup | Yes | "We use m = 32 negatives on CIFAR-100, and m = 512 negatives on ImageNet." and "We train a ResNet-56 for CIFAR and a ResNet-50 for ImageNet, using SGD with momentum; see Appendix E for details." (A hedged sketch of this setup appears below.) |
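
For context on the "within-batch sampling with constant weighting" scheme quoted in the Research Type row, here is a minimal PyTorch sketch of that baseline. It is our illustration, not the authors' code: the function name and the use of `torch.searchsorted` to remap labels are assumptions about one reasonable implementation.

```python
import torch
import torch.nn.functional as F

def in_batch_sampled_softmax_loss(scores, labels):
    """Within-batch negative sampling with constant weighting (sketch).

    scores: [batch, num_labels] logits over all labels.
    labels: [batch] integer label ids. Each example's label serves as a
    negative for the other examples, so the softmax runs only over the
    labels that occur in the batch.
    """
    batch_labels = labels.unique()                # sorted labels present in the batch
    sub_scores = scores[:, batch_labels]          # restrict logits to in-batch labels
    targets = torch.searchsorted(batch_labels, labels)  # position of each true label
    # Constant weighting: every in-batch negative contributes with weight 1,
    # i.e. a plain cross-entropy over the restricted label set.
    return F.cross_entropy(sub_scores, targets)
```

Because rare labels seldom appear in a batch, they are rarely used as negatives, which is one intuition for the head-versus-tail trade-off the paper studies.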
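The Experiment Setup row reports m uniformly sampled negatives per example and ResNet models trained with SGD plus momentum. The sketch below shows how such a setup could be wired together; the learning rate, the momentum value, and the simplification that a sampled negative may collide with the true label are our assumptions, not details confirmed by the paper.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet50

def uniform_negative_softmax_loss(scores, labels, m):
    """Softmax over the true label plus m uniformly sampled negatives (sketch)."""
    batch, num_labels = scores.shape
    # Sample m negatives per example; for simplicity we do not exclude
    # collisions with the true label.
    negs = torch.randint(num_labels, (batch, m), device=scores.device)
    pos = scores.gather(1, labels.unsqueeze(1))   # [batch, 1] positive logit
    neg = scores.gather(1, negs)                  # [batch, m] negative logits
    logits = torch.cat([pos, neg], dim=1)         # positive sits in column 0
    targets = torch.zeros(batch, dtype=torch.long, device=scores.device)
    return F.cross_entropy(logits, targets)

# Assumed optimizer settings; the paper states only "SGD with momentum"
# and defers details to its Appendix E.
model = resnet50(num_classes=1000)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
```

In the reported setup, m = 32 for CIFAR-100 (with a ResNet-56) and m = 512 for ImageNet (with a ResNet-50).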