Learning from Complementary Labels
Authors: Takashi Ishida, Gang Niu, Weihua Hu, Masashi Sugiyama
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we experimentally demonstrate the usefulness of the proposed methods. |
| Researcher Affiliation | Collaboration | Sumitomo Mitsui Asset Management, Tokyo, Japan; The University of Tokyo, Tokyo, Japan; RIKEN, Tokyo, Japan |
| Pseudocode | No | The paper does not contain pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide a direct link to its own source code or explicitly state that its code for the described methodology is being released. |
| Open Datasets | Yes | We used the MNIST hand-written digit dataset, downloaded from the website of the late Sam Roweis (with all patterns standardized to have zero mean and unit variance)... USPS can be downloaded from the website of the late Sam Roweis, and all other datasets can be downloaded from the UCI machine learning repository. |
| Dataset Splits | Yes | From the training dataset, we left out 25% of the data for validating hyperparameter based on (8) with the zero-one loss plugged in (9) or (10). (See the data-preparation sketch after this table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments, such as GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper mentions "Adam [15] was used for optimization" and "All experiments were carried out with Chainer [30]." However, it does not specify version numbers for these software components, which would be needed for reproducibility. |
| Experiment Setup | Yes | We added an ℓ2-regularization term, with the regularization parameter chosen from {10^-4, 10^-3, ..., 10^4}. Adam [15] was used for optimization with 5,000 iterations, with mini-batch size 100. We reported the test accuracy of the model with the best validation score out of all iterations. ... We used a one-hidden-layer neural network (d-3-1) with rectified linear units (ReLU) [24] as activation functions, and weight decay candidates were chosen from {10^-7, 10^-4, 10^-1}. (See the training-loop sketch after this table.) |
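
To make the data-handling rows above concrete, here is a minimal sketch of the standardization and 25% validation hold-out they describe. The function names, the per-feature standardization, and the random seed are illustrative assumptions; the paper does not specify how the split was randomized.

```python
import numpy as np

def standardize(X):
    """Scale each feature to zero mean and unit variance (as done for MNIST)."""
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    return (X - mu) / np.maximum(sigma, 1e-12)  # guard against constant features

def train_val_split(X, y, val_fraction=0.25, seed=0):
    """Hold out 25% of the training data for hyperparameter validation."""
    rng = np.random.default_rng(seed)  # seed is an assumption, not from the paper
    idx = rng.permutation(len(X))
    n_val = int(len(X) * val_fraction)
    tr, va = idx[n_val:], idx[:n_val]
    return X[tr], y[tr], X[va], y[va]
```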
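
The experiment-setup row likewise maps onto a short training loop. The paper ran its experiments in Chainer; the sketch below uses PyTorch purely for illustration. The input dimension, the synthetic data, the learning rate (Adam's default), and the logistic loss are all stand-ins: in particular, the loss is not the paper's complementary-label risk estimator, which is what the method actually minimizes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in data; the paper uses MNIST and UCI benchmarks.
d = 20                                       # input dimension (illustrative)
X = torch.randn(1000, d)
y = torch.randint(0, 2, (1000, 1)).float()

# One-hidden-layer network (d-3-1) with ReLU activations, per the table.
model = nn.Sequential(nn.Linear(d, 3), nn.ReLU(), nn.Linear(3, 1))

# Weight decay plays the role of the l2-regularization term; the paper tunes
# its strength over {10^-4, ..., 10^4} ({10^-7, 10^-4, 10^-1} for the
# neural-network runs).
optimizer = torch.optim.Adam(model.parameters(), weight_decay=1e-4)

# Stand-in loss: logistic loss keeps the sketch runnable, but the paper
# minimizes its complementary-label risk estimator instead.
criterion = nn.BCEWithLogitsLoss()

for it in range(5000):                       # 5,000 iterations, as reported
    idx = torch.randint(0, len(X), (100,))   # mini-batch size 100
    loss = criterion(model(X[idx]), y[idx])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Per the table, the reported figure would then be the test accuracy of the model with the best validation score across all iterations and hyperparameter candidates.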