Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Training deep neural-networks using a noise adaptation layer
Authors: Jacob Goldberger, Ehud Ben-Reuven
ICLR 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that this approach outperforms previous methods. |
| Researcher Affiliation | Academia | Engineering Faculty, Bar-Ilan University, Ramat-Gan 52900, Israel |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The network we implemented is publicly available. (Footnote 1: code available at https://github.com/udibr/noisy_labels) |
| Open Datasets | Yes | The MNIST is a database of handwritten digits, which consists of 28 × 28 images. The dataset has 60k images for training and 10k images for testing. |
| Dataset Splits | Yes | The dataset has 60k images for training and 10k images for testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'Adam optimizer' and 'ReLU' activation but does not specify versions for any software libraries, frameworks, or programming languages used. |
| Experiment Setup | Yes | We used a two hidden layer NN comprised of 500 and 300 neurons. The non-linear activation we used was ReLU and we used dropout with parameter 0.5. We trained the network using the Adam optimizer (Kingma & Ba (2014)) with default parameters, which we found to converge more quickly and effectively than SGD. We used a mini-batch size of 256. These settings were kept fixed for all the experiments described below. |
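
The quoted setup and split sizes can be collected into a small configuration record, which makes the implied arithmetic easy to check. The following is a minimal sketch, not the authors' code: the class and function names (`ExperimentSetup`, `dense_param_count`, `steps_per_epoch`) are hypothetical, the 10-way output assumes the ten MNIST digit classes, and the count covers only the plain fully connected layers, not the paper's noise adaptation layer.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Values quoted in the table above; field names are illustrative."""
    input_dim: int = 28 * 28           # flattened 28 x 28 MNIST image
    hidden_sizes: tuple = (500, 300)   # two hidden layers
    num_classes: int = 10              # assumption: ten digit classes
    dropout: float = 0.5
    batch_size: int = 256
    train_size: int = 60_000
    test_size: int = 10_000


def dense_param_count(setup: ExperimentSetup) -> int:
    """Weights plus biases of the fully connected layers only."""
    sizes = (setup.input_dim, *setup.hidden_sizes, setup.num_classes)
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def steps_per_epoch(setup: ExperimentSetup) -> int:
    """Mini-batches per pass over the training set (last batch may be partial)."""
    return math.ceil(setup.train_size / setup.batch_size)


setup = ExperimentSetup()
print(dense_param_count(setup))  # 545810
print(steps_per_epoch(setup))    # 235
```

Under these assumptions the base network has 545,810 trainable parameters, and one epoch over the 60k training images takes 235 mini-batches of size 256, the last of which is partial.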