Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Noise-Robust Learning from Multiple Unsupervised Sources of Inferred Labels
Authors: Amila Silva, Ling Luo, Shanika Karunasekera, Christopher Leckie (pp. 8315-8323)
AAAI 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments using nine real-world datasets for three different classification tasks (images, text and graph nodes). Our results show that our approach achieves notable improvements (e.g., 6.4% in accuracy) against state-of-the-art baselines while dealing with both instance-dependent and class-conditional noise in inferred label sources. |
| Researcher Affiliation | Academia | School of Computing and Information Systems The University of Melbourne Parkville, Victoria, Australia {amila.silva@student., ling.luo@, karus@, caleckie@}unimelb.edu.au |
| Pseudocode | No | The paper describes its methods through mathematical equations and figures but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or a link to its source code. It only mentions "Supplementary Material (Silva et al. 2021) provides the proofs and more details about the implementation of the loss terms". |
| Open Datasets | Yes | We select three widely-used datasets for each classification task (see Table 1). We randomly choose 75% of each dataset for training and the remaining 25% for testing. |
| Dataset Splits | No | The paper states, "We randomly choose 75% of each dataset for training and the remaining 25% for testing," but does not specify a separate validation split or strategy. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., specific GPU/CPU models, memory, or cloud instance types). |
| Software Dependencies | No | The paper mentions techniques and optimizers (e.g., "Adam optimizer"), but does not specify version numbers for any software dependencies (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | After performing a grid search, we set β to 0.5 (see Fig. 5 (a)).... We adopt the Adam optimizer and set the learning rate and batch size to 0.01 and 128 respectively. |
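
The split and hyperparameters quoted in the table can be sketched as follows. This is a minimal illustration, not the authors' code: the names `CONFIG` and `random_split` and the fixed `seed` are assumptions, since the paper reports neither a random seed nor a validation split.

```python
import random

# Values quoted in the Experiment Setup row; the dict name and keys are illustrative.
CONFIG = {
    "beta": 0.5,            # chosen by grid search according to the paper
    "optimizer": "adam",
    "learning_rate": 0.01,
    "batch_size": 128,
}


def random_split(n_items, train_fraction=0.75, seed=0):
    """Return (train_indices, test_indices) for a random split of range(n_items).

    The seed is an assumption added for repeatability; the paper does not report one.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    cut = int(n_items * train_fraction)
    return indices[:cut], indices[cut:]


# Example: a hypothetical dataset of 1000 items split 75% / 25%.
train_idx, test_idx = random_split(1000)
```

Recording the seed alongside the split fraction is one way a reader could reproduce the same train/test partition, which the table flags as underspecified.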