Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
CrossSplit: Mitigating Label Noise Memorization through Data Splitting
Authors: Jihye Kim, Aristide Baratin, Yan Zhang, Simon Lacoste-Julien
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on CIFAR-10, CIFAR-100, Tiny-ImageNet and mini-WebVision datasets demonstrate that our method can outperform the current state-of-the-art in a wide range of noise ratios. The project page is at https://rlawlgul.github.io/. |
| Researcher Affiliation | Collaboration | 1 Samsung Advanced Institute of Technology (SAIT), Suwon, South Korea; 2 Work done as a visiting researcher at SAIT AI Lab, Montreal, Canada; 3 SAIT AI Lab, Montreal, Canada; 4 Mila, Université de Montréal, Canada; 5 Canada CIFAR AI Chair. |
| Pseudocode | Yes | Algorithm 1 CrossSplit: Cross-split SSL training based on cross-split label correction |
| Open Source Code | Yes | The project page is at https://rlawlgul.github.io/. |
| Open Datasets | Yes | CIFAR-10/100 datasets (Krizhevsky et al., 2009) each contain 50K training and 10K testing 32 × 32 coloured images. Tiny-ImageNet (Le & Yang, 2015) is a subset of the ImageNet dataset with 100K 64 × 64 coloured images distributed within 200 classes. Mini-WebVision (Li et al., 2017a) contains 2.4 million images from the websites Google and Flickr and contains many naturally noisy labels. |
| Dataset Splits | Yes | Tiny-ImageNet (Le & Yang, 2015) is a subset of the ImageNet dataset with 100K 64 × 64 coloured images distributed within 200 classes. Each class has 500 training images, 50 test images and 50 validation images. |
| Hardware Specification | Yes | The following results are on CIFAR-10, reporting seconds / epoch (average of the next 5 epochs after warm-up), run on one RTX8000 GPU. |
| Software Dependencies | No | The paper mentions optimizers (SGD) and learning rate schedulers (Cosine Annealing) but does not provide specific version numbers for software libraries, programming languages, or frameworks used (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | For CIFAR-10 and CIFAR-100, we train each network using stochastic gradient descent (SGD) optimizer with momentum 0.9 and a weight decay of 0.0005. Training is done for 300 epochs with a batch size of 256. We set the initial learning rate as 0.1 and use a cosine annealing decay (Loshchilov & Hutter, 2017). Just like in (Li et al., 2020; Karim et al., 2022), a warm-up training on the entire dataset is performed for 10 and 30 epochs for CIFAR-10 and CIFAR-100, respectively. |
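
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration, and the cosine annealing decay can be written out explicitly. The following is a minimal sketch, not the authors' code: the names `CifarTrainingConfig` and `cosine_annealing_lr` are hypothetical, and the sketch assumes a minimum learning rate of 0, one schedule step per epoch, and a schedule that spans all 300 epochs, none of which is stated in the quoted text.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CifarTrainingConfig:
    """Values quoted in the Experiment Setup row for CIFAR-10/100."""
    momentum: float = 0.9
    weight_decay: float = 0.0005
    epochs: int = 300
    batch_size: int = 256
    initial_lr: float = 0.1
    warmup_epochs_cifar10: int = 10
    warmup_epochs_cifar100: int = 30
    # Assumption: the quoted text does not give a minimum learning rate.
    min_lr: float = 0.0


def cosine_annealing_lr(epoch: int, cfg: CifarTrainingConfig) -> float:
    """Learning rate at `epoch` under cosine annealing without restarts.

    Assumption: the schedule is stepped once per epoch over cfg.epochs.
    """
    progress = epoch / cfg.epochs
    cosine_factor = 0.5 * (1.0 + math.cos(math.pi * progress))
    return cfg.min_lr + (cfg.initial_lr - cfg.min_lr) * cosine_factor


if __name__ == "__main__":
    cfg = CifarTrainingConfig()
    for epoch in (0, 75, 150, 225, 300):
        print(f"epoch {epoch:3d}: lr = {cosine_annealing_lr(epoch, cfg):.5f}")
```

Under these assumptions the learning rate starts at 0.1, passes through half that value at the midpoint of training, and reaches the assumed minimum at epoch 300. Whether the warm-up epochs count toward the 300 is not specified in the quoted text, so the sketch leaves them as plain configuration fields.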