Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders
Authors: Yi Yu, Yufei Wang, Song Xia, Wenhan Yang, Shijian Lu, Yap-Peng Tan, Alex Kot
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the remarkable performance of our method across CIFAR-10, CIFAR-100, and a 100-class ImageNet-subset. |
| Researcher Affiliation | Collaboration | 1 Rapid-Rich Object Search Lab, Interdisciplinary Graduate Programme, Nanyang Technological University, Singapore 2 School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 3 Peng Cheng Laboratory, Shenzhen, China 4 School of Computer Science and Engineering, Nanyang Technological University, Singapore. |
| Pseudocode | Yes | Algorithm 1 Two-stage purification framework of unlearnable examples with D-VAE |
| Open Source Code | Yes | Code is available at https://github.com/yuyi-sd/D-VAE. |
| Open Datasets | Yes | We choose three commonly used datasets: CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), and a subset of ImageNet (Deng et al., 2009) with the first 100 classes. |
| Dataset Splits | No | The paper mentions 'clean training dataset T' and 'clean test dataset D', but does not explicitly detail train/validation/test splits with percentages or sample counts for reproducibility. It also does not mention a separate validation set being used for hyperparameter tuning. |
| Hardware Specification | Yes | It's important to note that the times are recorded using CIFAR-10 as the dataset, PyTorch as the platform, and a single Nvidia RTX 3090 as the GPU. |
| Software Dependencies | No | PyTorch as the platform. (Only 'PyTorch' is mentioned, without a version number or any other versioned software dependencies.) |
| Experiment Setup | Yes | For CIFAR-10, we use 60 epochs, while for CIFAR-100 and the ImageNet, 100 epochs are allowed. In all experiments, we use SGD optimizer with an initial learning rate of 0.1 and the CosineAnnealingLR scheduler, keeping a consistent batch size of 128. For D-VAE training on unlearnable CIFAR-10, we use a KLD target of 1.0 in the first stage and 3.0 in the second stage, with only a single 0.5 downsampling to preserve image quality. For the CIFAR-100, we maintain the same hyperparameters as CIFAR-10, except for setting kld2 to 4.5. For ImageNet-subset, which has higher-resolution images, we employ more substantial downsampling (0.125) in the first stage and set a KLD target of 1.5, while the second stage remains the same as with CIFAR. |
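
The Experiment Setup row can be summarized as a small configuration sketch. The block below is illustrative only and is not taken from the authors' repository: the `SetupSketch` class, its field names, and the `cosine_lr` helper are hypothetical. It assumes the learning rate is annealed once per epoch down to a minimum of 0, which the quoted text does not state. The second-stage KLD target for the ImageNet-subset is left as `None` because "the same as with CIFAR" does not say whether 3.0 or 4.5 is meant.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SetupSketch:
    """Hypothetical container for the hyperparameters quoted in the table."""
    epochs: int
    kld_target_stage1: float
    kld_target_stage2: Optional[float]  # None where the quoted text is ambiguous
    downsample_scale: float
    batch_size: int = 128     # "consistent batch size of 128"
    initial_lr: float = 0.1   # "initial learning rate of 0.1"


# Values copied from the quoted setup; nothing here is measured or inferred.
SETUPS = {
    "CIFAR-10": SetupSketch(60, 1.0, 3.0, 0.5),
    "CIFAR-100": SetupSketch(100, 1.0, 4.5, 0.5),
    "ImageNet-subset": SetupSketch(100, 1.5, None, 0.125),
}


def cosine_lr(epoch: int, total_epochs: int,
              base_lr: float = 0.1, min_lr: float = 0.0) -> float:
    """Closed-form cosine annealing (assumption: per-epoch steps, min_lr = 0)."""
    cos_term = math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + cos_term)


if __name__ == "__main__":
    for name, cfg in SETUPS.items():
        midpoint = cosine_lr(cfg.epochs // 2, cfg.epochs, cfg.initial_lr)
        print(f"{name}: epochs={cfg.epochs}, lr at midpoint ~ {midpoint:.4f}")
```

Under these assumptions the learning rate starts at 0.1, passes through roughly 0.05 halfway through training, and reaches 0 at the final epoch. The released code may use a different minimum or step granularity.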