Robust Training with Ensemble Consensus
Authors: Jisoo Lee, Sae-Young Chung
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we show (i) the effectiveness of three perturbations at removing noisy examples from small-loss examples and (ii) the comparison of LEC and other existing methods under various annotation noises. ... We study random label noise (Goldberger & Ben-Reuven, 2016; Ma et al., 2018), open-set noise (Wang et al., 2018b), and semantic noise. To generate these noises, we use MNIST (LeCun et al., 1998), CIFAR-10/100 (Krizhevsky et al., 2009) that are commonly used to assess the robustness. (See the label-noise sketch after the table.) |
| Researcher Affiliation | Academia | Jisoo Lee & Sae-Young Chung Korea Advanced Institute of Science and Technology Daejeon, South Korea {jisoolee,schung}@kaist.ac.kr |
| Pseudocode | Yes | Algorithm 1 LEC; Algorithm 2 LEC-full; Algorithm A1 LNEC; Algorithm A2 LSEC; Algorithm A3 LTEC; Algorithm A4 LTEC-full |
| Open Source Code | No | The paper does not provide an explicit statement about the release of its source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | We use MNIST (LeCun et al., 1998), CIFAR-10/100 (Krizhevsky et al., 2009) that are commonly used to assess the robustness. |
| Dataset Splits | No | The paper mentions that 'a small validation set may be available in reality' but does not provide specific details on the training, validation, and test splits used in its experiments (e.g., percentages or counts). It states only: 'For each benchmark dataset, we only corrupt its training set, while leaving its test set intact for testing.' |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions Keras and the Adam optimizer, but does not provide specific version numbers for these or other software dependencies used in the experiments. |
| Experiment Setup | Yes | All parameters are trained for 200 epochs with Adam (Kingma & Ba, 2014) with a batch size of 128. The initial learning rate α is set to 0.1. The learning rate is linearly annealed to zero during the last 120 epochs for MNIST and CIFAR-10, and during the last 100 epochs for CIFAR-100. The momentum parameters β1 and β2 are set to 0.9 and 0.999, respectively. β1 is linearly annealed to 0.1 during the last 120 epochs for MNIST and CIFAR-10, and during the last 100 epochs for CIFAR-100. ... Unless otherwise specified, Tw is set to 10, and M is set to 5 for random label noise and open-set noise, and 10 for semantic noise. (See the optimizer-schedule sketch after the table.) |
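
The Research Type row notes that the paper studies random label noise, open-set noise, and semantic noise, generated by corrupting the MNIST and CIFAR-10/100 training sets. As a point of reference, the sketch below shows one generic way to inject symmetric random label noise; the noise rate, the uniform flipping scheme, and the function name are illustrative assumptions here, not the paper's exact corruption procedure.

```python
import numpy as np

def corrupt_labels_symmetric(labels, num_classes, noise_rate, seed=0):
    """Flip a fraction `noise_rate` of labels uniformly to a different class.

    Generic symmetric label-noise injector; the paper's exact noise rates and
    flipping scheme (symmetric vs. asymmetric) may differ.
    """
    rng = np.random.default_rng(seed)
    noisy = np.array(labels, copy=True)
    n_flip = int(noise_rate * len(noisy))
    flip_idx = rng.choice(len(noisy), size=n_flip, replace=False)
    for i in flip_idx:
        # Replace the original label with a different class chosen uniformly.
        candidates = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(candidates)
    return noisy

# Example: corrupt part of the CIFAR-10 training labels, leaving the test set intact.
# y_train_noisy = corrupt_labels_symmetric(y_train, num_classes=10, noise_rate=0.4)
```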
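
The Experiment Setup row gives the optimization schedule: 200 epochs of Adam with batch size 128, initial learning rate 0.1, and linear annealing of both the learning rate (to zero) and β1 (to 0.1) over the last 120 epochs for MNIST/CIFAR-10 (100 for CIFAR-100). A minimal sketch of that schedule follows; it is written in PyTorch purely for concreteness (the paper itself reports using Keras), and the placeholder model and the skeleton training loop are assumptions, not the paper's networks.

```python
import torch
import torch.nn as nn

EPOCHS = 200          # total training epochs (from the paper)
ANNEAL_EPOCHS = 120   # annealing window for MNIST / CIFAR-10 (100 for CIFAR-100)
LR_INIT = 0.1         # initial learning rate alpha
BETA1_INIT, BETA1_FINAL = 0.9, 0.1
BETA2 = 0.999
BATCH_SIZE = 128

def schedule(epoch):
    """Return (lr, beta1) for the given epoch under linear annealing
    over the last ANNEAL_EPOCHS epochs."""
    start = EPOCHS - ANNEAL_EPOCHS
    if epoch < start:
        return LR_INIT, BETA1_INIT
    frac = (epoch - start + 1) / ANNEAL_EPOCHS
    lr = LR_INIT * (1.0 - frac)                               # annealed to zero
    beta1 = BETA1_INIT + (BETA1_FINAL - BETA1_INIT) * frac    # annealed to 0.1
    return lr, beta1

model = nn.Linear(32 * 32 * 3, 10)   # placeholder network; the paper uses CNNs
optimizer = torch.optim.Adam(model.parameters(), lr=LR_INIT,
                             betas=(BETA1_INIT, BETA2))

for epoch in range(EPOCHS):
    lr, beta1 = schedule(epoch)
    # Apply the per-epoch schedule to the optimizer's hyperparameters.
    for group in optimizer.param_groups:
        group["lr"] = lr
        group["betas"] = (beta1, BETA2)
    # ... run one epoch of mini-batch training with BATCH_SIZE examples per step ...
```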