Learning the Latent Causal Structure for Modeling Label Noise
Authors: Yexiong Lin, Yu Yao, Tongliang Liu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we report the experimental results of our method. We first compare the effectiveness of the proposed data-generating process with that of existing methods. We then compare the estimation error of noise transition matrices with that of other methods, and the classification performance of the proposed method with that of state-of-the-art methods on synthetic and real-world noisy datasets. |
| Researcher Affiliation | Academia | Yexiong Lin, Yu Yao, Tongliang Liu; Sydney AI Centre, The University of Sydney |
| Pseudocode | Yes | Algorithm 1 CSGN |
| Open Source Code | Yes | The code is available at: https://github.com/tmllab/2024_NeurIPS_CSGN. |
| Open Datasets | Yes | We empirically verify the performance of our method on three synthetic datasets, i.e., Fashion-MNIST [51], CIFAR-10 [26], and CIFAR-100 [26], and two real-world datasets, i.e., CIFAR-N [48] and WebVision [29]. |
| Dataset Splits | Yes | We follow the previous work [7] to train the model on the first 50 classes of the Google image subset and test the model on the WebVision validation set and the ImageNet ILSVRC12 validation set. (A hedged sketch of this class filter follows the table.) |
| Hardware Specification | Yes | We implement our algorithm using PyTorch and conduct experiments on eight RTX-3090 GPUs. |
| Software Dependencies | No | The paper mentions using "PyTorch" but does not specify a version number for it or any other software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | The initial learning rate for SGD was set at 0.02 and for Adam at 0.001. Our networks were trained for 200 epochs with a batch size of 64. Both learning rates were reduced by a factor of 10 after 100 epochs. For experiments on WebVision, we changed the weight decay of SGD to 0.001. The initial learning rate for SGD was set at 0.04 and for Adam at 0.004. Other optimizer parameters remained unchanged. Our networks were trained for 80 epochs with a batch size of 16. Both learning rates were reduced by a factor of 10 after 40 epochs. (A hedged optimizer/scheduler sketch follows the table.) |
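
The WebVision split quoted in the "Dataset Splits" row reduces to a simple class filter: keep only samples from the first 50 classes of the Google image subset. The sketch below is a minimal reading of that rule; the `(image_path, label)` sample format and the `load_google_subset` helper mentioned in the usage comment are hypothetical, and only the "first 50 classes" criterion comes from the paper.

```python
# Hedged sketch of the reported WebVision split: restrict training samples
# to the first 50 classes of the Google image subset. The (path, label)
# sample format is an assumption, not taken from the paper or its code.
from typing import List, Tuple

def first_k_classes(samples: List[Tuple[str, int]], k: int = 50) -> List[Tuple[str, int]]:
    """Keep only (image_path, label) pairs whose label falls in classes 0..k-1."""
    return [(path, label) for path, label in samples if label < k]

# Usage (hypothetical loader name):
# train_samples = first_k_classes(load_google_subset("webvision/google"), k=50)
# Evaluation is then run on the WebVision and ImageNet ILSVRC12 validation
# sets restricted to the same 50 classes.
```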
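
The numbers in the "Experiment Setup" row translate directly into a PyTorch optimizer and scheduler configuration. The sketch below is one possible reading, assuming the SGD and Adam optimizers drive disjoint parameter groups (e.g., classifier versus generative components); that split, the SGD momentum, and the non-WebVision weight decay are assumptions. Only the learning rates, epoch counts, batch sizes, the factor-of-10 decay at the halfway point, and the WebVision weight decay of 0.001 are taken from the paper.

```python
# Hedged sketch of the reported training configuration. Learning rates,
# epochs, batch sizes, the /10 decay at the halfway point, and the
# WebVision SGD weight decay (0.001) come from the paper; the parameter
# split, momentum, and default weight decay are assumptions.
from torch import optim
from torch.optim.lr_scheduler import MultiStepLR

def build_training_config(sgd_params, adam_params, dataset: str = "cifar"):
    if dataset == "webvision":
        sgd_lr, adam_lr, weight_decay = 0.04, 0.004, 1e-3   # wd=0.001 is stated
        epochs, batch_size, milestone = 80, 16, 40
    else:  # Fashion-MNIST / CIFAR-10 / CIFAR-100 / CIFAR-N
        sgd_lr, adam_lr, weight_decay = 0.02, 0.001, 5e-4   # this wd is assumed
        epochs, batch_size, milestone = 200, 64, 100

    sgd = optim.SGD(sgd_params, lr=sgd_lr, momentum=0.9,    # momentum assumed
                    weight_decay=weight_decay)
    adam = optim.Adam(adam_params, lr=adam_lr)

    # Both learning rates are reduced by a factor of 10 halfway through training.
    schedulers = [MultiStepLR(sgd, milestones=[milestone], gamma=0.1),
                  MultiStepLR(adam, milestones=[milestone], gamma=0.1)]
    return (sgd, adam), schedulers, epochs, batch_size

# Usage (hypothetical parameter split between two sub-networks):
# (sgd, adam), scheds, epochs, bs = build_training_config(
#     classifier.parameters(), generator.parameters(), dataset="webvision")
```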