Resolving Training Biases via Influence-based Data Relabeling
Authors: Shuming Kong, Yanyan Shen, Linpeng Huang
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on ten real-world datasets demonstrate RDIA outperforms the state-of-the-art data resampling methods and improves model's robustness against label noise. |
| Researcher Affiliation | Academia | Shuming Kong, Yanyan Shen, Linpeng Huang Department of Computer Science and Engineering Shanghai Jiao Tong University {leinuo123,shenyy,lphuang}@sjtu.edu.cn |
| Pseudocode | Yes | The algorithm of RDIA could be found in Appendix A |
| Open Source Code | Yes | Our code could be found in the https://github.com/Viperccc/RDIA. |
| Open Datasets | Yes | All the datasets could be found in https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/. |
| Dataset Splits | Yes | When training logistic regression, we randomly pick up 30% samples from the training set as the validation set. For different influence-based approaches, the training/validation/test sets are kept the same for fair comparison. ...When training deep models, due to the high time complexity of estimating influence functions, we randomly exclude 100 samples (1%) from the test sets of MNIST and CIFAR10 as the respective validation sets, and the remaining data is used for testing. |
| Hardware Specification | Yes | We implemented all the comparison methods by using their published source codes in Pytorch and ran all the experiments on a server with 2 Intel Xeon 1.7GHz CPUs, 128 GB of RAM and a single NVIDIA 2080 Ti GPU. |
| Software Dependencies | No | The paper mentions 'Pytorch' in Section 5.1 but does not provide a version number. It also names the 'Adam optimizer' and 'SGD optimizer', and the 'Newton-CG algorithm' and 'Stochastic estimation' for influence computation, but no specific software libraries with version numbers. |
| Experiment Setup | Yes | For logistic regression model, we select the regularization term C = 0.1 for fair comparison. We adopt the Adam optimizer with the learning rate of 0.001 to train the LeNet on MNIST. After calculating the influence functions and relabeling the identified harmful training samples using R, we reduce the learning rate to 10^-5 and update the models until convergence. For CIFAR10, we use the SGD optimizer with the learning rate of 0.01 and the momentum of 0.9 to train the CNN. ... The batch size is set to 64 in all the experiments and the hyperparameter α is tuned in [0, 0.001, 0.002, ..., 0.01] with the validation set for best performance. |
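
The quoted rows above pin down enough detail to sketch parts of the setup. First, the 30% validation split from the "Dataset Splits" row. This is a minimal sketch assuming a LIBSVM-format file downloaded from the datasets page linked above; the file name and seed are illustrative, not from the paper:

```python
# Sketch of the logistic-regression split: 30% of the training set
# is held out as validation. File name and seed are illustrative.
from sklearn.datasets import load_svmlight_file
from sklearn.model_selection import train_test_split

# Any LIBSVM-format training file from the datasets page cited above.
X, y = load_svmlight_file("a1a")  # hypothetical local path

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.30, random_state=0  # 30% validation, fixed seed
)
```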
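
Second, the optimizer settings from the "Experiment Setup" row translate directly into PyTorch. The placeholder modules below stand in for the paper's LeNet and CNN (the architectures are not reproduced here); only the hyperparameters come from the quote:

```python
import torch
import torch.nn as nn

# Placeholders standing in for the paper's LeNet (MNIST) and CNN (CIFAR10).
lenet = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
cnn = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

# Hyperparameters quoted from the paper.
opt_mnist = torch.optim.Adam(lenet.parameters(), lr=1e-3)
opt_cifar = torch.optim.SGD(cnn.parameters(), lr=1e-2, momentum=0.9)
batch_size = 64
alphas = [round(0.001 * i, 3) for i in range(11)]  # 0, 0.001, ..., 0.01

# After relabeling the identified harmful samples, the learning rate
# is reduced to 1e-5 and training continues until convergence.
for group in opt_mnist.param_groups:
    group["lr"] = 1e-5
```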
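
Finally, since RDIA's relabeling rests on influence functions, the sketch below illustrates the standard influence score in the sense of Koh & Liang (2017), which the Newton-CG and stochastic-estimation mentions above suggest the paper builds on: the effect of upweighting a training sample z on the validation loss, I(z, z_val) = -∇L(z_val)ᵀ H⁻¹ ∇L(z). This is not the paper's code; it computes the exact Hessian for a toy L2-regularized logistic regression on synthetic data instead of using Newton-CG or stochastic estimation:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
y = (X @ rng.normal(size=d) > 0).astype(float)  # synthetic labels
C = 0.1  # regularization strength, matching the paper's logistic setup

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Train the regularized logistic regression to (near) optimality.
w = np.zeros(d)
for _ in range(1000):
    grad = X.T @ (sigmoid(X @ w) - y) / n + C * w
    w -= 0.5 * grad

def grad_loss(x, label):
    # Gradient of the per-sample logistic loss (regularizer excluded).
    return (sigmoid(x @ w) - label) * x

# Exact Hessian of the full objective at the trained parameters.
p = sigmoid(X @ w)
H = (X.T * (p * (1.0 - p))) @ X / n + C * np.eye(d)

x_val, y_val = X[0], y[0]  # stand-in validation point
h_inv_g_val = np.linalg.solve(H, grad_loss(x_val, y_val))

# I(z, z_val) = -grad(z_val)^T H^{-1} grad(z); a large positive value
# means upweighting z raises validation loss, flagging z as harmful.
influence = -np.array([grad_loss(X[i], y[i]) @ h_inv_g_val for i in range(n)])
harmful = np.argsort(influence)[-10:]  # ten most harmful candidates
```

In RDIA's terms, samples flagged this way are the relabeling candidates; the paper's Appendix A gives the actual algorithm.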