Data Valuation using Reinforcement Learning
Authors: Jinsung Yoon, Sercan Arik, Tomas Pfister
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that DVRL yields superior data value estimates compared to alternative methods across numerous datasets and application scenarios. The corrupted sample discovery performance of DVRL is close to optimal in many regimes (i.e. as if the noisy samples were known a priori), and for domain adaptation and robust learning DVRL significantly outperforms state-of-the-art by 14.6% and 10.8%, respectively. |
| Researcher Affiliation | Industry | 1Google Cloud AI, Sunnyvale, California, USA. |
| Pseudocode | Yes | Algorithm 1 Pseudo-code of DVRL training |
| Open Source Code | Yes | Source code can be found at https://github.com/google-research/google-research/tree/master/dvrl. |
| Open Datasets | Yes | We consider 12 public datasets (3 tabular datasets, 7 image datasets, and 2 language datasets) to evaluate DVRL in comparison to multiple benchmark methods. The 3 tabular datasets are (1) Blog, (2) Adult, (3) Rossmann Store Sales; the 7 image datasets are (4) HAM10000, (5) MNIST, (6) USPS, (7) Flower, (8) Fashion-MNIST, (9) CIFAR-10, (10) CIFAR-100; the 2 language datasets are (11) Email Spam, (12) SMS Spam. Details can be found via the hyperlinks. |
| Dataset Splits | Yes | "We assume an availability of a (small) validation dataset D^v = {(x^v_k, y^v_k)}_{k=1}^L ~ P^t that comes from the target distribution P^t." and "We use 79% of the data as training, 1% as validation, and 20% as testing." |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., specific GPU or CPU models, or cloud instance types). |
| Software Dependencies | No | The paper mentions software components and models such as LightGBM, XGBoost, and Inception-v3, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | Algorithm 1 Pseudo-code of DVRL training: Inputs: Learning rates α, β > 0, mini-batch sizes B_p, B_s > 0, inner iteration count N_I > 0, moving average window T > 0, training dataset D, validation dataset D^v = {(x^v_k, y^v_k)}_{k=1}^L |
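The training loop named in Algorithm 1 can be sketched as a REINFORCE-style procedure: a data value estimator assigns selection probabilities to training samples, a predictor is fit on a sampled subset, and the estimator is updated using validation performance minus a moving-average baseline (the window T above) as the reward. The sketch below is a minimal toy illustration, not the paper's implementation: the linear estimator, the nearest-centroid predictor, and all hyperparameter values are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dvrl_train(x_train, y_train, x_val, y_val,
               outer_iters=50, batch_size=32, lr=0.1, window=10):
    """Toy REINFORCE-style training of a linear data value estimator."""
    d = x_train.shape[1]
    w = np.zeros(d + 1)   # estimator weights over [features, label]
    acc_hist = []         # history for the moving-average baseline

    def predictor_accuracy(idx):
        # Stand-in predictor: nearest class centroid fit on selected samples.
        cents = {}
        for c in np.unique(y_train[idx]):
            cents[c] = x_train[idx][y_train[idx] == c].mean(axis=0)
        if len(cents) < 2:
            return 0.5  # degenerate selection: only one class chosen
        classes = np.array(sorted(cents))
        C = np.stack([cents[c] for c in classes])
        pred = classes[np.argmin(((x_val[:, None] - C) ** 2).sum(-1), axis=1)]
        return (pred == y_val).mean()

    for _ in range(outer_iters):
        batch = rng.choice(len(x_train), size=batch_size, replace=False)
        feats = np.hstack([x_train[batch], y_train[batch, None]])
        p = sigmoid(feats @ w)             # per-sample selection probabilities
        s = rng.random(batch_size) < p     # sampled selection vector
        if s.sum() == 0:
            continue
        acc = predictor_accuracy(batch[s])
        baseline = np.mean(acc_hist[-window:]) if acc_hist else acc
        reward = acc - baseline            # advantage vs. moving average
        acc_hist.append(acc)
        # REINFORCE gradient of sum of log Bernoulli(s_i | p_i) w.r.t. w
        grad = feats.T @ (s.astype(float) - p)
        w += lr * reward * grad
    return w
```

After training, `sigmoid([x, y] @ w)` plays the role of the estimated data value of sample (x, y); high-value samples are kept, low-value ones are flagged as likely corrupted.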