Data Valuation using Reinforcement Learning

Authors: Jinsung Yoon, Sercan Arik, Tomas Pfister

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate that DVRL yields superior data value estimates compared to alternative methods across numerous datasets and application scenarios. The corrupted sample discovery performance of DVRL is close to optimal in many regimes (i.e., as if the noisy samples were known a priori), and for domain adaptation and robust learning DVRL significantly outperforms state-of-the-art by 14.6% and 10.8%, respectively."
Researcher Affiliation | Industry | "Google Cloud AI, Sunnyvale, California, USA."
Pseudocode | Yes | "Algorithm 1: Pseudo-code of DVRL training"
Open Source Code | Yes | "The source code can be found at https://github.com/google-research/google-research/tree/master/dvrl."
Open Datasets | Yes | "We consider 12 public datasets (3 tabular datasets, 7 image datasets, and 2 language datasets) to evaluate DVRL in comparison to multiple benchmark methods. The 3 tabular datasets are (1) Blog, (2) Adult, and (3) Rossmann Store Sales; the 7 image datasets are (4) HAM10000, (5) MNIST, (6) USPS, (7) Flower, (8) Fashion-MNIST, (9) CIFAR-10, and (10) CIFAR-100; the 2 language datasets are (11) Email Spam and (12) SMS Spam."
Dataset Splits | Yes | "We assume the availability of a (small) validation dataset $D^v = \{(x^v_k, y^v_k)\}_{k=1}^{L} \sim P^t$ that comes from the target distribution $P^t$." and "We use 79% of the data as training, 1% as validation, and 20% as testing." (A minimal sketch of this split appears below the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments (e.g., GPU or CPU models, or cloud instance types).
Software Dependencies | No | The paper mentions software components and models such as LightGBM, XGBoost, and Inception-v3, but does not provide version numbers for any software dependencies.
Experiment Setup | Yes | "Algorithm 1: Pseudo-code of DVRL training. Inputs: learning rates $\alpha, \beta > 0$; mini-batch sizes $B_p, B_s > 0$; inner iteration count $N_I > 0$; moving average window $T > 0$; training dataset $D$; validation dataset $D^v = \{(x^v_k, y^v_k)\}_{k=1}^{L}$." (A hedged sketch of this training loop appears below the table.)
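
To make the Dataset Splits row concrete, here is a minimal sketch of the 79% / 1% / 20% train / validation / test partition. The placeholder arrays and the use of scikit-learn's train_test_split are assumptions, not the paper's own data-loading code.

```python
# Minimal sketch of the 79% / 1% / 20% split quoted above.
# The (x, y) arrays are placeholder data, not one of the paper's datasets.
import numpy as np
from sklearn.model_selection import train_test_split

x = np.random.rand(1000, 10)       # placeholder features
y = np.random.randint(0, 2, 1000)  # placeholder labels

# First carve out the 20% test set; then take 1% of the full dataset
# (0.01 / 0.80 of the remainder) as the small validation set D^v,
# which leaves 79% for training.
x_rest, x_test, y_rest, y_test = train_test_split(
    x, y, test_size=0.20, random_state=0)
x_train, x_valid, y_train, y_valid = train_test_split(
    x_rest, y_rest, test_size=0.01 / 0.80, random_state=0)
```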
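
And here is a hedged sketch of the training loop named in the Experiment Setup row (Algorithm 1): the data value estimator scores a mini-batch, a selection vector is drawn Bernoulli-wise from those scores, a predictor is trained on the selected samples for $N_I$ inner steps, and the estimator gets a REINFORCE update against a moving-average baseline. Network sizes, the toy data, and the simplified estimator input (the paper also feeds marginal information from a validation-trained model) are assumptions, not the paper's exact configuration.

```python
# Hedged sketch of the DVRL training loop (Algorithm 1), assuming toy data
# and small MLPs; hyper-parameter names follow the algorithm's inputs.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy training / validation data standing in for D and D^v.
x_train, y_train = torch.randn(512, 10), torch.randint(0, 2, (512,))
x_valid, y_valid = torch.randn(64, 10), torch.randint(0, 2, (64,))

# Data value estimator h_phi: maps (x, y) to a selection probability.
value_estimator = nn.Sequential(nn.Linear(11, 32), nn.ReLU(),
                                nn.Linear(32, 1), nn.Sigmoid())
opt_phi = torch.optim.Adam(value_estimator.parameters(), lr=1e-3)  # beta

B_p, B_s, N_I, T = 128, 32, 10, 20  # Algorithm 1 hyper-parameters
baseline = 0.0                      # moving-average baseline (delta)

for step in range(200):
    # Sample a mini-batch of B_p training points and score them.
    idx = torch.randperm(len(x_train))[:B_p]
    xb, yb = x_train[idx], y_train[idx]
    inp = torch.cat([xb, yb.float().unsqueeze(1)], dim=1)
    probs = value_estimator(inp).squeeze(1)  # h_phi(x, y)
    sel = torch.bernoulli(probs).detach()    # selection vector s

    # Train a fresh predictor f_theta on the selected samples only.
    predictor = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
    opt_theta = torch.optim.SGD(predictor.parameters(), lr=1e-2)  # alpha
    loss_fn = nn.CrossEntropyLoss(reduction='none')
    for _ in range(N_I):
        j = torch.randperm(B_p)[:B_s]
        per_sample = loss_fn(predictor(xb[j]), yb[j])
        loss = (sel[j] * per_sample).mean()  # weight loss by selection
        opt_theta.zero_grad()
        loss.backward()
        opt_theta.step()

    # Validation performance on D^v is the reward signal.
    with torch.no_grad():
        acc = (predictor(x_valid).argmax(1) == y_valid).float().mean().item()
    reward = acc - baseline

    # REINFORCE: log-probability of the sampled selection vector.
    log_prob = (sel * torch.log(probs + 1e-8)
                + (1 - sel) * torch.log(1 - probs + 1e-8)).sum()
    opt_phi.zero_grad()
    (-reward * log_prob).backward()
    opt_phi.step()

    # Exponential moving average standing in for the window-T baseline.
    baseline = ((T - 1) / T) * baseline + acc / T
```

After training, `value_estimator` assigns each training point a selection probability that can be read off as its estimated data value, which is how the paper ranks samples for corrupted-sample discovery.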