Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Data Valuation using Reinforcement Learning
Authors: Jinsung Yoon, Sercan Arik, Tomas Pfister
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that DVRL yields superior data value estimates compared to alternative methods across numerous datasets and application scenarios. The corrupted sample discovery performance of DVRL is close to optimal in many regimes (i.e. as if the noisy samples were known apriori), and for domain adaptation and robust learning DVRL significantly outperforms state-of-the-art by 14.6% and 10.8%, respectively. |
| Researcher Affiliation | Industry | 1Google Cloud AI, Sunnyvale, California, USA. |
| Pseudocode | Yes | Algorithm 1 Pseudo-code of DVRL training |
| Open Source Code | Yes | source codes can be found in https://github.com/google-research/google-research/tree/master/dvrl |
| Open Datasets | Yes | We consider 12 public datasets (3 tabular datasets, 7 image datasets, and 2 language datasets) to evaluate DVRL in comparison to multiple benchmark methods. 3 tabular datasets are (1) Blog, (2) Adult, (3) Rossmann Store Sales; 7 image datasets are (4) HAM 10000, (5) MNIST, (6) USPS, (7) Flower, (8) Fashion-MNIST, (9) CIFAR-10, (10) CIFAR100; 2 language datasets are (11) Email Spam, (12) SMS Spam. Details can be found in the hyper-links. |
| Dataset Splits | Yes | "We assume an availability of a (small) validation dataset $D^v = \{(x^v_k, y^v_k)\}_{k=1}^{L} \sim P^t$ that comes from the target distribution $P^t$." and "We use 79% of the data as training, 1% as validation, and 20% as testing." |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., specific GPU or CPU models, or cloud instance types). |
| Software Dependencies | No | The paper mentions software components and models like 'Light GBM', 'XGBoost', and 'Inception-v3', but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | Algorithm 1 Pseudo-code of DVRL training: Inputs: Learning rates $\alpha, \beta > 0$, mini-batch sizes $B_p, B_s > 0$, inner iteration count $N_I > 0$, moving average window $T > 0$, training dataset $D$, validation dataset $D^v = \{(x^v_k, y^v_k)\}_{k=1}^{L}$ |
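The Algorithm 1 inputs quoted above (a data value estimator trained by reinforcement from validation performance, with a moving-average baseline) can be illustrated with a minimal sketch. This is not the paper's implementation: the toy data, the logistic predictor, the estimator parametrization `phi`, and all hyperparameter values below are hypothetical stand-ins chosen only to make the REINFORCE-style loop concrete and runnable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data; a hypothetical stand-in for the paper's datasets.
def make_data(n):
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)
    return X, y

X_tr, y_tr = make_data(200)   # training set D
X_val, y_val = make_data(100) # small validation set D^v

# Corrupt 20% of training labels so some samples deserve low value.
flip = rng.random(len(y_tr)) < 0.2
y_tr[flip] = 1.0 - y_tr[flip]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_predictor(X, y, w_sel):
    # Selection-weighted logistic regression trained by a few
    # gradient steps (the inner loop over N_I iterations).
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(50):
        p = sigmoid(Xb @ w)
        grad = Xb.T @ (w_sel * (p - y)) / max(w_sel.sum(), 1.0)
        w -= 0.5 * grad
    return w

def val_accuracy(w):
    Xb = np.hstack([X_val, np.ones((len(X_val), 1))])
    return float(((sigmoid(Xb @ w) > 0.5) == y_val).mean())

# Data value estimator: a linear model over (features, label, bias).
feats = np.hstack([X_tr, y_tr[:, None], np.ones((len(y_tr), 1))])
phi = np.zeros(feats.shape[1])

baseline = None  # moving-average baseline over recent rewards
for step in range(100):
    probs = sigmoid(feats @ phi)                       # selection probabilities
    sel = (rng.random(len(probs)) < probs).astype(float)  # Bernoulli sampling
    w = fit_predictor(X_tr, y_tr, sel)                 # train on selected data
    reward = val_accuracy(w)                           # validation performance
    baseline = reward if baseline is None else 0.9 * baseline + 0.1 * reward
    # REINFORCE update: gradient of log-prob of the sampled selection.
    logp_grad = feats.T @ (sel - probs)
    phi += 0.01 * (reward - baseline) * logp_grad

values = sigmoid(feats @ phi)  # estimated per-sample data values
```

The moving-average baseline plays the role of the window-`T` average in the quoted pseudocode: subtracting it from the reward reduces the variance of the REINFORCE gradient without biasing it.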