DeRDaVa: Deletion-Robust Data Valuation for Machine Learning
Authors: Xiao Tian, Rachael Hwee Ling Sim, Jue Fan, Bryan Kian Hsiang Low
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also empirically demonstrate the practicality of our solutions. |
| Researcher Affiliation | Academia | Xiao Tian1,2, Rachael Hwee Ling Sim1, Jue Fan1,2, Bryan Kian Hsiang Low1 1 Department of Computer Science, National University of Singapore 2 Department of Mathematics, National University of Singapore {xiao.tian, rachael.sim, jue.fan}@u.nus.edu, lowkh@comp.nus.edu.sg |
| Pseudocode | Yes | The justification and pseudocode for 012-MCMC algorithm are included in App. D.2. |
| Open Source Code | No | The paper does not include any statement or link providing access to the open-source code for the methodology described. |
| Open Datasets | Yes | Our experiments use the following [model-dataset] combinations: [NB-CC] Naive Bayes trained on Credit Card (Yeh and Lien 2009), [NB-Db] Naive Bayes trained on Diabetes (Carrion, Dustin 2022), [NB-Wd] Naive Bayes trained on Wind (Vanschoren, Joaquin 2014), [SVM-Db] Support Vector Machine trained on Diabetes, and [LR-Pm] Logistic Regression trained on Phoneme (Grin, Leo 2022). |
| Dataset Splits | No | While the paper mentions "validation accuracy" in a general definition, it does not specify the explicit training/validation/test splits used for its own experiments, such as percentages or sample counts for a validation set. |
| Hardware Specification | Yes | The experiments are performed on a 64-bit Linux server with 256GB RAM and two Intel Xeon E5-2690 CPUs. |
| Software Dependencies | Yes | We implemented our solutions using Python 3.9.7 with scikit-learn 1.0.2. |
| Experiment Setup | Yes | For all experiments, we used Adam optimizer with learning rate 0.001 and batch size 64. The model training terminates when the validation loss does not improve for 10 epochs or after a maximum of 100 epochs. |
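The stopping rule quoted in the Experiment Setup row can be made concrete with a minimal sketch. This is an illustrative reimplementation, not the authors' code: the function name, the dummy loss sequence, and the structure of the loop are assumptions; only the patience of 10 epochs and the 100-epoch cap come from the quoted setup.

```python
# Illustrative sketch (not the authors' code) of the stopping rule quoted
# above: training terminates when the validation loss does not improve for
# 10 consecutive epochs, or after a maximum of 100 epochs.

def train_with_early_stopping(val_losses, patience=10, max_epochs=100):
    """Return the 1-indexed epoch at which training stops.

    `val_losses` is any iterable of per-epoch validation losses; it stands
    in here for a real training loop (e.g. Adam, lr 0.001, batch size 64,
    as quoted in the setup row).
    """
    best = float("inf")
    epochs_without_improvement = 0
    epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        # Stop on either criterion, whichever fires first.
        if epochs_without_improvement >= patience or epoch >= max_epochs:
            return epoch
    return epoch  # iterable exhausted before either criterion fired


# Example: loss improves for 5 epochs, then plateaus -> stops at epoch 15
# (5 improving epochs + 10 epochs of patience).
losses = [1.0 - 0.1 * i for i in range(5)] + [0.6] * 95
print(train_with_early_stopping(losses))  # -> 15
```

With a loss that keeps improving, the 100-epoch cap fires instead, so the function returns 100.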