Automating Data Annotation under Strategic Human Agents: Risks and Potential Solutions
Authors: Tian Xie, Xueru Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on (semi-)synthetic and real data validate the theoretical findings. |
| Researcher Affiliation | Academia | Tian Xie, Computer Science and Engineering, The Ohio State University, Columbus, OH 43210, xie.1379@osu.edu; Xueru Zhang, Computer Science and Engineering, The Ohio State University, Columbus, OH 43210, zhang.12807@osu.edu |
| Pseudocode | Yes | Algorithm 1 retraining process |
| Open Source Code | Yes | https://github.com/osu-srml/Automating-Data-Annotation-under-Strategic-Human-Agents |
| Open Datasets | Yes | We conduct experiments on two synthetic (Uniform, Gaussian), one semi-synthetic (German Credit [19]), and one real dataset (Credit Approval [20]) to validate the dynamics of a_t, q_t, and the unfairness. [19] Hans Hofmann. Statlog (German Credit Data). UCI Machine Learning Repository, 1994. DOI: https://doi.org/10.24432/C5NC77. [20] J. R. Quinlan. Credit Approval. UCI Machine Learning Repository, 2017. DOI: https://doi.org/10.24432/C5FS30. |
| Dataset Splits | No | The paper mentions training and testing data when validating its theoretical findings, but does not explicitly state a distinct validation split or specify train/validation/test percentages or counts. |
| Hardware Specification | Yes | Generally, we run all experiments on a MacBook Pro with an Apple M1 Pro chip, 16 GB of memory, and Python 3.9.13. |
| Software Dependencies | No | The paper mentions 'Python 3.9.13' and 'SGDClassifier with logloss' but does not list version numbers for key libraries or software components (e.g., scikit-learn, PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | The decision-maker trains logistic regression models for all experiments using stochastic gradient descent (SGD) over T steps. All experiments are randomized with seed 42 and run for n rounds. We use SGDClassifier with logloss to fit the models. |