Participatory Personalization in Classification
Authors: Hailey Joren, Chirag Nagpal, Katherine A. Heller, Berk Ustun
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a comprehensive empirical study of participatory systems in clinical prediction tasks, benchmarking them with common approaches for personalization and imputation. Our results demonstrate that participatory systems can facilitate and inform consent while improving performance and data use across all groups who report personal data. |
| Researcher Affiliation | Collaboration | Hailey Joren (UC San Diego); Chirag Nagpal (Google Research); Katherine Heller (Google Research); Berk Ustun (UC San Diego) |
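| Pseudocode | Yes | Algorithm 1: Learning Participatory Systems. Input: $\mathcal{M} : \{h : \mathcal{X} \times \mathcal{G} \to \mathcal{Y}\}$, pool of candidate models. Input: $\mathcal{D}^{\mathrm{assign}} = \{(x_i, g_i, y_i)\}_{i=1}^{n^{\mathrm{assign}}}$, assignment dataset. Input: $\mathcal{D}^{\mathrm{prune}} = \{(x_i, g_i, y_i)\}_{i=1}^{n^{\mathrm{prune}}}$, pruning dataset. 1: $\mathbb{T} \leftarrow \mathrm{ViableTrees}(\mathcal{G}, \mathcal{D}^{\mathrm{assign}})$ (comment: $\lvert \mathbb{T} \rvert = 1$ for minimal & flat systems). 2: for $T \in \mathbb{T}$ do. 3: $T \leftarrow \mathrm{AssignModels}(T, \mathcal{M}, \mathcal{D}^{\mathrm{assign}})$ (comment: assign models). 4: $T \leftarrow \mathrm{PruneLeaves}(T, \mathcal{D}^{\mathrm{prune}})$ (comment: prune models). 5: end for. Output: $\mathbb{T}$, collection of participatory systems. |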
| Open Source Code | Yes | "We provide a Python library to build and evaluate participatory systems." and "We include code to reproduce these results in a Python library." |
| Open Datasets | Yes | Each dataset is de-identified and available to the public. The cardio_eicu, cardio_mimic, lungcancer datasets require access to public repositories listed under the references. |
| Dataset Splits | Yes | We split each dataset into a test sample (20% for evaluating out-of-sample performance) and a training sample (80% for training, pruning, assignment, and estimating gains to show users). |
| Hardware Specification | No | The paper does not provide specific hardware details used for running experiments. |
| Software Dependencies | No | The paper mentions providing a 'Python library' but does not specify any software dependencies with version numbers. |
| Experiment Setup | No | The paper describes varying model classes (logistic regression, random forests) and performance metrics (error rate, AUC) for evaluation but does not provide specific hyperparameters like learning rate, batch size, or optimizer settings for these models. |
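
The control flow of Algorithm 1 and the 80/20 split described in the table can be illustrated with a minimal Python sketch. This is not the authors' library: every name below (`split_80_20`, `viable_trees`, `assign_models`, `prune_leaves`, `learn_participatory_systems`) is hypothetical. The sketch assumes a single flat tree, selects models by error rate on the assignment data, and prunes a leaf when its model fails to beat the generic model on the pruning data. That pruning rule is a simplification chosen for illustration, and passing the training sample as both the assignment and the pruning dataset is likewise an assumption.

```python
import random

ROOT = None  # key for the generic model, used when no group is reported


def split_80_20(data, seed=0):
    """Shuffle and split into an 80% training and a 20% test sample."""
    rows = list(data)
    random.Random(seed).shuffle(rows)
    cut = len(rows) * 4 // 5
    return rows[:cut], rows[cut:]


def error_rate(model, data):
    """Fraction of (x, g, y) examples the model misclassifies (0.0 if empty)."""
    if not data:
        return 0.0
    return sum(model(x, g) != y for x, g, y in data) / len(data)


def viable_trees(groups, d_assign):
    """Hypothetical stand-in: one flat tree with a leaf per group seen in the data."""
    seen = {g for _, g, _ in d_assign}
    return [{ROOT: None, **{g: None for g in groups if g in seen}}]


def assign_models(tree, models, d_assign):
    """Give each node the candidate model with the lowest error on its examples."""
    assigned = {}
    for node in tree:
        subset = d_assign if node is ROOT else [r for r in d_assign if r[1] == node]
        assigned[node] = min(models, key=lambda h: error_rate(h, subset))
    return assigned


def prune_leaves(tree, d_prune):
    """Simplified rule (assumption): keep a leaf only if it beats the generic model."""
    generic = tree[ROOT]
    kept = {ROOT: generic}
    for node, model in tree.items():
        if node is ROOT:
            continue
        subset = [r for r in d_prune if r[1] == node]
        if error_rate(model, subset) < error_rate(generic, subset):
            kept[node] = model
    return kept


def learn_participatory_systems(models, groups, d_assign, d_prune):
    """Mirror the loop of Algorithm 1: build trees, assign models, prune leaves."""
    systems = []
    for tree in viable_trees(groups, d_assign):
        tree = assign_models(tree, models, d_assign)
        tree = prune_leaves(tree, d_prune)
        systems.append(tree)
    return systems


# Toy candidate models h(x, g) -> label (illustrative only).
def h_id(x, g):
    return x[0]


def h_flip(x, g):
    return 1 - x[0]


# Toy data: group "a" follows y = x, group "b" follows y = 1 - x.
data = [((i % 2,), ("a",), i % 2) for i in range(60)]
data += [((i % 2,), ("b",), 1 - i % 2) for i in range(40)]

train, test = split_80_20(data)
systems = learn_participatory_systems([h_id, h_flip], [("a",), ("b",)], train, train)
```

In this toy setup the generic model already fits group "a", so that leaf is pruned and only group "b" keeps a personalized model. A real system would replace the strict error comparison with whatever gain criterion the paper's library uses.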