reproducibilityindex.ai

Who’s Gaming the System? A Causally-Motivated Approach for Detecting Strategic Adaptation

Authors: Trenton Chang, Lindsay Warrenburg, Sae-Hwan Park, Ravi Parikh, Maggie Makar, Jenna Wiens

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We present empirical results in a synthetic data study validating the usage of causal effect estimation for gaming detection and show in a case study of diagnosis coding behavior in the U.S. that our approach highlights features associated with gaming. (Section 5, Empirical results & discussion)
Researcher Affiliation	Academia	1 University of Michigan; 2 University of Pennsylvania; 3 Emory University
Pseudocode	Yes	Figure 4: Pseudocode for causally-motivated gaming detection.
Open Source Code	Yes	Code to replicate our experiments will be made publicly available at https://github.com/MLD3/gaming_detection.
Open Datasets	Yes	Full synthetic data generation details are in Appendix C.1. Our cohort is drawn from a 20% sample of all U.S. Medicare beneficiaries provided to the authors under a data usage agreement with the Center for Medicare & Medicaid Services.
Dataset Splits	Yes	We perform a 7:3 dataset train-test split, training all models on the larger split. All rankings are computed on the test split. Early stopping is performed on a 20% validation split randomly sampled from the training set.
Hardware Specification	Yes	All experiments were run on either one Titan V or V100 GPU using 12.9GB of RAM as managed via a Slurm job submission system. Computing nodes had two 2.10GHz Intel Broadwell (Xeon E5-2620V4) processors each (16 cores total).
Software Dependencies	Yes	All code was written in Python 3.10.4 (license: PSF). All non-causal anomaly detection approaches were implemented using PyOD (license: BSD 2-clause) [56]. All neural networks were implemented in PyTorch 2.2.0 (license: custom BSD-style) [57], using Skorch 0.15.0 (license: BSD 3-clause) [58] as a wrapper. Metrics were computed using both Scikit-Learn 1.3.2 (license: BSD 3-clause) [59] and SciPy 1.11.4 (license: BSD 3-clause) [60]. For the fully synthetic data generation process, CVXPY 1.4.2 (license: Apache 2.0) [61] was used to solve each agent's utility maximization problem, and used in tandem with SCIP 9.0 (pyscipopt 5.0.0; license: Apache 2.0) for the matching approaches (formulated as mixed-integer programs) [62]. NumPy 1.22.3 (license: BSD-style) [63] and Pandas 2.0.3 (license: BSD 3-clause) [64] were used for data manipulation. Matplotlib 3.8.2 (empirical results; license: PSF-style) and Adobe Illustrator 2023 (overview figures; license: commercial, Named User Licensing) were used for figure generation. For the Medicare cohorts, we generated HCC (Hierarchical Condition Categories; used by the Center for Medicare Services) codes from raw diagnosis codes reported in claims data via HCCPy 0.1.9 (license: Apache 2.0).
Experiment Setup	Yes	Optimizer: SGD with learning rate 10^-2 and weight decay 10^-3. Learning rate schedule: We reduce the learning rate by a factor of 0.1 after 5 epochs of non-improvement with respect to the validation loss. Training length: A maximum of 1000 epochs, with early stopping (patience: 10 epochs) based on validation loss.
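
The Dataset Splits and Experiment Setup rows above can be illustrated with a minimal, dependency-free sketch. It is not the authors' code: the function names `split_indices` and `run_schedule`, the seed, and the exact counting convention for "epochs of non-improvement" are assumptions made for illustration. The sketch only replays a list of validation losses, so the SGD step and its weight decay of 10^-3 are not modeled. In PyTorch the same schedule would roughly correspond to `torch.optim.SGD` combined with `torch.optim.lr_scheduler.ReduceLROnPlateau`, whose patience counting may differ by one epoch from the convention used here.

```python
import random


def split_indices(n, seed=0):
    """Shuffle n row indices, hold out 30% for test, then 20% of the rest for validation."""
    rng = random.Random(seed)  # seed is a hypothetical choice, not from the paper
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = round(0.3 * n)
    test, train_full = idx[:n_test], idx[n_test:]
    n_val = round(0.2 * len(train_full))
    val, train = train_full[:n_val], train_full[n_val:]
    return train, val, test


def run_schedule(val_losses, lr=1e-2, factor=0.1, lr_patience=5,
                 stop_patience=10, max_epochs=1000):
    """Replay per-epoch validation losses; return (epochs_run, final_lr).

    Assumed convention: an epoch is "non-improving" if its validation loss
    is not strictly below the best loss seen so far.
    """
    best = float("inf")
    bad_lr = 0    # non-improving epochs since the last improvement or LR cut
    bad_stop = 0  # non-improving epochs since the last improvement
    epochs_run = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        epochs_run = epoch
        if loss < best:
            best = loss
            bad_lr = bad_stop = 0
            continue
        bad_lr += 1
        bad_stop += 1
        if bad_lr >= lr_patience:
            lr *= factor  # reduce the learning rate by a factor of 0.1
            bad_lr = 0
        if bad_stop >= stop_patience:
            break  # early stopping on validation loss
    return epochs_run, lr


if __name__ == "__main__":
    train, val, test = split_indices(1000)
    print(len(train), len(val), len(test))
    print(run_schedule([1.0, 0.9] + [0.95] * 20))
```

With 1000 rows, this yields 560 training, 140 validation, and 300 test indices. In the replayed loss sequence the last improvement occurs at epoch 2, so under the assumed convention the learning rate is cut at epochs 7 and 12, and training stops at epoch 12.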