Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Causal Strategic Learning with Competitive Selection

Authors: Kiet Q. H. Vo, Muneeb Aadil, Siu Lun Chau, Krikamol Muandet

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Lastly, we complement our theoretical results with simulation studies. Our results highlight not only the importance of causal modeling as a strategy to mitigate the effect of gaming, as suggested by previous work, but also the need of a benevolent regulator to enable it.
Researcher Affiliation | Academia | Kiet Q. H. Vo (1, 2), Muneeb Aadil (1, 2), Siu Lun Chau (1), Krikamol Muandet (1); (1) CISPA Helmholtz Center for Information Security, Saarbrücken, Germany; (2) Saarland University, Saarbrücken, Germany
Pseudocode | Yes | Algorithm 1: Mean-shift Linear Regression (MSLR)
Open Source Code | Yes | The code to reproduce our experiments is publicly available. https://github.com/muandet-lab/csl-with-selection
Open Datasets | No | Following Harris et al. (2022), we generate a synthetic college admission dataset.
Dataset Splits | No | No specific dataset split information (percentages, counts, or methodology for train/validation/test) is provided in the paper.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments are provided in the paper.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | Precisely, the student is admitted into college i if their prediction ŷit lies within the top ρ-percentile of all applicants where ρ ∈ [0, 1] and we set ρ = 0.5. Further discussion of this variant of ranking selection is included in Appendix F.1. As ranking selection (Definition 1) requires access to the distribution p(X t θit), we estimate it by simulating 1000 students in each round.
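
The ranking selection described in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' implementation (see the linked repository for that). It assumes that "top ρ-percentile" means a student is admitted when their predicted score is at or above the (1 − ρ)-quantile of scores from simulated applicants. The linear scoring rule, the Gaussian feature distribution, the parameter vector `theta`, and the helper names `ranking_threshold` and `admit` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def ranking_threshold(simulated_scores, rho=0.5):
    """Cutoff such that roughly a fraction `rho` of simulated scores lie at or above it.

    Assumption: "top rho-percentile" is read as "at or above the (1 - rho)-quantile".
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return float(np.quantile(simulated_scores, 1.0 - rho))


def admit(scores, threshold):
    """Boolean admission decision for each predicted score."""
    return np.asarray(scores) >= threshold


# Hypothetical round: simulate 1000 students to estimate the score distribution.
rng = np.random.default_rng(0)
theta = np.array([0.5, 1.5])             # hypothetical deployed parameters
features = rng.normal(size=(1000, 2))    # hypothetical applicant features
simulated_scores = features @ theta      # predicted scores for the simulated students

threshold = ranking_threshold(simulated_scores, rho=0.5)
admitted = admit(simulated_scores, threshold)
```

With ρ = 0.5, about half of the simulated students are admitted. Because the threshold comes from a finite sample of 1000 students, it is only an estimate of the population quantile and will vary from round to round.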