Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Carrot and Stick: Eliciting Comparison Data and Beyond
Authors: Yiling Chen, Shi Feng, Fang-Yi Yu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on two real-world datasets further support our theoretical discoveries. |
| Researcher Affiliation | Academia | Yiling Chen (Harvard University), Shi Feng (Harvard University), Fang-Yi Yu (George Mason University) |
| Pseudocode | Yes | Mechanism 1: BPP mechanism for comparison data. Input: Let $A$ be a collection of items, $E$ be an admissible assignment, and $\hat{\mathbf{s}}$ be agents' reports. For agent $i \in N$ with pair $e_i = (a_{u_i}, a_{v_i}) = (a, a')$ do: find $a'' \in A$ and two agents $j$ and $k$ so that $e_j = (a'', a')$ and $e_k = (a'', a)$, and pay agent $i$ $M_i(\hat{\mathbf{s}}) = U^{BPP}(\hat{s}_i, \hat{s}_j, \hat{s}_k) = \hat{s}_i \hat{s}_j - \hat{s}_i \hat{s}_k$. (2) |
| Open Source Code | Yes | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: The code is uploaded in the supplemental material. |
| Open Datasets | Yes | We test our mechanisms on real-world data (sushi preference dataset [26, 27] and Last.fm dataset [8]). |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits. The experiments focus on evaluating the payment mechanism on real-world data without explicit model training splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. The NeurIPS checklist states 'We believe the computer resources are not relevant to our main contributions.' |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiments. |
| Experiment Setup | Yes | For each agent $i$, we 1) randomly sample three items $a, a', a''$ and two agents $j, k$, 2) derive agent $i$'s comparison on the first two items $(a, a')$ from her ranking (and similarly for agent $j$'s comparison on $(a', a'')$, and agent $k$'s comparison on $(a, a'')$), 3) compute bonus-penalty payment on these three comparisons, 4) repeat the above procedure 100 times and pay agent $i$ with the average of those 100 trials. |
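
The experiment setup quoted in the table can be illustrated with a minimal Python sketch. It is not the authors' supplemental code, and it rests on several assumptions: comparisons are encoded as +1/-1, each agent reports truthfully from a full ranking (best item first), agent j's comparison is oriented as "a'' above a'" and agent k's as "a'' above a", and the payment is the bonus-penalty form (agreement with j minus agreement with k). The names `compare`, `bpp`, and `average_payment` are hypothetical.

```python
import random


def compare(ranking, x, y):
    """Return +1 if x is ranked above y in `ranking` (best first), else -1."""
    return 1 if ranking.index(x) < ranking.index(y) else -1


def bpp(s_i, s_j, s_k):
    """Bonus-penalty payment: bonus for agreeing with j, penalty for agreeing with k."""
    return s_i * s_j - s_i * s_k


def average_payment(i, rankings, items, trials=100, rng=random):
    """Average payment to agent i over `trials` random draws.

    `rankings` maps each agent to a list of items, best first (assumed format).
    """
    others = [agent for agent in rankings if agent != i]
    total = 0
    for _ in range(trials):
        # 1) sample three items and two other agents
        a, a1, a2 = rng.sample(items, 3)
        j, k = rng.sample(others, 2)
        # 2) derive each comparison from the corresponding agent's ranking
        s_i = compare(rankings[i], a, a1)   # i compares (a, a')
        s_j = compare(rankings[j], a2, a1)  # j compares (a'', a'), assumed orientation
        s_k = compare(rankings[k], a2, a)   # k compares (a'', a), assumed orientation
        # 3) bonus-penalty payment on the three comparisons
        total += bpp(s_i, s_j, s_k)
    # 4) pay the average over all trials
    return total / trials


if __name__ == "__main__":
    items = ["tuna", "eel", "shrimp", "egg"]  # illustrative items only
    rankings = {agent: list(items) for agent in ("i", "j", "k")}
    print(average_payment("i", rankings, items, rng=random.Random(0)))
```

Under these assumptions, when all agents share the same ranking every single trial pays either 0 or 2, so the average lies between 0 and 2. With heterogeneous rankings, individual trials can also pay -2.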