Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
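The validation step described above amounts to comparing LLM-produced labels against manual labels, variable by variable. A minimal sketch of that comparison (the data structures and label values here are illustrative assumptions, not the actual pipeline from [1]):

```python
from collections import Counter

def per_variable_accuracy(llm_labels, manual_labels):
    """Fraction of papers where the LLM label matches the manual label,
    computed separately for each reproducibility variable."""
    correct = Counter()
    total = Counter()
    for paper_id, manual in manual_labels.items():
        llm = llm_labels.get(paper_id, {})
        for variable, gold in manual.items():
            total[variable] += 1
            if llm.get(variable) == gold:
                correct[variable] += 1
    return {v: correct[v] / total[v] for v in total}

# Hypothetical labels for two papers
manual = {"p1": {"Pseudocode": "Yes", "Open Source Code": "No"},
          "p2": {"Pseudocode": "No",  "Open Source Code": "No"}}
llm    = {"p1": {"Pseudocode": "Yes", "Open Source Code": "Yes"},
          "p2": {"Pseudocode": "No",  "Open Source Code": "No"}}

print(per_variable_accuracy(llm, manual))
# {'Pseudocode': 1.0, 'Open Source Code': 0.5}
```

Per-variable accuracy (rather than a single aggregate) matters here because disagreement rates can differ sharply across variables.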
Multiagent Evaluation Mechanisms
Authors: Tal Alon, Magdalen Dobson, Ariel Procaccia, Inbal Talgam-Cohen, Jamie Tucker-Foltz
AAAI 2020, pp. 1774-1781 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We consider settings where agents are evaluated based on observed features, and assume they seek to achieve feature values that bring about good evaluations. Our goal is to craft evaluation mechanisms that incentivize the agents to invest effort in desirable actions; a notable application is the design of course grading schemes. Previous work has studied this problem in the case of a single agent. By contrast, we investigate the general, multi-agent model, and provide a complete characterization of its computational complexity. |
| Researcher Affiliation | Academia | Tal Alon Technion, Israel Magdalen Dobson Carnegie Mellon University, USA Ariel D. Procaccia Carnegie Mellon University, USA Inbal Talgam-Cohen Technion, Israel Jamie Tucker-Foltz University of Cambridge, UK |
| Pseudocode | Yes | Algorithm 1: An algorithm for Problem 7. Input: An instance of the evaluation problem with a single admissible profile x. Output: A linear mechanism β that incentivizes a maximum number of agents to invest effort according to x. |
| Open Source Code | No | The paper notes that the full version is available at http://procaccia.info, which is a personal academic homepage rather than a code repository; there is no explicit statement that code for the described methodology is released. |
| Open Datasets | No | The paper is theoretical and does not report on experiments using datasets. There is no mention of publicly available or open datasets with access information. |
| Dataset Splits | No | The paper is theoretical and does not report on experiments using datasets, thus no training/validation/test splits are mentioned. |
| Hardware Specification | No | The paper is theoretical and does not describe experiments that would require hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not describe experiments with specific software dependencies and version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or system-level training settings. |
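For context on the Pseudocode row above: Algorithm 1 outputs a linear mechanism β that incentivizes a maximum number of agents to invest effort toward a single admissible profile x. A minimal sketch of how a linear mechanism scores feature profiles (the weights, profiles, and simplified utility comparison are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

def evaluate(beta, features):
    """A linear mechanism scores an agent's feature vector by a weighted sum."""
    return float(np.dot(beta, features))

def prefers_effort(beta, x_effort, x_shirk):
    """An agent is incentivized to invest effort if the target profile
    x_effort scores at least as high as the alternative x_shirk.
    (Simplified comparison; the paper's model also accounts for effort costs.)"""
    return evaluate(beta, x_effort) >= evaluate(beta, x_shirk)

beta = np.array([0.7, 0.3])      # hypothetical weights over two features
x_effort = np.array([0.9, 0.4])  # profile reached by investing effort
x_shirk = np.array([0.5, 0.8])   # profile reached without effort

print(prefers_effort(beta, x_effort, x_shirk))  # True: 0.75 >= 0.59
```

Under these weights the mechanism rewards the first feature heavily, so the effort profile scores higher; the design problem the paper studies is choosing β so this holds for as many agents as possible.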