Optimal Spot-Checking for Improving Evaluation Accuracy of Peer Grading Systems
Authors: Wanyuan Wang, Bo An, Yichuan Jiang
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on both synthetic and real datasets show significant advantages of the proposed algorithm over existing approaches. |
| Researcher Affiliation | Academia | Wanyuan Wang (1), Bo An (2), Yichuan Jiang (1); (1) School of Computer Science and Engineering, Southeast University, China; (2) School of Computer Science and Engineering, Nanyang Technological University, Singapore; {wywang, yjiang}@seu.edu.cn, boan@ntu.edu.sg |
| Pseudocode | Yes | Algorithm 1: Peer-Assignment Pair-Based Spot Checking Algorithm PASC(G,K) |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the methodology described is publicly available or open-source. There is a link to an online appendix for proofs, but not for code. |
| Open Datasets | Yes | Dataset: TREC [3] is a collection of topic-document relevance judgements labelled by workers on AMT. This dataset's data structure is similar to PGS's, where each worker (i.e., peer) is asked to judge whether a topic-document (i.e., assignment) is relevant (i.e., good) or not (i.e., bad). This dataset contains 1,977 judgements collected from 763 workers. [3] https://sites.google.com/site/treccrowd/ |
| Dataset Splits | No | The paper mentions 'ltra training tasks' for calibrating worker reliability but does not provide explicit details on training, validation, or test dataset splits for the primary evaluation of the proposed algorithm's accuracy. |
| Hardware Specification | Yes | All computations are performed on a 64-bit PC with a dual-core 3.2 GHz CPU and 16 GB memory. |
| Software Dependencies | No | The paper does not provide specific details on software dependencies, such as programming languages, libraries, or frameworks with version numbers, used for implementing the algorithm or running experiments. |
| Experiment Setup | Yes | There are 1000 students and 1000 assignments. For each student i, his diligent reliability follows the Gaussian distribution N(μ, δ²), where μ=0.75 and δ=0.125. We allocate each assignment to l peers randomly. The cost c_ij and reward r_ij follow the Uniform distributions U(0, 1) and U(c_ij, 1). |
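
The synthetic setup quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration and not the authors' code, which is not publicly available. The function name `make_synthetic_instance`, the default `l=3`, the seed, and the clipping of reliabilities to [0, 1] are all assumptions introduced here; the quoted text does not state a value for l or how out-of-range Gaussian samples are handled.

```python
import random


def make_synthetic_instance(n_students=1000, n_assignments=1000, l=3,
                            mu=0.75, delta=0.125, seed=0):
    """Hypothetical generator for the synthetic setup quoted above.

    l=3 and the seed are placeholder values, not taken from the paper.
    """
    rng = random.Random(seed)

    # Diligent reliability of each student ~ N(mu, delta^2).
    # Clipping to [0, 1] is our assumption, not stated in the quoted text.
    reliability = [min(1.0, max(0.0, rng.gauss(mu, delta)))
                   for _ in range(n_students)]

    graders = {}  # assignment j -> list of l distinct peer indices
    cost = {}     # (i, j) -> c_ij ~ U(0, 1)
    reward = {}   # (i, j) -> r_ij ~ U(c_ij, 1)
    for j in range(n_assignments):
        # Allocate assignment j to l peers chosen uniformly at random.
        peers = rng.sample(range(n_students), l)
        graders[j] = peers
        for i in peers:
            c = rng.uniform(0.0, 1.0)
            cost[(i, j)] = c
            reward[(i, j)] = rng.uniform(c, 1.0)
    return reliability, graders, cost, reward


if __name__ == "__main__":
    rel, graders, cost, reward = make_synthetic_instance()
    print(len(rel), len(graders), len(cost))
```

Note that sampling peers independently per assignment does not balance grading load across students; whether the paper enforces such a balance is not stated in the quoted text.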