Experimental Design under the Bradley-Terry Model
Authors: Yuan Guo, Peng Tian, Jayashree Kalpathy-Cramer, Susan Ostmo, J. Peter Campbell, Michael F. Chiang, Deniz Erdogmus, Jennifer Dy, Stratis Ioannidis
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally evaluate the performance of these methods over synthetic and real-life datasets. |
| Researcher Affiliation | Academia | 1 ECE Department, Northeastern University, Boston, MA, USA. 2 Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA. 3 Dept of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, OR, USA. |
| Pseudocode | Yes | Algorithm 1 Greedy Algorithm |
| Open Source Code | Yes | We make our code publicly available. https://github.com/neu-spiral/Experimental_Design |
| Open Datasets | Yes | ROP Dataset. Our first dataset consists of 100 images of retinas, labeled by experts w.r.t. the presence of a disease called Retinopathy of Prematurity (ROP) [Kalpathy-Cramer et al., 2016]. SUSHI Dataset. The SUSHI Preference dataset [Kamishima et al., 2009] consists of rankings of N = 100 sushi food items by 5000 customers. |
| Dataset Splits | Yes | In each experiment, we partition the dataset N into three datasets: a training set Ntrn, a test set Ntst, and a validation set Nval. ... For each dataset, we perform 3-fold cross validation, repeating the partition to training and test datasets keeping the validation set fixed. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | Each of the four algorithms listed above has a hyperparameter that needs to be tuned: σ0 for MI, c for Cov, λe for Ent, and λf for Fisher. We tune these parameters on a validation set, as described in Section 5.3. We run all algorithms with K ranging from 0 to 100, with the exception of MI, which is the most computationally intensive: we execute this for K = 0 to 15. |
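The splitting scheme quoted in the Dataset Splits row (a fixed validation set plus 3-fold cross-validation over the remainder) can be sketched as follows. This is a minimal illustration, not the authors' code: the validation fraction, seed, and function name `partition` are assumptions for the example.

```python
import random

def partition(items, val_frac=0.2, n_folds=3, seed=0):
    """Hold out a fixed validation set, then 3-fold cross-validate the rest.

    Mirrors the paper's description: the partition into training and test
    sets is repeated across folds while the validation set stays fixed.
    (val_frac and seed are illustrative choices, not taken from the paper.)
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)

    # Fixed validation set, carved out once.
    n_val = int(len(shuffled) * val_frac)
    val, rest = shuffled[:n_val], shuffled[n_val:]

    # Split the remainder into n_folds disjoint folds.
    folds = [rest[i::n_folds] for i in range(n_folds)]

    # Each fold serves once as the test set; the others form the training set.
    splits = []
    for k in range(n_folds):
        test = folds[k]
        train = [x for j, fold in enumerate(folds) if j != k for x in fold]
        splits.append((train, test, val))
    return splits
```

Hyperparameters (σ0, c, λe, λf) would then be tuned against the fixed `val` set, while test performance is averaged over the three train/test folds.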