Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Themis: A Fair Evaluation Platform for Computer Vision Competitions
Authors: Zinuo Cai, Jianyong Yuan, Yang Hua, Tao Song, Hao Wang, Zhengui Xue, Ningxin Hu, Jonathan Ding, Ruhui Ma, Mohammad Reza Haghighat, Haibing Guan
IJCAI 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the validity of THEMIS with a wide spectrum of real-world models and datasets. Our experimental results show that THEMIS effectively enforces competition fairness by precluding manual labeling of test sets and preserving the performance ranking of participants' models. |
| Researcher Affiliation | Collaboration | 1Shanghai Jiao Tong University 2Queen's University Belfast 3Louisiana State University 4Intel |
| Pseudocode | Yes | Algorithm 1 Training the noise generator |
| Open Source Code | Yes | THEMIS is open-sourced at https://github.com/AISIGSJTU/Themis. |
| Open Datasets | Yes | We select three datasets to evaluate our framework: the UTKFace dataset, the CIFAR-10 and CIFAR-100 datasets. |
| Dataset Splits | Yes | In all experiments, we split them into three parts (training sets, validation sets, and test sets) with the ratio 4:1:1. |
| Hardware Specification | Yes | We implement the code in PyTorch and run the experiment on an NVIDIA virtual machine with 4 Tesla K80 GPU cores. |
| Software Dependencies | No | The paper mentions "PyTorch" but does not specify a version number or other software dependencies with version information. |
| Experiment Setup | No | The paper describes the general simulation of the training process but does not provide specific experimental setup details such as concrete hyperparameter values (e.g., learning rate, batch size, number of epochs) or optimizer settings. |
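Since the paper's splitting code is not reproduced here, the 4:1:1 train/validation/test partition quoted in the Dataset Splits row could be sketched as follows. This is an illustrative stand-in using only the standard library, not the authors' implementation; the function name `split_4_1_1` and the fixed seed are assumptions for the example.

```python
import random

def split_4_1_1(samples, seed=0):
    """Shuffle and partition samples into train/val/test with ratio 4:1:1."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)  # deterministic shuffle for reproducibility
    n = len(samples)
    n_train = n * 4 // 6  # 4 of 6 parts
    n_val = n // 6        # 1 of 6 parts
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]  # remaining ~1 of 6 parts
    return train, val, test

train, val, test = split_4_1_1(range(600))
print(len(train), len(val), len(test))  # 400 100 100
```

In a PyTorch workflow, `torch.utils.data.random_split` with lengths in the same 4:1:1 proportion would achieve the equivalent partition.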