reproducibilityindex.ai

Computerized Adaptive Testing via Collaborative Ranking

Authors: Zirui Liu, Yan Zhuang, Qi Liu, Jiatong Li, Yuren Zhang, Zhenya Huang, Jinze Wu, Shijin Wang

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	By using collaborative students as anchors to assist in ranking test-takers, CCAT can give both theoretical guarantees and experimental validation for ensuring ranking consistency.
Researcher Affiliation	Collaboration	Zirui Liu¹, Yan Zhuang¹, Qi Liu¹,², Jiatong Li¹, Yuren Zhang¹, Zhenya Huang¹, Jinze Wu³, Shijin Wang³. ¹ State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China; ² Institute of Artificial Intelligence, Hefei Comprehensive National Science Center; ³ iFLYTEK Co., Ltd
Pseudocode	Yes	Algorithm 1: The CCAT framework
Open Source Code	Yes	The code is available on GitHub: https://github.com/bigdata-ustc/CCAT.
Open Datasets	Yes	We individually conduct experiments on two educational benchmark datasets, NIPS-EDU and JUNYI. NIPS-EDU [50] is a dataset compiled from student question interactions collected from Eedi and used in the NeurIPS 2020 Educational Challenge. JUNYI [51] is sourced from junyiacademy.org, providing millions of response data from students enrolled in a course between 2018 and 2019.
Dataset Splits	Yes	We filter out students who answer fewer than 50 times and questions that are answered fewer than 50 times in the following experiment and then divide the dataset into a training dataset (Collaborative Students) and a testing dataset (Tested Students) in a 4:1 ratio.
Hardware Specification	No	No specific hardware (GPU/CPU models, memory) used for experiments is explicitly mentioned in the paper.
Software Dependencies	No	In addition, this article is based on theoretical derivation, so there are no technical details such as hyperparameters, optimizers, etc.
Experiment Setup	No	In addition, this article is based on theoretical derivation, so there are no technical details such as hyperparameters, optimizers, etc.
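
The preprocessing quoted in the Dataset Splits row (drop students and questions with fewer than 50 responses, then split 4:1 into collaborative and tested students) could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `filter_and_split`, the `(student_id, question_id, correct)` log format, the single-pass filtering, the student-level split, and the fixed random seed are all assumptions made here for the example.

```python
import random
from collections import Counter


def filter_and_split(logs, min_count=50, train_ratio=0.8, seed=0):
    """Filter sparse students/questions, then split students 4:1.

    logs: list of (student_id, question_id, correct) tuples (hypothetical format).
    Returns (kept_logs, collaborative_students, tested_students).
    """
    # Count responses per student and per question on the raw logs.
    student_counts = Counter(s for s, _, _ in logs)
    question_counts = Counter(q for _, q, _ in logs)

    # Assumption: one filtering pass; the paper does not say whether the
    # filter is re-applied until both thresholds hold simultaneously.
    kept = [
        (s, q, c)
        for s, q, c in logs
        if student_counts[s] >= min_count and question_counts[q] >= min_count
    ]

    # Assumption: the 4:1 split is made over students, not over responses.
    students = sorted({s for s, _, _ in kept})
    random.Random(seed).shuffle(students)
    cut = int(len(students) * train_ratio)
    collaborative = set(students[:cut])  # training set (Collaborative Students)
    tested = set(students[cut:])         # testing set (Tested Students)
    return kept, collaborative, tested
```

With the default `train_ratio=0.8`, four fifths of the remaining students become collaborative students and the rest become tested students. The exact procedure used in the paper should be checked against the linked repository.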