Learning Sample Difficulty from Pre-trained Models for Reliable Prediction
Authors: Peng Cui, Dan Zhang, Zhijie Deng, Yinpeng Dong, Jun Zhu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive experiments to verify our method’s effectiveness, showing that the proposed method can improve prediction performance and uncertainty quantification simultaneously. |
| Researcher Affiliation | Collaboration | Peng Cui (1,5), Dan Zhang (2,3), Zhijie Deng (4), Yinpeng Dong (1,5), Jun Zhu (1,5). 1: Dept. of Comp. Sci. & Tech., Institute for AI, BNRist Center, THBI Lab, Tsinghua-Bosch Joint ML Center, Tsinghua University, Beijing, 100084, China; 2: Bosch Center for Artificial Intelligence; 3: University of Tübingen; 4: Qing Yuan Research Institute, Shanghai Jiao Tong University; 5: RealAI |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any links to open-source code or statements about code availability. |
| Open Datasets | Yes | Datasets. CIFAR-10/100 [25] and ImageNet1k [7] are used for multi-class classification training and evaluation. |
| Dataset Splits | No | The paper mentions an "ImageNet1k validation sample" and refers to "validation subsets" in Fig. 2, but it does not provide specific split percentages or explicitly state that predefined standard splits with a formal citation are used. |
| Hardware Specification | No | The paper does not specify any hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions "SGD with a weight decay of 1e-4 and a momentum of 0.9", which are optimization hyper-parameters rather than software dependencies with version numbers. No other software details are provided. |
| Experiment Setup | Yes | Implementation details. We use standard data augmentation (i.e., random horizontal flipping and cropping) and SGD with a weight decay of 1e-4 and a momentum of 0.9 for classification training, and report averaged results from five random runs. The default image classifier architecture is ResNet34 [15]. For the baselines, we use the same hyper-parameter setting as recommended in [52]. For the hyper-parameters in our training loss (9), we set α as 0.3 and 0.2 for CIFARs and ImageNet1k, respectively, where T equals 0.7 for all datasets. |