Calibrating a Deep Neural Network with Its Predecessors

Authors: Linwei Tao, Minjing Dong, Daochang Liu, Changming Sun, Chang Xu

IJCAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on various datasets, including CIFAR-10/100 [Krizhevsky, 2012] and Tiny-ImageNet [Deng et al., 2009] to evaluate the calibration performance.
Researcher Affiliation | Academia | 1 School of Computer Science, Faculty of Engineering, University of Sydney, Australia; 2 CSIRO’s Data61, Australia
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | Supplementary material and code are available at https://github.com/Linwei94/PCS
Open Datasets | Yes | We conduct experiments on various datasets, including CIFAR-10/100 [Krizhevsky, 2012] and Tiny-ImageNet [Deng et al., 2009] to evaluate the calibration performance.
Dataset Splits | Yes | We follow the same training and validation set split setting as [Mukhoti et al., 2020]. The learning rate is set to 0.1 for epoch 0 to 150, 0.01 for 150 to 250, and 0.001 for 250 until the end of training. For training on Tiny-ImageNet, we set Ttrain = 100. (See the configuration sketches after this table.)
Hardware Specification | Yes | All experiments are conducted on a single Tesla V-100 GPU with all random seeds set to 1.
Software Dependencies | No | The paper mentions 'Our code and results of comparison method are based on the public code and the pre-trained weight provided by [Mukhoti et al., 2020]' but does not provide specific version numbers for software dependencies such as Python or PyTorch.
Experiment Setup | Yes | For training on CIFAR-10/100, we set Ttrain = 350. The learning rate is set to 0.1 for epoch 0 to 150, 0.01 for 150 to 250, and 0.001 for 250 until the end of training... The fine-tuning learning rate is set to 10^-4 for CIFAR-10, 5x10^-4 for CIFAR-100, and 10^-3 for Tiny-ImageNet. The searching process is performed with Tse = 100 steps. The population size is S = 100... All networks are optimized using the SGD optimizer with a weight decay at 5x10^-4 and a momentum of 0.9. The training batch size is set to 128.
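
The data handling described in the "Dataset Splits" and "Experiment Setup" rows can be approximated with a short PyTorch sketch. This is an illustration under stated assumptions, not the paper's code: the 45,000/5,000 CIFAR train/validation split is borrowed from the public Mukhoti et al. (2020) codebase that the paper says it follows (the exact split size is not quoted in this report), and the normalization statistics are commonly used CIFAR-10 values.

    # Hedged sketch: CIFAR-10 train/validation split in the style of the
    # Mukhoti et al. (2020) codebase. The 45,000/5,000 split is an assumption;
    # batch size 128 and seed 1 are the values reported above.
    import torch
    from torch.utils.data import DataLoader, random_split
    from torchvision import datasets, transforms

    transform = transforms.Compose([
        transforms.RandomCrop(32, padding=4),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
    ])

    full_train = datasets.CIFAR10(root="./data", train=True, download=True,
                                  transform=transform)
    generator = torch.Generator().manual_seed(1)  # "all random seeds set to 1"
    train_set, val_set = random_split(full_train, [45_000, 5_000], generator=generator)

    train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=128, shuffle=False)  # held-out validation split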
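
The optimizer, learning-rate schedule, seed, and epoch budget in the "Experiment Setup" row can likewise be assembled into a minimal training-loop sketch. The ResNet-50 backbone and plain cross-entropy loss are placeholders (the report does not quote the architectures or the paper's training objective), and train_loader comes from the split sketch above; the authoritative training code is in the linked PCS repository.

    # Hedged sketch of the reported CIFAR-10 training configuration:
    # SGD (lr 0.1, momentum 0.9, weight decay 5e-4), decay to 0.01 at epoch 150
    # and to 0.001 at epoch 250, Ttrain = 350 epochs, batch size 128, seed 1.
    import torch
    import torch.nn as nn
    from torchvision import models

    torch.manual_seed(1)                                  # "all random seeds set to 1"
    device = "cuda" if torch.cuda.is_available() else "cpu"

    model = models.resnet50(num_classes=10).to(device)    # placeholder backbone (assumption)
    criterion = nn.CrossEntropyLoss()                     # placeholder objective (assumption)

    optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                                momentum=0.9, weight_decay=5e-4)
    # MultiStepLR reproduces the 0.1 -> 0.01 -> 0.001 schedule at epochs 150 / 250.
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                     milestones=[150, 250], gamma=0.1)

    T_train = 350  # 100 for Tiny-ImageNet, per the report
    for epoch in range(T_train):
        model.train()
        for images, labels in train_loader:               # from the split sketch above
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()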