Interpreting and Boosting Dropout from a Game-Theoretic View

Authors: Hao Zhang, Sen Li, Yinchao Ma, Mingjie Li, Yichen Xie, Quanshi Zhang

ICLR 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We prove that dropout can suppress the strength of interactions between input variables of deep neural networks (DNNs). The theoretic proof is also verified by various experiments." (A sampling-based estimate of this interaction strength is sketched after the table.) |
| Researcher Affiliation | Academia | Hao Zhang (Shanghai Jiao Tong University, 1603023-zh@sjtu.edu.cn); Sen Li (Sun Yat-sen University, lisen6@mail2.sysu.edu.cn); Yinchao Ma (Huazhong University of Science and Technology, u201713506@hust.edu.cn); Mingjie Li (Shanghai Jiao Tong University, limingjie0608@sjtu.edu.cn); Yichen Xie (Shanghai Jiao Tong University, xieyichen@sjtu.edu.cn); Quanshi Zhang (Shanghai Jiao Tong University, zqs1022@sjtu.edu.cn) |
| Pseudocode | No | No pseudocode or algorithm blocks were found. |
| Open Source Code | No | The paper mentions third-party codebases (e.g., pytorch-cifar100, suinleelab) but does not state that the authors release their own code for the described methodology. |
| Open Datasets | Yes | MNIST (LeCun et al., 1998), CelebA (Liu et al., 2015), Tiny ImageNet (Le & Yang, 2015), CIFAR-10 (Krizhevsky & Hinton, 2009), and SST-2 (Socher et al., 2013). |
| Dataset Splits | No | The paper does not specify exact train/validation/test split percentages or absolute sample counts for each split. It mentions sampling training data, but gives no splitting methodology precise enough for reproduction. |
| Hardware Specification | Yes | "We trained AlexNet and VGG-11 using the CIFAR-10 dataset on a GPU of GeForce GTX 1080 Ti." |
| Software Dependencies | No | The paper mentions PyTorch implicitly through a reference to pytorch-cifar100, but does not specify a version number for PyTorch or any other software dependency. |
| Experiment Setup | Yes | "For each DNN, we put the dropout operation and the interaction loss in the low convolutional layer (before the 3rd/5th convolutional layer of the AlexNet/VGGs) and the high fully-connected layer (before the 2nd fully-connected layer), respectively... when we trained DNNs with dropout, we set the dropout rate as 0.5... In this paper, we set α=0.05... Thus, we set the sampling number as 500 in all other experiments in this paper." (A sketch of this dropout placement follows the table.) |
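
The paper's central quantity is the game-theoretic interaction between input variables that dropout is proven to suppress, and the quoted setup fixes its sampling number at 500. As a concrete reference point, here is a minimal PyTorch sketch of a sampling-based pairwise-interaction estimate; the function name, the uniform context sampling, and the flat-vector masking are illustrative assumptions, not the authors' released implementation.

```python
import torch

@torch.no_grad()
def interaction_strength(v, x, baseline, i, j, n_samples=500):
    """Monte Carlo estimate of the pairwise interaction
        I(i, j) = E_S[ v(S ∪ {i, j}) − v(S ∪ {i}) − v(S ∪ {j}) + v(S) ],
    where S is a random context over the remaining variables and v(S)
    scores the input with variables outside S replaced by `baseline`.
    """
    n = x.numel()
    flat_x, flat_b = x.flatten(), baseline.flatten()
    others = torch.tensor([k for k in range(n) if k not in (i, j)],
                          dtype=torch.long)

    def value(mask, extra):
        m = mask.clone()
        for k in extra:  # force i and/or j into the context
            m[k] = True
        return v(torch.where(m, flat_x, flat_b).view_as(x)).item()

    total = 0.0
    for _ in range(n_samples):
        # Uniformly sample a context S over the remaining variables
        # (a Banzhaf-style simplification; the paper's exact sampling
        # scheme may differ).
        mask = torch.zeros(n, dtype=torch.bool)
        mask[others[torch.rand(len(others)) < 0.5]] = True
        total += (value(mask, (i, j)) - value(mask, (i,))
                  - value(mask, (j,)) + value(mask, ()))
    return total / n_samples
```

For a classifier, `v` might be `lambda inp: model(inp.unsqueeze(0))[0, target]`, scoring the target logit of a masked input against a zero baseline (`torch.zeros_like(x)`).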
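
The quoted Experiment Setup places dropout (rate 0.5) before the 3rd convolutional layer and before the 2nd fully-connected layer of AlexNet. Below is a minimal sketch of that placement; the class name, layer widths, and pooling choices are hypothetical stand-ins, not the authors' code.

```python
import torch.nn as nn

# Illustrative AlexNet-style network: dropout with p = 0.5 is inserted
# before the 3rd conv layer and before the 2nd fully-connected layer,
# mirroring the placement described in the quoted setup.
class AlexNetDropout(nn.Module):
    def __init__(self, num_classes=10, p=0.5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 192, 3, padding=1), nn.ReLU(),
            nn.Dropout2d(p),                      # before the 3rd conv layer
            nn.Conv2d(192, 384, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Sequential(
            nn.Linear(384, 256), nn.ReLU(),
            nn.Dropout(p),                        # before the 2nd FC layer
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))
```

In the paper's proposed variant, an interaction loss is applied at these same positions instead of dropout; the excerpt fixes α = 0.05 and a 500-sample budget for the interaction computation, but does not spell out the loss's exact form, so it is not reproduced here.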